General Information

Req #
WD00066364
Career area:
Research/Development
Country/Region:
China
State:
Beijing
City:
北京(Beijing)
Date:
Wednesday, June 5, 2024
Working time:
Full-time
Additional Locations
* China - Beijing - 北京(Beijing)

Why Work at Lenovo

 We are Lenovo. We do what we say. We own what we do. We WOW our customers. 

Lenovo is a US$62 billion revenue global technology powerhouse, ranked #217 in the Fortune Global 500, employing 77,000 people around the world, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver smarter technology for all, Lenovo has built on its success as the world’s largest PC company by further expanding into growth areas that fuel the advancement of ‘New IT’ technologies (client, edge, cloud, network, and intelligence) including server, storage, mobile, software, solutions, and services. 

This transformation, together with Lenovo’s world-changing innovation, is building a more inclusive, trustworthy, and smarter future for everyone, everywhere. To find out more, visit www.lenovo.com and read about the latest news via our StoryHub.

Description and Requirements

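Responsibilities:

1. Design large-scale interconnect network architectures for distributed AI training and inference systems;

2. Design simulation scenarios and performance evaluation metrics for large-scale AI interconnect networks;

3. Build, test, and validate simulation systems for large-scale AI interconnect networks.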

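Requirements:

1. Full-time master's degree or above in computer science and technology, communication engineering, software engineering, or a related field;

2. Proficiency in C/C++, data structures, and computer architecture, with strong engineering implementation skills;

3. Familiarity with simulators such as NS3, OMNeT++, OPNET, GEM5, and MATLAB, with engineering development experience based on these simulators;

4. Familiarity with the network interconnect topologies and network protocol standards of distributed AI systems; an understanding of the TCP/IP and RDMA network protocols and of the NCCL collective communication library; familiarity with NCCL, Socket, and IB verbs programming.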


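Preferred qualifications:

1. Understanding of data center network architecture; engineering experience with data center network congestion control and load balancing is preferred;

2. Publication of high-quality academic papers in the field of network systems or AI/HPC systems;

3. Familiarity with at least one model training framework such as Megatron-LM, DeepSpeed, or Colossal-AI, and the ability to carry out secondary development and optimization based on that framework.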
