General Information

Req #
WD00066353
Career area:
Research/Development
Country/Region:
China
State:
Beijing
City:
北京(Beijing)
Date:
Wednesday, June 5, 2024
Working time:
Full-time
Additional Locations
* China - Beijing - 北京(Beijing)

Why Work at Lenovo

 We are Lenovo. We do what we say. We own what we do. We WOW our customers. 

Lenovo is a US$62 billion revenue global technology powerhouse, ranked #217 in the Fortune Global 500, employing 77,000 people around the world, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver smarter technology for all, Lenovo has built on its success as the world’s largest PC company by further expanding into growth areas that fuel the advancement of ‘New IT’ technologies (client, edge, cloud, network, and intelligence) including server, storage, mobile, software, solutions, and services. 

This transformation, together with Lenovo’s world-changing innovation, is building a more inclusive, trustworthy, and smarter future for everyone, everywhere. To find out more, visit www.lenovo.com and read about the latest news via our StoryHub.

Description and Requirements

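Job Responsibilities

1. Own resource scheduling for large-model training, delivering automatic resource configuration and automatic parallelization for large models on heterogeneous clusters

2. Design performance-simulation software for large-model parallelism strategies, supporting large-model training on a mix of heterogeneous chips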


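Job Requirements:

1. Full-time master's degree or above in computer science and technology, artificial intelligence, or a related field;

2. Proficient in C++/Python, data structures, and computer architecture, with experience in AI model performance tuning and solid engineering implementation skills;

3. Basic GPU programming skills (CUDA / ROCm) and familiarity with common AI acceleration libraries such as NCCL/oneAPI/cuDNN;

4. Familiar with at least one common deep learning framework (PyTorch/TensorFlow/Paddle/DeepSpeed);

5. Familiar with the principles of 3D parallelism strategies for large models, and with methods for analyzing operator compute and communication overhead;

6. Familiar with the low-level implementation details of deep learning networks and operators, with experience in model inference or training tuning.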


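Preferred Qualifications:

1. Experience in large-model R&D and distributed training

2. Familiarity with Kubernetes architecture and large-model training scheduling systems

3. Experience implementing or optimizing 3D parallelism strategies for large models

4. High-quality published papers in AI or HPC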
