Description and Requirements
The GPU/AI Product Engineer provides engineering support to the field, interfacing with Level 1/2/3 technical support to resolve the more complex and/or urgent customer escalations. This person will work directly with development, quality, and/or manufacturing engineers to isolate defects and help perform root cause analysis. Responsibilities also include providing field communications such as Service Tips and fix releases, driving field action plans to address design and/or quality issues, managing the technical resolution plan for pervasive field issues, and driving lessons learned through a closed-loop continuous improvement process. Product Engineering support begins just prior to product introduction, with participation in the Ship Support sign-off, and continues through the product life cycle until end-of-life.
Key Activities:
- Interface with customers, Level 1/2/3 technical support, and internal Sales teams to drive technical resolution plans for escalated field issues, with a key focus on delivering a positive customer experience.
- Perform problem determination and defect root cause isolation by analyzing hardware diagnostic and OS logs.
- Derive technical action plans with urgency to restore production for outages or severe customer-impacting situations.
- Set up systems with customer configurations in the lab to replicate reported field failures.
- Instrument lab debug equipment (e.g., oscilloscopes, analyzers) to isolate and debug complex failures.
- Develop field action plans to address design/quality issues that have affected customers, including stop ships, field communications (Service Tips and Notable Issues), Engineering Change (EC) generation, and rolling Field Replaceable Unit (FRU) stock when necessary.
- Travel to provide on-site assistance to resolve critical situations (when required).
- Participate in pre-product-release activities: take technical positions on defect deferrals and HW design changes, and approve product readiness for ship support.
- Maintain key data for products assigned (specifications, publications, Tech Tips, defects, engineering changes and history, product roadmaps, and lifecycle management).
- Conduct lessons learned reviews in a quality closed loop process as part of driving continuous improvement of products and processes.
Basic Qualifications:
- Bachelor’s degree in a technical discipline: Engineering, Computer Science, or a related field
- 7+ years of hands-on experience with Intel x86 or AMD servers
- 5+ years of hands-on experience with GPU technology
- 12+ years of experience in a technical engineering role performing hardware problem determination and troubleshooting
- Experience with Linux-based operating systems (Ubuntu, Red Hat, SUSE)
- 7+ years of client-facing experience
Preferred Skills:
- Experience with clustered solutions
- Strong analytical and problem-solving skills
- Python programming skills
- Bash scripting experience
- Familiarity with data center GPU software stacks such as AMD ROCm or NVIDIA CUDA, and with GPU benchmarks and tools such as NVQual, MLPerf, TensorFlow, or NCCL
- Familiarity with AI/Machine Learning workloads, frameworks, and models