Senior ML GPU Optimization Engineer Opportunity

Luxoft company


Senior ML GPU Optimization Engineer in Poland

Visa sponsorship & Relocation 1 year ago

Would you like to become part of the Luxoft team? If so, this opportunity might be for you!


What do we offer our Employees?

 Online recruitment process and onboarding training

 LuxMed health & dental care, life insurance

 MyBenefit program (sports card, well-being program etc.)

 Relocation support with Family coverage

 Equipment such as a laptop and monitor

 Corrective Glasses reimbursement

 Special offer of Banking Services

 Preferential Car Leasing offer

 Paid Referrals

 LuxTalent platform (webinars, training, courses with certificates)



Project Description:

Our R&D team is focused on creating the most effective engine for deploying generative AI models, with efforts ranging from precise GPU kernel fine-tuning to comprehensive system optimizations.

We're looking for an expert-level engineer with a strong background in CUDA, ROCm, or Triton kernel optimization. You will lead substantial improvements in GPU performance and play a key role in pioneering AI and machine learning initiatives.


Responsibilities:

● Explore and analyze performance bottlenecks in ML training and inference.

● Develop and optimize high-performance computing kernels in Triton, CUDA, and/or ROCm.

● Implement programming solutions in C/C++ and Python.

● Deep dive into GPU performance optimizations to maximize efficiency and speed.

● Collaborate with the team to extend and improve existing machine learning compilers or frameworks such as MLIR, PyTorch, TensorFlow, ONNX Runtime, and TensorRT (optional but beneficial).


Mandatory Skills Description:

● Bachelor's, Master's, or PhD degree in Computer Science, Electrical Engineering, or a related field.

● Strong programming skills in C/C++.

● Deep understanding and experience in GPU performance optimizations.

● Proven experience with kernel optimizations on CUDA, ROCm, or other accelerators.


Nice-to-Have Skills Description:

● General experience with the training and deployment of ML models.

● Experience with distributed systems development or distributed ML workloads.

● Good programming skills in Python.

● Experience with innovative OSS projects like FlashAttention, mlc-llm, and vLLM.

● Experience with machine learning compilers or frameworks such as TVM, MLIR, PyTorch, TensorFlow, ONNX Runtime, and TensorRT.


Languages:

English: C1 Advanced

