Research Engineer - Machine Learning

Mistral AI, United States
Visa Sponsorship
AI Summary

Join our dynamic team as a Research Engineer - Machine Learning to build and optimize large-scale learning systems that power our open-weight models. You will work with Research Scientists to enhance the shared training framework, data pipelines, and cluster tooling used by every team. This role is ideal for those with a strong background in machine learning and experience with PyTorch, JAX, or TensorFlow.

Key Highlights
Build and optimize large-scale learning systems
Work with Research Scientists to enhance training framework and data pipelines
Design and implement ML algorithms in Python
Technical Skills Required
PyTorch, JAX, TensorFlow, Python, DeepSpeed, FSDP, SLURM, K8s, CUDA, Data Pipelines

Job Description


About Mistral

At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.

We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise as well as personal needs. Our offerings include Le Chat, La Plateforme, Mistral Code and Mistral Compute, a suite that brings frontier intelligence to end-users.

We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed between France, USA, UK, Germany and Singapore. We are creative, low-ego and team-spirited.

Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on https://mistral.ai/careers.

Role Summary

About The Research Engineering Team

The team spans Platform (shared infra & clean code) and Embedded (inside research squads). Engineers can move along the research↔production spectrum as needs or interests evolve.

As a Research Engineer – ML track, you’ll build and optimise the large-scale learning systems that power our open-weight models. Working hand-in-hand with Research Scientists, you’ll join either:

  • Platform RE Team: Enhance the shared training framework, data pipelines and cluster tooling used by every team; or
  • Embedded RE Team: Sit inside a research squad (Alignment, Pre-training, Multimodal, …) and turn fresh ideas into repeatable, scalable code


What You Will Do

  • Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools
  • Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs
  • Conduct experiments on the latest deep-learning techniques (sparsified 70B+ runs, distributed training on thousands of GPUs)
  • Design, implement and benchmark ML algorithms; write clear, efficient code in Python
  • Deliver prototypes that become production-grade components for Le Chat and our enterprise API


About You

  • Master’s or PhD in Computer Science (or equivalent proven track record)
  • 4+ years working on large-scale ML codebases
  • Hands-on with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s)
  • Experience in deep learning, NLP or LLMs; bonus for CUDA or data-pipeline chops
  • Strong software-design instincts: testing, code review, CI/CD
  • Self-starter, low-ego, collaborative


What We Offer

  • 💰 Competitive salary and equity
  • 🚑 Healthcare: Medical/Dental/Vision covered for you and your family
  • 👴🏻 Pension: 401(k) (6% matching)
  • 🏝️ PTO: 18 days
  • 🚗 Transportation: Reimburse office parking charges, or $120/month for public transport
  • 🏀 Sport: $120/month reimbursement for gym membership
  • 🥕 Meal stipend: $400 monthly allowance for meals (this arrangement may change as we grow)
  • 🌎 Visa sponsorship
  • 🤝 Coaching: we offer BetterUp coaching on a voluntary basis


By applying, you agree to our Applicant Privacy Policy.
