Machine Learning Software Engineer Opportunity

DeepRec.ai company


Machine Learning Software Engineer in the European Union

Remote, posted 1 year ago

Machine Learning Software Engineer


Distributed ML Training


Location: Fully Remote

Type: Full-time


Join an innovative Series A Deep Tech company at the forefront of AI and blockchain technology! Backed by top investors with over $50 million in funding, our client is a team of 20 industry experts, looking to grow to 35. They are leveraging blockchain to provide globally accessible computing resources for AI platforms and are seeking world-class engineers to accelerate AI progress. This is a fully remote role offering a high level of autonomy.


Responsibilities:


  • ML Orchestration System Design: Develop systems for orchestrating ML execution across decentralized and heterogeneous infrastructure.
  • Performance Optimization: Profile and optimize training algorithms continually.
  • Implement Novel Research: Build new mechanisms and algorithms to solve unprecedented problems.
  • Engineering Support: Collaborate on broader ML issues, such as reproducible training.
  • Technical Writing and Engagement: Contribute to technical reports and papers, and engage with the community.


Minimum Requirements:


  • Distributed Foundation Model Training: Experience designing or working with training systems on large clusters.
  • Networking Proficiency: Understanding of, and troubleshooting experience with, IP, TCP, UDP, and HTTP, as well as communication backends such as NCCL, Gloo, and MPI.
  • Open Source Contributions: Experience with large open-source codebases as a maintainer or trusted contributor.
  • Rust Enthusiasm: Willingness to learn Rust to work across the codebase.
  • Computer Science Background: Solid understanding of computational complexity and broad knowledge of algorithms and data structures.
  • Self-motivation and Communication: Highly self-motivated with excellent verbal and written communication skills.
  • Applied Research Comfort: Comfortable working in a high-autonomy, unpredictable applied research environment.


Bonus Skills:


  • Rust Expertise: Strong experience with systems programming in Rust, including an understanding of lifetimes and the purpose of Pin.
  • Research Experience: Published research in distributed systems or ML domains.
  • Blockchain Knowledge: Understanding of blockchain fundamentals.



Be part of a team dedicated to democratizing AI, where you can leverage your expertise in distributed ML training, networking, and open-source contributions to make a significant impact. Embrace autonomy, continuous learning, and the drive to push innovative solutions in a highly collaborative and flexible environment.


Apply now to join this cutting-edge team and contribute to the future of AI!
