MLOps Platform Developer + flight once a month + Monday - Friday Opportunity

wizedom company

MLOps Platform Developer + flight once a month + Monday - Friday in ISRAEL

Remote 4 months ago

Cutting-edge AI startup seeking a Lead MLOps Platform Developer for our global foundation team.

Key Responsibilities:

Develop and manage scalable, automated machine learning pipelines, CI/CD workflows, and orchestration frameworks.

Develop scalable, cost-effective distributed training environments that use multi-GPU clusters and parallel processing strategies across cloud environments.

Develop scalable inference architectures optimized for ultra-low latency and high throughput.

Ensure seamless model deployment by implementing A/B testing, canary releases, and rollback capabilities.

Develop logging, alerting, and monitoring solutions to track model development and reliability.

Improve GPU usage, enable autoscaling, and streamline resource allocation to boost efficiency.

Design, implement, and maintain feature stores, robust data pipelines, and scalable storage solutions to efficiently handle large volumes of data.


Required Qualifications:

5+ years of experience in MLOps or DevOps engineering roles, working with MLOps platforms (MLflow, WandB, etc.) and frameworks (PyTorch, TensorFlow, etc.).

Experience building and designing MLOps infrastructure from the ground up.

Proven ability to deploy machine learning models into production, ensuring high scalability and low-latency inference.

Experience in building and managing data pipelines to support both model training and inference.

Experience with Kubernetes on a major cloud provider (AWS, GCP, or Azure) and with infrastructure as code (e.g. Terraform, Helm, GitOps).

Strong software engineering skills in Python, Bash, and Go, with a focus on writing clean, maintainable, and scalable code.

Experience in AI/ML systems security, compliance, and model governance.

Proficient with observability and monitoring tools, such as Prometheus, Grafana, Datadog, and OpenTelemetry.

Fully remote. Travel to Paris for 4 days each month during the foundation stage.

Working days: Monday to Friday.
