Machine Learning Engineer

ENFINT • United Arab Emirates

Relocation

This job is no longer active and is not accepting applications.

Job Description

Important

- On-site position in Dubai (4 days per week in the office)

- Fluent in Russian

- English B2 or higher

Job Content

- Design and optimize AI inference pipelines ensuring low-latency, high-throughput model serving for enterprise applications.

- Build and maintain scalable AI infrastructure supporting complex, large-scale workloads efficiently.

- Enable reliable deployment and operation of high-performance AI model serving frameworks across environments.

- Ensure effective GPU resource utilization and cost-efficient AI workload execution.

- Establish comprehensive monitoring and observability for consistent model inference performance.

- Uphold enterprise-grade security, governance, and MLOps best practices throughout the AI delivery lifecycle.

Essential Qualifications

- Bachelor's degree or equivalent

- 7+ years of total engineering or operational experience

- At least 5 years of relevant experience in a similar role

- Experience in large, complex global enterprises characterized by high availability, high transaction rates, and geographical distribution

Essential Knowledge & Skills

- Deep Learning Inference: Expertise in TensorRT, vLLM, Triton, FasterTransformer.

- Model Optimization: Experience with ONNX, GGUF, quantization (FP16, INT8, FP8).

- Distributed Systems: Experience with NCCL, MPI, InfiniBand, RDMA, and multi-node GPU workloads.

- Scalable AI Serving: Hands-on experience with Triton Inference Server, vLLM, TensorFlow Serving.

- Profiling & Debugging: Familiarity with nvidia-smi, Nsight, nvprof, TensorRT Profiler.

- Cloud & On-Prem GPU Management: Experience with Kubernetes (K8s), OpenShift, GPU scheduling (Kubeflow, Ray, KServe).

- Understanding of vector databases and their applications in analytics and AI workloads.

- Proficiency in programming languages such as Python, Scala, and SQL.

- Experience working collaboratively on programming projects and managing the architecture of such projects.

- Advanced skills working in a Linux environment.

Nice to have

- GPU Programming: Knowledge of CUDA, cuDNN, NCCL, Tensor Cores for optimizing inference.

- Familiarity with speculative decoding and FlashAttention for LLM inference.

- Experience optimizing token streaming for chat applications.

- Experience with vector databases (Qdrant, Milvus) for RAG workloads.

Benefits

- Opportunity to work on cutting-edge technologies in a highly innovative environment

- Dynamic and friendly work environment

- Company assistance with relocation expenses

- Medical insurance

Job Overview

Posted Date: Oct 15, 2025

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: United Arab Emirates

Category: Machine Learning

Company: ENFINT
