Machine Learning Engineer Opportunity

ENFINT company

Machine Learning Engineer in the United Arab Emirates

Visa sponsorship & relocation

Posted 1 day ago

Important 


- Onsite position in Dubai (4 days per week in the office)

- Fluent in Russian

- English B2 or higher


Job Content


- Design and optimize AI inference pipelines ensuring low-latency, high-throughput model serving for enterprise applications.

- Build and maintain scalable AI infrastructure supporting complex, large-scale workloads efficiently.

- Enable reliable deployment and operation of high-performance AI model serving frameworks across environments.

- Ensure effective GPU resource utilization and cost-efficient AI workload execution.

- Establish comprehensive monitoring and observability for consistent model inference performance.

- Uphold enterprise-grade security, governance, and MLOps best practices throughout the AI delivery lifecycle.


Essential Qualifications


- Bachelor's degree or equivalent

- 7+ years of total engineering or operational experience

- At least 5 years of relevant experience in a similar role

- Experience in large, complex global enterprises characterized by high availability, high transaction rates, and geographical distribution


Essential Knowledge & Skills


- Deep Learning Inference: Expertise in TensorRT, vLLM, Triton, FasterTransformer.

- Model Optimization: Experience with ONNX, GGUF, quantization (FP16, INT8, FP8).

- Distributed Systems: Experience with NCCL, MPI, InfiniBand, RDMA, and multi-node GPU workloads.

- Scalable AI Serving: Hands-on experience with Triton Inference Server, vLLM, TensorFlow Serving.

- Profiling & Debugging: Familiarity with nvidia-smi, Nsight, nvprof, TensorRT Profiler.

- Cloud & On-Prem GPU Management: Experience with Kubernetes (K8s), OpenShift, GPU scheduling (Kubeflow, Ray, KServe).

- Understanding of vector databases and their applications in analytics and AI workloads.

- Proficiency in programming languages such as Python, Scala, and SQL.

- Experience working collaboratively on programming projects and managing the architecture of such projects.

- Advanced skills working in a Linux environment.


Nice to have 


- GPU Programming: Knowledge of CUDA, cuDNN, NCCL, Tensor Cores for optimizing inference.

- Speculative Decoding & FlashAttention for LLM inference.

- Experience optimizing token streaming for chat applications.

- Experience with vector databases (Qdrant, Milvus) for RAG workloads.


Benefits


- Opportunity to work on cutting-edge technologies in a highly innovative environment

- Dynamic and friendly work environment

- Company assistance with relocation expenses

- Medical insurance

