Site Reliability Specialist (Remote)

hire feed • United Arab Emirates
Remote
AI Summary

We are hiring a Site Reliability Specialist to work on a contractor basis, helping to train next-generation AI systems. The role involves leading deployment, monitoring, and recovery of complex AI training environments. Domain knowledge is key to success in this role.

Key Highlights
Lead deployment, monitoring, and recovery of complex AI training environments
Proactively identify and resolve infrastructure bottlenecks and failures
Collaborate with cross-functional teams to improve system architecture and process
Key Responsibilities
Lead the deployment, monitoring, and recovery of complex, containerized AI training environments
Proactively identify, diagnose, and resolve infrastructure bottlenecks and failures in long-running processes
Orchestrate resilient system builds and infrastructure management
Technical Skills Required
Linux, containerized environments, Python, terminal-native problem-solving skills
Benefits & Perks
$40-$70/hour
Remote work
Equal Opportunity Employer

Job Description


  • Role: Site Reliability Specialist (Remote)
  • Location: Remote (Work from Anywhere)
  • Payout: $40-$70/hour


Role Overview:

On behalf of one of our clients, we are hiring a Site Reliability Specialist to work on a contractor basis. The specialist will apply their expertise to help train next-generation AI systems, shaping how models learn, reason, and perform through high-quality, real-world input. No prior AI experience is required; domain knowledge is the key to success in this role. The client is a leader in the AI industry whose platform connects domain experts with the development of frontier AI models.


Key Responsibilities:

• Lead the deployment, monitoring, and recovery of complex, containerized AI training environments using advanced terminal techniques, ensuring stability and optimal resource utilization.

• Proactively identify, diagnose, and resolve infrastructure bottlenecks and failures in long-running processes, minimizing downtime and ensuring business continuity.

• Orchestrate resilient system builds and infrastructure management, collaborating closely with engineering teams to refine CI/CD pipelines and automate routine operational tasks.

• Collaborate with cross-functional teams to identify and prioritize improvements to system architecture, infrastructure, and processes.

• Manage and optimize filesystem structure to ensure efficient data storage and retrieval, reducing latency and improving overall system performance.


Required Skills & Qualifications:

• Terminal-native problem-solving skills, with a strong understanding of Linux and containerized environments.

• Experience recovering dynamic infrastructure and mastery of containerized environments, including deploying and managing complex systems.

• Proficiency in Python, with a strong understanding of software development and testing principles.

• Strong collaboration and communication skills, with experience working with cross-functional teams to drive business outcomes.

• Ability to adapt to changing priorities and requirements, with a strong focus on delivering high-quality results under tight deadlines.


More About the Opportunity:

This role offers an opportunity to work with a global leader in the AI industry whose platform connects domain experts with the development of frontier AI models. The work emphasizes continuous improvement and will challenge you to think critically and creatively about AI systems.


Equal Opportunity Employer:

We hire based on skills and expertise. All qualified candidates are welcome regardless of background, experience, or prior employment history. Applications are reviewed solely on demonstrated technical ability and qualifications.



