SysOps Engineer - Monitoring & Cloud Operations

Jobgether • India
Remote
AI Summary

This SysOps Engineer role focuses on ensuring stability, performance, and resilience of large-scale cloud and hybrid systems through continuous monitoring, incident management, and observability framework design. Key responsibilities include configuring monitoring tools, leading incident response, managing backups and disaster recovery, and collaborating with cross-functional teams to maintain high availability. Candidates must have strong Linux/Windows administration skills, experience with AWS/Azure/GCP, and proficiency with monitoring platforms like New Relic, Prometheus, and Grafana.

Key Highlights

  • Mission-critical infrastructure operations
  • Monitoring and observability expertise
  • Incident management and root cause analysis
  • Disaster recovery and business continuity
  • Fully remote work environment

Key Responsibilities

  • Monitor infrastructure and production systems using observability tools such as New Relic, Prometheus, Grafana, or similar platforms
  • Configure and maintain alerts, dashboards, and service-level monitoring to proactively detect anomalies
  • Lead incident management activities including troubleshooting, root cause analysis (RCA), and post-incident reporting
  • Ensure system uptime, performance, and SLA compliance across cloud and on-premise environments
  • Manage operating system-level tasks including patching, tuning, and service management
  • Oversee backup processes and regularly validate restoration procedures
  • Execute and support disaster recovery plans including failover/failback testing and DR drills
  • Collaborate with DataOps and infrastructure teams to ensure replication integrity and system resilience
  • Perform capacity planning, performance optimization, and infrastructure health assessments
  • Maintain operational documentation including runbooks, monitoring guidelines, and incident playbooks

Technical Skills Required

Linux, Windows, New Relic, Prometheus, Grafana, AWS, Azure, Google Cloud Platform, Nginx, IIS, systemd

Benefits & Perks

  • Competitive compensation package
  • Fully remote work environment
  • Flexible arrangements
  • Professional growth
  • Collaborative team culture

Job Description


This position is posted by Jobgether on behalf of a partner company. We are currently looking for a SysOps Engineer - Monitoring & Cloud Operations in India.

This role sits at the core of mission-critical infrastructure operations, ensuring the stability, performance, and resilience of large-scale cloud and hybrid systems. You will be responsible for continuously monitoring production environments, identifying and resolving incidents, and maintaining high availability across distributed services. Working within a fast-paced engineering organization, you will collaborate closely with cloud, DevOps, and DataOps teams to safeguard system health and optimize performance. The environment is highly production-driven, requiring strong operational discipline, rapid troubleshooting skills, and a proactive mindset toward risk prevention. You will play a key role in designing and maintaining observability frameworks, ensuring that alerts, dashboards, and monitoring tools provide actionable insights. This is a high-impact position where your work directly supports system uptime, service reliability, and business continuity.

Accountabilities

  • Monitor infrastructure and production systems using observability tools such as New Relic, Prometheus, Grafana, or similar platforms, ensuring full visibility into system health.
  • Configure and maintain alerts, dashboards, and service-level monitoring to proactively detect anomalies and prevent incidents.
  • Lead incident management activities including troubleshooting, root cause analysis (RCA), and post-incident reporting.
  • Ensure system uptime, performance, and SLA compliance across cloud and on-premise environments.
  • Manage operating system-level tasks (Linux and Windows), including patching, tuning, and service management.
  • Oversee backup processes and regularly validate restoration procedures to ensure data reliability.
  • Execute and support disaster recovery (DR) plans, including failover/failback testing and DR drills across environments.
  • Collaborate with DataOps and infrastructure teams to ensure replication integrity, system resilience, and business continuity readiness.
  • Perform capacity planning, performance optimization, and infrastructure health assessments.
  • Maintain operational documentation, including runbooks, monitoring guidelines, and incident playbooks.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or equivalent practical experience.
  • Proven experience in SysOps, Cloud Operations, SRE, or Infrastructure Support roles in production environments.
  • Strong hands-on experience with Linux and Windows system administration.
  • Experience using monitoring and observability tools such as New Relic, Prometheus, Grafana, Datadog, or equivalent solutions.
  • Solid understanding of incident management, problem management, and root cause analysis methodologies.
  • Experience working with cloud platforms such as AWS, Azure, or Google Cloud Platform.
  • Strong knowledge of disaster recovery, backup strategies, and business continuity planning.
  • Familiarity with infrastructure components such as virtual machines, compute instances, and physical servers.
  • Understanding of web and system services such as Nginx, IIS, and systemd.
  • Strong analytical and troubleshooting skills with the ability to resolve complex production issues under pressure.
  • Excellent communication and collaboration skills for cross-functional coordination.
  • Experience in high-availability, mission-critical environments is highly preferred.

Benefits

  • Competitive compensation package aligned with experience and market standards.
  • Fully remote work environment with flexible arrangements.
  • Opportunity to work on large-scale, mission-critical infrastructure systems.
  • Exposure to modern cloud technologies and advanced observability platforms.
  • Professional growth in a fast-paced, high-impact engineering organization.
  • Collaborative and cross-functional team culture.
  • Involvement in disaster recovery planning, system resilience design, and cloud operations at scale.

How Jobgether Works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

