Site Reliability Engineer (SRE) Opportunity

hackajob company

Site Reliability Engineer (SRE) in United States

Remote 9 hours ago

hackajob is collaborating with Leo Technologies to connect them with exceptional tech professionals for this role.

We are looking for a seasoned Site Reliability Engineer (SRE) to join our distributed team. This is a fully remote, work-from-home opportunity.

As a key member of our DevOps team, you will be responsible for designing, implementing, and maintaining mission-critical monitoring, alerting, and incident response systems. Your work will ensure high availability, reliability, and performance of our infrastructure, supporting scalable services in production environments.

You will partner closely with engineering teams throughout the full development lifecycle, contributing to planning, design, deployment, and reliability goals.

What We Value

  • Strong engineering background in a field such as Computer Science, Software Engineering, or Mathematics.
  • At least 6 years of DevOps or site reliability experience.
  • Deep understanding of distributed systems, containerization (e.g., Docker, Kubernetes), and modern infrastructure design patterns.
  • Strong experience authoring infrastructure as code using Terraform and/or Ansible.
  • Experience with monitoring, logging, and alerting using tools like Grafana, Prometheus, ELK stack, or equivalents.
  • A deep understanding of public cloud infrastructure.
  • Proficiency with a programming language such as Python or Go.
  • Experience with PostgreSQL, Elasticsearch, and key-value stores.
  • Skill and comfort working in a fast-paced environment with dynamic objectives and quick iterations.
  • Demonstrated ability to learn continuously, work independently, and make decisions with minimal supervision.

Technologies We Use

  • Infrastructure hosted primarily on AWS, with some in Azure.
  • An extensive monitoring and alerting footprint in Grafana Cloud.
  • Backend services that are primarily Dockerized, deployed in Kubernetes, and managed by ArgoCD.
  • Backend languages consisting primarily of Elixir, Node.js, and Python.
  • TypeScript and React are central to our front-end development.
  • Terraform, CloudFormation, and Ansible are used for infrastructure deployment and automation.
  • Industry-standard build tooling and CI/CD using GitHub Actions.
  • A mix of open-source and proprietary technologies that are tailored to the problems at hand.
