The team is very international.
The SRE focus is on creating a reliable and available service for customers.
This role has multiple facets and, depending on how well it matches your skills, can entail the following activities:
- Support or perform in-depth reviews and analyses of Asset/Service implementations with regard to their monitoring and alerting setup.
- Support or perform in-depth reviews and analyses of Asset/Service implementations with regard to their resilience architecture design.
- Provide knowledge sessions on SRE topics, such as SLI/SLO definition and setup, how to perform resilience testing, reviewing DR plans, identifying toil, etc.
- Support root cause analyses and post-mortems, and help identify the improvement actions needed to prevent the incident from recurring.
- Identify structural improvements by analyzing Incident, Problem and Change data, and advocate for them in the organization.
- Provide hands-on support when application teams struggle with the implementation of ING monitoring and alerting standards.
- The SRE Expert role requires flexibility and agility in switching between technical expertise and communication skills to best serve the organization.
Skillset
The client is looking for an SRE Expert with a broad skillset, covering both modern-day and legacy IT stacks. Knowledge of coding, testing and CI/CD standards, along with a healthy dose of curiosity, is a given. You are not afraid of a challenge, no matter what the subject is. For you, impediments are there to be solved; neglecting them is a crime in your book.
Currently they are looking for someone who fits the description above and has the following technical skill set:
- Extensive knowledge of Linux (RHEL) or Windows.
- Good knowledge of SQL and familiarity with relational databases (Oracle, MS SQL) or NoSQL databases (Cassandra).
- Good knowledge of CI/CD standards (preferably Azure DevOps).
- Extensive knowledge of IT tools to share and collaborate.
- Extensive experience with monolithic application landscapes.
- Good knowledge of at least one programming language.
- Good knowledge of networking and the IPv4 and IPv6 stacks.
- Experience with monitoring and alerting tools (Prometheus, ELK, Grafana, etc.) is a must.
- Knowledge of OpenTelemetry is a bonus.
- Familiarity with Site Reliability Engineering (SRE): concepts, SLIs, SLOs, error budgets, availability reporting.
- Ability to understand both consumers and engineers.
- Problem-solving mentality.
- Strong voice and healthy scepticism.
Overall Nice To Have Skills
- Knowledge of and experience with OpenTelemetry.
- Good knowledge of IT security principles.
- Good knowledge of containers and cloud tooling.
Offer: Initial 12-month contract assignment. Hybrid working (partly in the office, partly remote), 36/40-hour work week. Full pension according to the Dutch temp labor CLA. Salary range for a fixed-term assignment: € 5.250,00 - € 6.750,00 gross per month, excl. holiday payments. Holiday payments and a 13th-month payment. Relocation services available for candidates living outside of the Netherlands.