Site Reliability Engineer
Do you get energy from solving complex technical IT challenges within an in-house software development company? We are looking for someone with a passion for reliability, performance, and availability. Are you the person who will bring our systems and applications to the next level?
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our IT Operations team. As an SRE, you will be responsible for ensuring the reliability, availability, and performance of our systems and applications. You will collaborate closely with our development teams, system engineers, and other stakeholders to build and maintain robust and scalable infrastructure.
- Design, implement, and maintain highly available and scalable infrastructure solutions.
- Collaborate with software development teams to ensure proper integration of software and infrastructure components.
- Monitor system performance and proactively identify and resolve any issues or bottlenecks.
- Implement and enhance automation tools for deployment, configuration management, and monitoring.
- Conduct incident response and root cause analysis to prevent system failures from recurring.
- Optimize system performance through capacity planning, performance tuning, and resource utilization analysis.
- At least 3 years of experience in maintaining and securing cloud-based infrastructure
- Automation skills in the context of Infrastructure as Code (IaC)
- Understanding of cloud infrastructure technologies (Azure) and containerization (Docker, Kubernetes)
- Solid understanding of networking concepts and protocols
- Windows Server 2016/2019, Office 365
- Microsoft SQL Server 2017/2019
- Familiarity with monitoring and logging tools (e.g. ELK stack)
- Knowledge of Python, Java, and/or another programming language is a plus