Executive recruitment firm Monroe Consulting is currently partnered with a growing IT infrastructure provider that is looking for a Site Reliability Engineer to join its team. This role offers exciting projects, great benefits and a collaborative working culture.
Job Responsibilities:
- Understand the current infrastructure and services running on public clouds and on premises
- Understand the documents provided by teams and use them as guidelines for troubleshooting
- Ensure technical documents for new and existing processes are up to date
- Provide good observability across all our systems and infrastructure
- Maintain and develop alerts, monitoring tools and monitoring dashboards
- Closely monitor our system infrastructure using Grafana and related tools
- Respond to system, infrastructure and service-related support cases
- Identify and troubleshoot issues whenever they are reported
- Perform upgrades and apply patches to our infrastructure using CI tools
- Perform preventive maintenance on our systems and infrastructure
Job Requirements:
- 1+ years of experience in DevOps, SRE, network engineering, operations engineering or a related field is required
- Familiar with Linux and networking
- Able to work and solve problems independently when required
- Able to collaborate with various teams to resolve issues and ensure minimal service downtime
- Willing to learn new technologies such as Kubernetes, Terraform, GitLab CI/CD, ArgoCD, etc.
- Able to work in shifts, including midnight shifts
- Hands-on experience with Bash scripting
Nice to have:
- CCNA certifications or any related certifications
- AWS certifications or any related certifications
- Programming experience (Golang, Python, Bash, etc.)
- Hands-on knowledge of cloud computing, Kubernetes, CI/CD, etc.
- Hands-on knowledge of Terraform and Ansible
- Able to understand underlying source code