Site Reliability Engineer

Executive recruitment firm Monroe Consulting is currently partnering with a growing IT infrastructure provider that is looking for a Site Reliability Engineer to join its growing team. The role offers exciting projects, great benefits and a collaborative working culture.

Job Responsibilities:

  • Understand the current infrastructure and services running on public clouds and on premises
  • Understand the documents provided by teams and use them as guidelines for troubleshooting
  • Ensure technical documents for new and existing processes are up to date
  • Provide good observability across all our systems and infrastructure
  • Maintain and develop alerts, monitoring tools and monitoring dashboards
  • Closely monitor our system infrastructure using Grafana and related tools
  • Respond to support cases related to systems, infrastructure and services
  • Identify and troubleshoot issues whenever they are reported
  • Perform upgrades and patches to our infrastructure using CI tools
  • Perform preventive maintenance on our systems and infrastructure


Job Requirements:

  • At least 1 year of experience in a DevOps, SRE, network engineering or operations engineering related field is required
  • Familiar with Linux and networking
  • Able to work and solve problems independently when required
  • Able to collaborate with various teams to resolve issues and ensure minimum service downtime
  • Willing to learn new technologies, such as Kubernetes, Terraform, GitLab CI/CD, ArgoCD, etc.
  • Able to work in shifts, including midnight shifts
  • Hands-on experience with Bash scripting


Nice to have:

  • CCNA certifications or any related certifications
  • AWS certifications or any related certifications
  • Programming experience (Golang, Python, Bash, etc.)
  • Hands-on knowledge of cloud computing, Kubernetes, CI/CD, etc.
  • Hands-on knowledge of Terraform and Ansible
  • Able to understand underlying source code