Senior Site Reliability Engineer

Remote in Mexico
Site Reliability Engineering

We are looking for a talented Senior Site Reliability Engineer to join our remote team, contributing to a distributed system project that demands proficiency across a diverse set of tools and skills.

In this role, you will analyze how all parts of the system work together to ensure optimal reliability and performance. If you are passionate about site reliability engineering and have a strong record of achievements, we encourage you to apply.

Responsibilities
  • Design and maintain infrastructure and services supporting the distributed system
  • Monitor system performance and resolve issues to ensure consistent reliability and availability
  • Collaborate with cross-functional teams to create and implement solutions aligned with both business and user needs
  • Automate deployment and configuration of infrastructure to enhance operational efficiency
  • Review code and contribute to the development of standards and practices in site reliability engineering
  • Develop and update documentation to support knowledge sharing and alignment across the team
  • Stay informed on emerging site reliability engineering technologies and trends to refine skills and expertise
Requirements
  • At least 3 years of experience in Site Reliability Engineering, demonstrating expertise in managing large-scale distributed systems
  • Proficiency in Docker and Kubernetes, leveraging these technologies to deploy and manage scalable, reliable services
  • Experience with monitoring and logging tools such as Grafana
  • Familiarity with cloud platforms such as Microsoft Azure or Google Cloud Platform to design and deploy cloud-based infrastructure
  • Strong scripting capabilities in PowerShell and Python, plus experience with Terraform for infrastructure as code, to automate infrastructure deployment
  • Understanding of web technologies such as PHP and Angular to support and maintain web applications
  • Effective communication and collaboration skills to ensure seamless teamwork with cross-functional teams
  • Autonomous decision-making capabilities to take ownership and drive projects successfully
  • Upper-Intermediate or higher fluency in spoken and written English for clear communication
Nice to have
  • Background in JavaScript and the Go programming language
Benefits
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn