
Senior Site Reliability Engineer

Remote in Mexico
Site Reliability Engineering

We are looking for a highly skilled Senior Site Reliability Engineer to join our remote team, contributing to a distributed system project that requires a diverse skill set and technical expertise.

As a Senior Site Reliability Engineer, you will focus on understanding how all system components interact to ensure consistent reliability and performance. If you're passionate about site reliability engineering and have a proven track record of excellence, we'd love for you to join us.

Responsibilities
  • Analyze and optimize the infrastructure and services supporting the distributed system
  • Ensure system reliability and availability through performance monitoring and issue resolution
  • Work with cross-functional teams to create solutions that address business and user requirements
  • Automate infrastructure deployment and configuration to improve process efficiency
  • Conduct code reviews and help establish best practices for site reliability engineering
  • Maintain comprehensive documentation for infrastructure and services to facilitate team collaboration
  • Explore emerging technologies and advancements to enhance skills and deliver innovative solutions
Requirements
  • At least 3 years of experience in Site Reliability Engineering, with a strong background in designing, deploying, and managing large-scale distributed systems
  • Proficiency in containerization and orchestration technologies such as Docker and Kubernetes to ensure scalable service management
  • Hands-on experience with monitoring and logging tools, including Grafana, to track system performance
  • Familiarity with cloud platforms such as Microsoft Azure and Google Cloud Platform for cloud-based infrastructure deployment
  • Expertise in scripting languages and infrastructure-as-code tools such as PowerShell, Python, and Terraform to automate operational workflows
  • Competency in web technologies such as PHP and Angular for developing and maintaining web applications
  • Strong collaboration and communication skills, fostering effective teamwork across departments
  • Independent, decision-oriented approach with the capacity to take ownership of projects
  • Upper-Intermediate or higher proficiency in English for clear and effective communication
Nice to have
  • Understanding of JavaScript and Go programming languages
Benefits
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn