Senior Site Reliability Engineer
Remote in Mexico
Site Reliability Engineering
We are looking for a highly skilled Senior Site Reliability Engineer to join our remote team, contributing to a distributed system project that requires a diverse skill set and technical expertise.
As a Senior Site Reliability Engineer, you will focus on understanding how all system components interact to ensure consistent reliability and performance. If you're passionate about site reliability engineering and have a proven track record of excellence, we'd love for you to join us.
Responsibilities
- Analyze and optimize the infrastructure and services supporting the distributed system
- Ensure system reliability and availability through performance monitoring and issue resolution
- Work with cross-functional teams to create solutions that address business and user requirements
- Automate infrastructure deployment and configuration to improve process efficiency
- Conduct code reviews and help establish best practices for site reliability engineering
- Maintain comprehensive documentation for infrastructure and services to facilitate team collaboration
- Explore emerging technologies and advancements to enhance skills and deliver innovative solutions
Requirements
- At least 3 years of experience in Site Reliability Engineering, with a strong background in designing, deploying, and managing large-scale distributed systems
- Proficiency in containerization and orchestration technologies such as Docker and Kubernetes to ensure scalable service management
- Hands-on experience with monitoring and logging tools, including Grafana, to track system performance
- Familiarity with cloud platforms such as Microsoft Azure and Google Cloud Platform for cloud-based infrastructure deployment
- Expertise in scripting languages and infrastructure-as-code tools such as PowerShell, Python, and Terraform to automate operational workflows
- Competency in web technologies such as PHP and Angular for developing and maintaining web applications
- Strong collaboration and communication skills, fostering effective teamwork across departments
- Independent, decision-oriented approach with a capacity to take ownership of projects
- Upper-Intermediate or higher proficiency in English for clear and effective communication
Nice to have
- Understanding of the JavaScript and Go programming languages
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn