Lead 3rd Line/Software Maintenance Engineer
Remote in Chile
Amazon Web Services

Sorry, this position is no longer available
We are looking for a Lead 3rd Line/Software Maintenance Engineer to join us remotely. Your main task will be to help develop a state-of-the-art infrastructure monitoring platform for our diverse clientele. As a vital part of our team, you will design, implement, and maintain an observability platform that delivers an optimal user experience for our clients around the world. Collaborating with our worldwide team of engineers, you will apply your technical skills and knowledge of Amazon Web Services to build innovative solutions.
Responsibilities
- Work closely with the team to design, implement, and maintain the observability platform, ensuring a superior user experience for our global clients
- Develop and maintain monitoring solutions for diverse infrastructure components using Amazon Web Services and other monitoring tools
- Develop and manage infrastructure as code using Terraform
- Configure and maintain Kubernetes clusters for various applications
- Partner with other teams to identify and resolve issues related to the observability platform
- Create and maintain documentation for the observability platform and its components
- Keep abreast of the latest technologies and best practices in the observability field
Requirements
- A minimum of 5 years' experience in roles such as systems engineering, DevOps, system administration, or cloud engineering
- Expertise in Amazon Web Services with a focus on infrastructure monitoring
- Solid command of Bash, Docker, Kubernetes, Linux, and Terraform
- A good understanding of networking fundamentals such as routing, DNS, and firewalls
- Proficiency in Git for version control
- At least one year of experience in a leadership role
Nice to have
- Some familiarity with languages such as Python and Go for automation
- Familiarity with Grafana and Honeycomb for visualization and tracing
- Experience with New Relic for application performance monitoring
- Experience with OpenTelemetry for distributed tracing
- Knowledge of Sumo Logic
- Knowledge of Prometheus for monitoring and alerting
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn