Middle AWS DevOps Engineer
We are looking for a Middle DevOps Engineer to join our remote team and assist us in managing AWS infrastructure, observability services, and automation of operations.
As a DevOps Engineer, you will manage AWS infrastructure with Terraform/CloudFormation, set up and modernize observability services, and automate operations with Python/Golang and GitLab CI. You will also build Docker images for multiple architectures and troubleshoot issues involving microservices in Kubernetes, AWS connectivity, service performance, Lambda functions, and Kafka. If you are passionate about DevOps and have a strong understanding of AWS, Kubernetes, and automation, we would love to hear from you.
- Manage AWS infrastructure using Terraform/CloudFormation (e.g. EKS version upgrades, blue/green deployments, scaling, right-sizing)
- Set up, tune, modernize, migrate, and decommission the observability services we provide (Cortex/Mimir, Loki, Tempo, OpenTelemetry, Grafana, Alertmanager)
- Automate routine operations programmatically using Python/Golang, GitLab CI, or custom self-service solutions based on AWS Service Catalog
- Build Docker images for multiple architectures (arm64, amd64)
- Troubleshoot issues involving microservices in Kubernetes, AWS connectivity, service performance, Lambda functions, and Kafka
- Participate in hypercare events and on-call shifts
- Engage with your mentor for continuous learning and development of technical and soft skills
- 2+ years of experience in DevOps engineering
- Deep understanding of AWS infrastructure (EKS, ECS on Fargate, Lambda, ECR, Load Balancing, VPC endpoints, Route 53, CloudWatch)
- Experience with Docker, Kubernetes, Grafana, Prometheus, Alertmanager, Helm, Terraform, CloudFormation, and GitLab CI
- Strong scripting skills in Python and Bash
- Excellent skills in troubleshooting microservices in Kubernetes, AWS connectivity, service performance, Lambda functions, and Kafka
- Experience with on-call shifts and hypercare events
- English proficiency at Upper-Intermediate level or higher
- Experience with Cortex, Tempo, Promtail/Fluent Bit, and Monitoring Mixins
- Familiarity with New Relic, Datadog, Kafka, and Elasticsearch
- Knowledge of Golang would be a plus
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library of 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn