Senior DevOps Engineer

Microsoft Azure, Azure API Management, Docker, Kubernetes, Prometheus, Terraform, Argo CD, Artificial intelligence, Machine Learning

We are seeking a skilled Senior DevOps Engineer who can oversee our large-scale infrastructure for high-stakes, public-facing products. Candidates should bring a wealth of hands-on experience and strategic insight to our evolving DevOps operations, driving efficiency and reliability across our systems.

Responsibilities
  • Maintain and improve the stability of our infrastructure at scale through site reliability engineering practices
  • Develop, implement, and manage CI/CD pipelines, with a focus on automation and increasing deployment frequency
  • Design and maintain infrastructure on cloud computing platforms such as AWS, GCP, or Azure
  • Utilize infrastructure-as-code tools such as Terraform and Ansible for configuration and deployment activities
  • Deploy and manage containerized applications using Docker and Kubernetes
  • Monitor system health and performance with tools such as Prometheus and Grafana, diagnosing and resolving issues promptly
  • Scale and optimize WebSocket-based infrastructure to support substantial traffic loads
  • Collaborate cross-functionally to keep project requirements, deadlines, and schedules on track
  • Provide detailed documentation and system diagrams to effectively communicate system design and architecture
Requirements
  • At least 3 years of experience in Site Reliability Engineering, especially in production environments
  • Familiarity with Python or a similar object-oriented language
  • Proficiency in cloud computing platforms such as AWS, GCP, or Azure
  • Expertise in implementing CI/CD processes
  • Competency in containerization technologies and orchestration with Docker and Kubernetes
  • Skills in monitoring tools like Prometheus and Grafana
  • Outstanding problem-solving skills and attention to detail
  • Proven track record of delivering reliable, efficient, and scalable infrastructure
Nice to have
  • Experience with Azure
  • Experience or involvement with ML/AI projects
Benefits
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn