Middle Site Reliability Engineer (Azure)
Microsoft Azure, Azure Application Insights, Azure Cosmos DB, Azure Monitor, Azure DevOps, Azure Kubernetes Service, Bash, PowerShell, Troubleshooting and tracing in distributed systems
Sorry, this position is no longer available
We are looking for a Middle Site Reliability Engineer (Azure) to join our remote team.
You will analyze how the components of a distributed system work together, drawing on a broad range of skills and tools. We are looking for someone passionate about problem-solving who can work autonomously and make decisions that help maintain the reliability and availability of our systems. The role centers on Microsoft Azure, so a strong understanding of Azure's tools and services is expected.
Responsibilities
- Monitor and analyze our systems' performance, ensuring they meet our Service Level Objectives (SLOs)
- Configure and deploy Azure cloud resources such as AKS, Cosmos DB, Key Vault, Redis Cache, Storage, Service Bus, and Application Gateway
- Develop and implement effective monitoring and alerting strategies using tools such as Azure Monitor or equivalent
- Collaborate with cross-functional teams to identify and resolve issues that may affect system reliability and availability
- Design and implement solutions that increase system reliability, availability, and scalability
- Participate in an on-call rotation, providing support outside of regular business hours
- Participate in mentorship and knowledge-sharing activities to help grow your technical and soft skills
Requirements
- At least 2 years of experience in a Site Reliability Engineer role, preferably in a Microsoft Azure environment
- Excellent knowledge of Azure Application Insights, Azure Monitor, and other Azure services
- Experience configuring and deploying Azure cloud resources such as AKS, Cosmos DB, Key Vault, Redis Cache, Storage, Service Bus, and Application Gateway
- Strong scripting skills in Bash and PowerShell
- Experience with DevOps best practices, including setting up and maintaining CI/CD pipelines using Azure DevOps or equivalent
- Ability to troubleshoot and trace issues in a distributed system
- Excellent team player, with the ability to communicate effectively with multiple teams
- Spoken and written English at an Upper-Intermediate level
Nice to have
- Availability for on-call work on weekends
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn