We are seeking a Lead DevOps Engineer with expertise in incident and request management, and hands-on experience with monitoring and observability tools such as Dynatrace, Grafana, and Splunk.
This role focuses on monitoring setup, tool administration, and resolving medium-complexity tickets, ensuring robust support and operational excellence across the organization.
Responsibilities
- Develop and maintain documentation outlining best practices for logging and monitoring within the company
- Conduct regular audits to verify logging and monitoring practices align with company policies and industry standards
- Participate in cross-functional discussions and initiatives to promote logging and monitoring best practices throughout the organization
- Manage monitoring, alerting, operability, and observability for applications using tools like Dynatrace, Splunk, and Grafana
- Triage incoming tickets, update ticket details, and assess urgency for appropriate response
- Review documentation to escalate tickets that require troubleshooting beyond Level 2 capabilities
- Provide warm handoff notes for tickets escalated to higher support levels
- Create and leverage documentation for handling standard incidents and requests
- Define average completion time per ticket and establish Service Level Objectives (SLOs) for each product request type
- Review and present metrics and escalated tickets regularly to document and improve the support process
- Manage incidents and requests related to monitoring setup and tool administration, utilizing JIRA for ticket tracking
- Be available for monitoring and escalation during off-hours and weekends, and carry pager duty for emergency situations
Requirements
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent experience
- Minimum 5 years of relevant professional experience in DevOps or related fields
- At least one year of experience in people management or leading a team of 5 or more members
- Strong understanding of observability, including monitoring, logging, and tracing practices
- Hands-on experience with Dynatrace, Splunk, Grafana, and other monitoring and logging tools for application and infrastructure management
- Experience with Azure logging and monitoring tools such as Log Analytics, Azure Monitor, and App Insights
- Proven ability to operate high-availability, fault-tolerant, scalable, distributed software in production environments
- Excellent oral and written communication skills in English at B2+ level or higher
