Colombia
Join our team as a Lead DevOps Engineer and take charge of managing and optimizing cloud infrastructure for cutting-edge Generative AI applications using GCP, GKE, and Python.
Your skills will be pivotal in ensuring the performance and scalability of our AI platform. If you have a strong interest in AI and cloud technologies, we welcome your application.
Responsibilities
- Design and manage scalable, secure cloud infrastructure using GCP
- Support and integrate Python-based AI frameworks and tools
- Build automated CI/CD pipelines for streamlined operations
- Implement solutions for monitoring, logging, and alerting of AI services
- Enforce security best practices and compliance with governance standards
Requirements
- 5+ years of experience in DevOps or cloud infrastructure
- 1+ year in a relevant leadership role
- Expertise in Google Kubernetes Engine (GKE) and Vertex AI
- Proficiency in Python with experience in AI tools such as LiteLLM and Dify.AI
- Knowledge of AI governance and security best practices
- Strong background in cloud platforms, especially GCP
- English proficiency at B2+ level, spoken and written
Nice to have
- Familiarity with containerization technologies like Docker
- Background in orchestration tools, including Kubernetes
- Knowledge of monitoring systems such as Prometheus or Grafana
- Understanding of BigQuery and other GCP services
- Experience with GenAI or Agentic AI frameworks
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn