Mexico
We are seeking an experienced Lead MLOps Engineer to join our Enterprise AI Products and Technology Team. You will lead and mentor a team of engineers, collaborating with data science teams to build tools, automate workflows, and establish standards for scalable machine learning solutions.
This role focuses on enhancing platform maturity, bridging technical and organizational gaps, and driving enterprise-wide AI initiatives such as clinical trial analysis, knowledge graph analytics, and deep learning-led medication discovery.
Leverage your leadership skills, MLOps expertise, and software engineering experience to drive scalability, automation, and continuous improvement across teams.
Responsibilities
- Lead and mentor a team of engineers, defining technical priorities, fostering collaboration, and guiding problem-solving efforts
- Collaborate with Data Scientists and Machine Learning Engineers to understand challenges and deliver scalable tools/platforms that streamline their workflows
- Drive continuous improvement in Machine Learning development environments, platforms, and tools to support data science initiatives at scale
- Work closely with governance and compliance functions (e.g., Cyber Security and Data Privacy) to design and implement secure systems that balance security and end-user productivity
- Adapt and optimize state-of-the-art machine learning methods for modern parallel computing environments (e.g., distributed clusters, multicore SMP, and GPU technologies)
- Champion a "production-first mindset," ensuring seamless transitions of data science projects from exploratory research to production
- Shape strategic initiatives such as defining best practices, conducting technical reviews, and aligning solutions with long-term AI platform goals
- Leverage expertise in workflow orchestration frameworks (Airflow, Argo, Kubeflow, etc.) to guide tool selection and optimization across teams
- Collaborate with enterprise-wide stakeholders to advocate for and implement scalable infrastructure using Infrastructure as Code principles
- Develop training programs to upskill teams on advanced MLOps practices and tools
- Track and report performance metrics for MLOps tools, environments, and production pipelines to stakeholders and leadership
Requirements
- BSc/MSc/PhD in Computer Science, Data Engineering, or a related quantitative or analytical field
- 5+ years of experience building and delivering production-grade software with significant expertise in Python programming (similar expertise in other languages will be considered)
- At least 1 year of relevant leadership experience
- Proven experience in software engineering, automation, and DevOps with a demonstrated ability to lead and deliver impactful projects
- Extensive experience developing, deploying, and scaling production-grade machine learning products or similar enterprise-scale software systems
- Deep understanding of and practical experience with at least one workflow orchestration framework (Airflow, Argo, Kubeflow, etc.), and the ability to mentor teams in adopting and using it
- Substantial experience deploying and managing Machine Learning or Data Science infrastructure at scale using Infrastructure as Code (e.g., Terraform, CloudFormation)
- Strong track record of working in Agile teams with proven leadership in cross-functional collaboration
- Proven ability to work effectively with governance functions and adhere to internal security standards while promoting productive workflows
- Exceptional problem-solving and communication skills, with strong stakeholder management at all levels of an organization
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn