Lead Data Software Engineer
Colombia
We are searching for a highly skilled and experienced Lead Data Software Engineer to design, build, and optimize scalable data solutions on modern cloud platforms such as BigQuery or Databricks. In this position, you will lead a team of Data Engineers, advocate for automation-first development practices, and enable impactful business insights through reliable data infrastructure and engineering excellence.
Responsibilities
- Lead and mentor a team of Data Engineers, fostering growth and technical expertise
- Design and implement scalable data pipeline architectures using Python and SQL with automation-first principles
- Build and maintain batch and real-time data processing solutions using BigQuery, Databricks, Apache Airflow, and dbt
- Architect end-to-end data infrastructure solutions that adhere to CI/CD and DevOps best practices
- Develop clean, reusable code that emphasizes maintainability, scalability, and fault tolerance
- Integrate data workflows with full-stack development, bridging business applications and data engineering
- Implement monitoring, alerting, and observability systems for data infrastructure reliability
- Collaborate with cross-functional teams to address complex technical challenges and align data solutions with business goals
- Develop data APIs, self-service tools, and automation frameworks to improve data accessibility and usability
- Optimize performance and cost-efficiency of data systems and pipelines on cloud platforms
- Ensure secure, compliant data handling by implementing governance frameworks and quality monitoring
Requirements
- BS/MS in Computer Science, Software Engineering, or a related field
- 5+ years of experience in production-grade data engineering with solid expertise in automation and full-stack data development
- At least 1 year of relevant leadership experience
- Proficiency in software engineering practices such as version control (Git), CI/CD workflows, and testing frameworks
- Expertise in Python and modern cloud data platforms such as Databricks or BigQuery
- Background in cloud-native architecture using AWS, GCP, or Azure with a focus on data processing pipelines
- Familiarity with containerization and orchestration tools like Docker and Kubernetes
- Capability to implement and manage data pipeline orchestration using Apache Airflow, dbt, or similar tools
- Skills in building APIs, event-driven services, and integrating microservices into data workflows
- Understanding of DataOps and DevOps practices, including infrastructure as code and automation workflows
- Knowledge of event-driven architectures and streaming platforms like Kafka or Kinesis
- Demonstrated experience building software solutions for automated data governance, quality monitoring, and data products
- English proficiency at a B2+ level
Nice to have
- Background in MySQL, Looker/Tableau, and analytics tools such as Amplitude or Segment
- Experience with MLOps for machine learning pipelines and model deployment
- Familiarity with real-time analytics platforms and data streaming technologies
- Competency in basic Linux/Unix administration and shell scripting
- Expertise with cloud DevOps tools for CI/CD and monitoring, and infrastructure-as-code solutions such as Terraform or Pulumi
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling, and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn