
Lead Data Software Engineer

Remote in Colombia, Mexico
Data Software Engineering

We are looking for a seasoned Lead Data Software Engineer with strong full-stack development skills, a leadership mindset, and an automation-first approach to engineering within a modern cloud data warehouse stack (BigQuery/Databricks).

In this role, you will lead the design and development of scalable, production-grade data infrastructure, working alongside Engineers, Data Analysts, and Data Scientists to deliver actionable real-time insights and enable senior leadership to make strategic, data-driven decisions.

The ideal candidate brings technical expertise alongside leadership capabilities, thrives in a highly code-focused environment, and is strongly committed to automation, system optimization, and clean coding practices.

Responsibilities
  • Spearhead the design and development of high-performance, fault-tolerant data pipelines using Python and SQL, prioritizing scalability, efficiency, and automation
  • Manage the architecture and implementation of end-to-end, production-grade data systems, integrating ingestion, transformation, and model deployment workflows into robust solutions
  • Build and maintain real-time streaming pipelines and batch data workflows using BigQuery/Databricks, Apache Airflow, and DBT
  • Establish and promote clean, modular code standards, emphasizing reusability and automated solutions for manual data engineering tasks
  • Collaborate with cross-functional teams to translate complex business goals into scalable technical strategies, focusing heavily on automation and operational excellence
  • Design and integrate advanced tools for monitoring, logging, and alerting to improve the reliability and scalability of data infrastructure
  • Partner with application development teams to align backend data workflows with broader business logic and software frameworks
  • Drive discussions and decision-making processes related to architecture, pipelines, and cloud infrastructure in data engineering projects
  • Mentor and coach junior and senior engineers, encouraging a culture of continuous learning, knowledge sharing, and technical growth within the team
  • Address and resolve inefficiencies in data workflows while proactively optimizing system performance and scalability
Requirements
  • BS/MS degree in Computer Science, Software Engineering, or a related field
  • 5+ years of production-grade data engineering experience, with a focus on full-stack development and automation
  • At least 1 year of leadership experience in a relevant context
  • Expertise in Python, SQL, and data processing frameworks like Spark/PySpark for large-scale data systems
  • Understanding of modern cloud data warehousing tools such as BigQuery or Databricks, along with cloud-native architectures (AWS/GCP/Azure)
  • Hands-on experience with CI/CD pipelines, version control systems (Git), and advanced testing frameworks
  • Familiarity with containerization (Docker) and orchestration platforms (Kubernetes) for scaling data applications in distributed environments
  • Proficiency with workflow orchestration and transformation tools like Apache Airflow and DBT for building automated data workflows
  • Demonstrated experience with event-driven architectures and streaming systems such as Kafka or Kinesis for real-time data applications
  • Background in Agile, DevOps, or DataOps methodologies and practical use of infrastructure-as-code tools like Terraform or Pulumi
  • English level B2+ for effective communication
Nice to have
  • Familiarity with MySQL and visualization platforms such as Looker or Tableau, as well as advanced analytics tools like Amplitude, Snowplow, or Segment
  • Background in cloud DevOps practices, managing infrastructure and deployments on AWS, GCP, or Azure
  • Understanding of Linux/Unix system administration paired with shell scripting skills
  • Experience working with machine learning pipelines, MLOps practices, and deploying ML models to production environments
  • Proficiency in developing real-time analytics solutions using streaming systems like Apache Flink or Spark Streaming
Benefits
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn