
Lead Data Software Engineer

Remote in Colombia, Mexico
Data Software Engineering

We are looking for a seasoned Lead Data Software Engineer with strong full-stack development skills, a leadership mindset, and an automation-first approach to engineering within a modern cloud data warehouse stack (BigQuery/Databricks).

In this role, you will lead the design and development of scalable, production-grade data infrastructure, working alongside Engineers, Data Analysts, and Data Scientists to deliver actionable real-time insights and enable senior leadership to make strategic, data-driven decisions.

The ideal candidate brings technical expertise alongside leadership capabilities, thrives in a highly code-focused environment, and is strongly committed to automation, system optimization, and clean coding practices.

Responsibilities
  • Spearhead the design and development of high-performance, fault-tolerant data pipelines using Python and SQL, prioritizing scalability, efficiency, and automation
  • Manage the architecture and implementation of end-to-end, production-grade data systems, integrating ingestion, transformation, and model deployment workflows into robust solutions
  • Build and maintain real-time streaming pipelines and batch data workflows using BigQuery/Databricks, Apache Airflow, and DBT
  • Establish and promote clean, modular code standards, emphasizing reusability and automated solutions for manual data engineering tasks
  • Collaborate with cross-functional teams to translate complex business goals into scalable technical strategies, focusing heavily on automation and operational excellence
  • Design and integrate advanced tools for monitoring, logging, and alerting to improve the reliability and scalability of data infrastructure
  • Partner with application development teams to align backend data workflows with broader business logic and software frameworks
  • Drive discussions and decision-making processes related to architecture, pipelines, and cloud infrastructure in data engineering projects
  • Mentor and coach junior and senior engineers, encouraging a culture of continuous learning, knowledge sharing, and technical growth within the team
  • Address and resolve inefficiencies in data workflows while proactively optimizing system performance and scalability
Requirements
  • BS/MS degree in Computer Science, Software Engineering, or a related field
  • 5+ years of production-grade data engineering experience, with a focus on full-stack development and automation
  • At least 1 year of leadership experience in a relevant context
  • Expertise in Python, SQL, and data processing frameworks like Spark/PySpark for large-scale data systems
  • Understanding of modern cloud data warehousing tools such as BigQuery or Databricks, along with cloud-native architectures (AWS/GCP/Azure)
  • Hands-on experience with CI/CD pipelines, version control systems (Git), and advanced testing frameworks
  • Familiarity with containerization (Docker) and orchestration platforms (Kubernetes) for scaling data applications in distributed environments
  • Proficiency with workflow orchestration and transformation tools like Apache Airflow and DBT for building automated data workflows
  • Demonstrated experience with event-driven architectures and streaming systems such as Kafka or Kinesis for real-time data applications
  • Background in Agile, DevOps, or DataOps methodologies and practical use of infrastructure-as-code tools like Terraform or Pulumi
  • English level B2+ for effective communication
Nice to have
  • Familiarity with MySQL and visualization platforms such as Looker or Tableau, as well as advanced analytics tools like Amplitude, Snowplow, or Segment
  • Background in cloud DevOps practices, managing infrastructure and deployments on AWS, GCP, or Azure
  • Understanding of Linux/Unix system administration paired with shell scripting skills
  • Experience working with machine learning pipelines, MLOps practices, and deploying ML models to production environments
  • Proficiency in developing real-time analytics solutions using streaming systems like Apache Flink or Spark Streaming
Benefits
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn