Senior Data Software Engineer
Colombia
We are looking for a highly skilled Senior Data Software Engineer with a strong background in full-stack development and an automation-first approach to engineering, working within a modern cloud data warehouse stack that includes BigQuery and Databricks.
This position focuses on crafting scalable, production-ready data infrastructure while collaborating with Engineers, Data Analysts, and Data Scientists to enable real-time insights and support data-driven decision-making for senior leadership. The ideal candidate is a hands-on contributor who thrives in code-intensive environments and emphasizes automation, performance optimization, and clean code principles.
Responsibilities
- Design and develop high-performance, fault-tolerant data pipelines using Python and SQL, emphasizing scalability and automation
- Architect end-to-end production-ready data solutions, integrating ingestion, transformation, and model deployment workflows
- Build and maintain real-time streaming pipelines and batch data workflows using BigQuery, Databricks, Apache Airflow, and dbt
- Write clean and modular code, prioritizing reusability and automating manual data engineering tasks
- Collaborate with cross-functional teams to translate business needs into technical solutions, prioritizing automation-friendly designs
- Implement monitoring, logging, and alerting tools to ensure the reliability and scalability of data pipelines
- Integrate data workflows with broader application development efforts, aligning backend and business logic seamlessly
- Contribute to design discussions on architecture, pipelines, and cloud infrastructure related to data engineering projects
Requirements
- BS/MS in Computer Science, Software Engineering, or related field
- 3+ years of experience in production data engineering, with a focus on full-stack development and automation
- Proficiency in Python, SQL, and data frameworks such as Spark or PySpark for large-scale data processing
- Expertise in modern cloud data warehousing platforms such as BigQuery or Databricks, with an understanding of cloud-native architectures on AWS, GCP, or Azure
- Hands-on experience with CI/CD pipelines, version control tools like Git, and testing frameworks
- Competency in containerization and orchestration tools like Docker and Kubernetes for scalable data applications
- Understanding of workflow orchestration with tools such as Apache Airflow and dbt for pipeline automation
- Familiarity with event-driven architectures and streaming platforms, such as Kafka or Kinesis
- Background in Agile, DevOps, or DataOps practices, including infrastructure-as-code technologies like Terraform or Pulumi
- Strong communication skills in English with a minimum proficiency level of B2
Nice to have
- Familiarity with MySQL and visualization tools like Looker or Tableau, as well as large-scale analytics platforms such as Amplitude, Snowplow, or Segment
- Background in cloud DevOps, working with AWS, GCP, or Azure
- Basic proficiency in Linux or Unix system administration and shell scripting
- Understanding of machine learning pipelines, MLOps practices, and deploying ML models to production
- Experience building real-time analytics solutions with streaming technologies such as Apache Flink or Spark Streaming
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling, and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek, and LinkedIn