Lead Data Software Engineer (Databricks)

Remote in Argentina or Mexico

We are seeking a skilled, highly driven Lead Data Software Engineer with expertise in Databricks and data streaming technologies to join our team.

In this role you will apply big data engineering, cloud platform, and real-time streaming skills to build scalable, efficient data platforms that power critical business insights.

Responsibilities
  • Architect data pipelines within Databricks using principles such as the medallion architecture for structured data management
  • Build and support both batch and real-time pipelines, including streaming tables, Delta Live Tables, Change Data Capture (CDC), and Slowly Changing Dimensions (SCD), as shown in the pipeline sketch after this list
  • Manage Databricks Asset Bundles (DABs) for deployment, versioning, and artifact management
  • Coordinate workflows, job schedules, and orchestration within Databricks to ensure operational consistency
  • Establish and maintain real-time streaming data platforms using tools like Apache Kafka, Confluent, or Redpanda
  • Utilize a Schema Registry to uphold data contracts and enforce schema compatibility, as shown in the producer sketch after this list
  • Develop efficient data processing solutions leveraging Spark, SQL, and Python
  • Handle relational and non-relational databases, including MySQL, PostgreSQL, and DynamoDB
  • Fine-tune database queries to enhance both operational and analytical performance
  • Collaborate with cross-functional teams to gather requirements and deliver effective data systems
  • Promote high-quality engineering practices by leveraging CI/CD workflows and version control tools such as Git
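
For candidates less familiar with the CDC/SCD patterns above, here is a minimal sketch of a medallion-style Delta Live Tables pipeline. The landing path, table names, and the customer_id and updated_at columns are hypothetical placeholders, and the code assumes it runs inside a Databricks Delta Live Tables pipeline, where the `spark` session and `dlt` module are provided by the runtime.

```python
# Minimal Delta Live Tables sketch: bronze -> silver with CDC into an SCD target.
# Hypothetical placeholders: the landing path, table names, and the
# customer_id / updated_at columns.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Bronze: raw change events ingested with Auto Loader")
def customers_bronze():
    return (
        spark.readStream.format("cloudFiles")    # Auto Loader incremental ingest
        .option("cloudFiles.format", "json")
        .load("/Volumes/raw/crm/customers/")     # hypothetical landing path
    )

# Silver: a streaming table kept in sync with the source via CDC,
# stored as SCD Type 2 so history is preserved.
dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze",
    keys=["customer_id"],             # primary key in the change feed
    sequence_by=F.col("updated_at"),  # ordering column for late-arriving events
    stored_as_scd_type=2,             # 1 = overwrite in place, 2 = keep history
)
```

In a full pipeline the bronze layer would also carry operation codes (insert/update/delete) from the source feed, and a gold layer would aggregate the silver table for analytics.

Likewise, a minimal sketch of schema-enforced publishing with Confluent's Python client (`confluent-kafka`) is shown below; the broker and registry URLs, topic name, and record schema are all hypothetical.

```python
# Sketch: producing Avro records validated against a Schema Registry.
# Hypothetical: broker/registry URLs, topic name, and the record schema.
from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

SCHEMA = """
{
  "type": "record",
  "name": "CustomerEvent",
  "fields": [
    {"name": "customer_id", "type": "string"},
    {"name": "status", "type": "string"}
  ]
}
"""

registry = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(registry, SCHEMA)  # registers and validates the schema
producer = Producer({"bootstrap.servers": "localhost:9092"})

event = {"customer_id": "42", "status": "active"}
producer.produce(
    topic="customers",
    # Serialization fails fast if the event violates the registered schema,
    # which is how the registry enforces the data contract.
    value=serializer(event, SerializationContext("customers", MessageField.VALUE)),
)
producer.flush()
```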
Requirements
  • 5+ years of experience in Data Software Engineering
  • Practical skills in Databricks, including Spark, Delta Lake, Unity Catalog, and workflow management
  • Expertise in building ETL/ELT pipelines, including batch and real-time streaming solutions, Change Data Capture (CDC), and Slowly Changing Dimensions (SCD)
  • Advanced skills in Spark programming, SQL queries, and Python scripting
  • Proficiency with event-driven architectures through tools like Kafka, Confluent, or Redpanda
  • Knowledge of cloud platforms such as AWS or GCP for managing data infrastructure
  • Familiarity with relational and non-relational databases, such as MySQL, PostgreSQL, and DynamoDB
  • Understanding of advanced data modeling approaches such as star and snowflake schemas for analytics (see the star schema sketch after this list)
  • Capability to work with CI/CD pipelines, Git for version control, and infrastructure-as-code tools such as Terraform
  • Effective problem-solving skills to address technical and architectural challenges
  • Strong communication skills for collaboration across diverse teams
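
For reference, the star schema bullet above refers to modeling one central fact table keyed to surrounding dimension tables. A minimal Spark SQL sketch follows; all table and column names are hypothetical.

```python
# Sketch: a minimal star schema in Spark SQL (hypothetical names).
# One fact table (fact_orders) references two dimensions via surrogate keys.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
CREATE TABLE IF NOT EXISTS dim_customer (
  customer_key BIGINT,   -- surrogate key
  customer_id  STRING,   -- natural key from the source system
  segment      STRING
)""")

spark.sql("""
CREATE TABLE IF NOT EXISTS dim_date (
  date_key INT,          -- e.g. 20240131
  dt       DATE,
  month    INT,
  year     INT
)""")

spark.sql("""
CREATE TABLE IF NOT EXISTS fact_orders (
  customer_key BIGINT,            -- references dim_customer
  date_key     INT,               -- references dim_date
  order_amount DECIMAL(18, 2)
)""")

# Analytical queries join the fact to its dimensions, e.g. revenue by segment:
spark.sql("""
SELECT c.segment, d.year, SUM(f.order_amount) AS revenue
FROM fact_orders f
JOIN dim_customer c ON f.customer_key = c.customer_key
JOIN dim_date d     ON f.date_key = d.date_key
GROUP BY c.segment, d.year
""").show()
```

A snowflake schema follows the same pattern with the dimensions further normalized into sub-dimension tables.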
Nice to have
  • Knowledge of data governance standards and compliance protocols like GDPR, CCPA, or SOC2
  • Understanding of big data and warehousing platforms such as Apache Hadoop or Snowflake
  • Relevant certifications, including Databricks Data Engineer Associate or AWS specialized credentials
We offer
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn