Senior Data Software Engineer (Databricks)
Argentina
We are looking for an experienced and highly motivated Senior Data Software Engineer with deep expertise in Databricks and data streaming technologies to join our dynamic team.
In this role, you will leverage your expertise in big data engineering, cloud technologies, and real-time data streaming to build robust, scalable, and efficient data platforms that power business-critical insights.
Responsibilities
- Implement and optimize data pipelines within Databricks, applying medallion architecture principles for efficient data organization
- Develop and maintain batch and streaming pipelines, including streaming tables, Delta Live Tables, Change Data Capture (CDC), and Slowly Changing Dimensions (SCD)
- Manage Databricks Asset Bundles (DABs) for packaging, deployment, and artifact versioning
- Oversee workflows, job orchestration, and scheduling on Databricks to ensure reliability
- Design and manage real-time streaming data platforms using technologies like Apache Kafka, Confluent, and Redpanda
- Implement a Schema Registry to enforce data contracts and schema compatibility
- Build scalable data processing solutions using Spark, SQL, and Python
- Work with relational and non-relational databases, such as MySQL, PostgreSQL, and DynamoDB
- Optimize database queries to meet operational and analytical needs
- Collaborate with cross-functional teams to define requirements and deliver data solutions
- Ensure high-quality engineering standards by leveraging CI/CD pipelines and Git for version control
Requirements
- 3+ years of experience in Data Software Engineering
- Hands-on experience with the Databricks platform, including Spark, Delta Lake, Unity Catalog, and Workflows
- Expertise in designing and implementing ETL/ELT processes, including batch and streaming pipelines, Change Data Capture (CDC), and Slowly Changing Dimensions (SCD)
- Advanced proficiency in Spark programming, SQL optimization, and Python scripting
- Practical experience with event-driven architectures using streaming technologies like Kafka, Confluent, or Redpanda
- Strong knowledge of cloud platforms such as AWS or GCP for data infrastructure management
- Proficiency in relational and non-relational databases, including MySQL, PostgreSQL, and DynamoDB
- Understanding of data modeling techniques, including star schema and snowflake schema, for analytics use cases
- Familiarity with CI/CD pipelines, version control tools like Git, and infrastructure-as-code solutions like Terraform
- Strong problem-solving skills with an ability to identify and address technical challenges
- Excellent communication skills to collaborate effectively across technical and non-technical teams
Nice to have
- Knowledge of data governance frameworks and compliance standards like GDPR, CCPA, or SOC 2
- Familiarity with complementary big data and data warehousing technologies, such as Apache Hadoop or Snowflake
- Relevant certifications, such as the Databricks Certified Data Engineer Associate or AWS cloud certifications
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling, and certification courses
- Unlimited access to the LinkedIn Learning library with 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek, and LinkedIn