Data Software Engineer, Azure Databricks

Remote in Kazakhstan
Data Software Engineering

We are looking for a Data Software Engineer to help develop a scalable solution that data engineers use to build, test, deploy and monitor data pipelines. You will develop reusable libraries and SDK modules that enable teams to ship data products autonomously and efficiently, working with Databricks, Delta Lake, Azure Data Lake and SAP HANA Data Lake.

Responsibilities
  • Develop and maintain reusable SDK modules (Python) for data pipeline operations such as compaction, quality checking, schema evolution and table lifecycle management
  • Design and build framework libraries following software engineering best practices, ensuring scalability and reliability across 100+ pipelines and diverse data domains
  • Collaborate with Data Engineers who consume the framework to gather feedback, understand pain points and iterate on the SDK
  • Contribute to platform architecture discussions and performance tuning strategies
  • Write high-quality documentation and contribute to enablement materials for framework consumers
  • Help define and embed data engineering best practices through enablement, SDKs and templates
  • Ensure data quality, consistency and governance across Lakehouse environments
Requirements
  • Strong experience as a Data Engineer with proficiency in Databricks
  • 3+ years of experience building reusable libraries, SDKs or internal developer tooling
  • Knowledge of Data Mesh/Data Product concepts including data ownership, domain-oriented design and self-serve data platforms
  • Deep expertise in Delta Lake, including Delta table compaction and optimization for high-performance workloads
  • Proven experience designing and maintaining complex data pipelines on cloud object stores (ADLS, etc.)
  • Strong programming skills in Python for data engineering workloads
  • Solid understanding of Lakehouse architecture and best practices for large-scale data platforms
  • Hands-on experience with data pipeline monitoring, troubleshooting and performance tuning
  • Familiarity with CI/CD and workflow orchestration (Databricks Jobs)
  • Experience working in agile teams with a focus on ownership, autonomy and best practices
  • Excellent problem-solving skills and the ability to handle high-scale, complex data challenges
  • English proficiency at B2 level or higher
Nice to have
  • Experience with agile methodologies (Scrum, SAFe)
  • Experience developing enablement materials and technical documentation