Lead Data Software Engineer
We are seeking a seasoned Lead Data Software Engineer with a strong full-stack development background, a leadership mindset, and an automation-first approach to engineering on a modern cloud data warehouse stack (BigQuery/Databricks).
In this role, you will spearhead the design and development of scalable, production-ready data infrastructure while mentoring and collaborating with Engineers, Data Analysts, and Data Scientists to deliver actionable real-time insights that enable senior leadership to make informed, data-driven decisions. The ideal candidate combines deep technical expertise with strong leadership qualities, thrives in a highly code-oriented environment, and is dedicated to automation, system performance, and clean coding practices.
Responsibilities
- Lead the design and development of high-performance, fault-tolerant data pipelines using Python and SQL, prioritizing scalability, efficiency, and automation
- Oversee the architecture and implementation of end-to-end, production-grade data systems, integrating ingestion, transformation, and model deployment workflows into robust solutions
- Take ownership of building and maintaining real-time streaming pipelines and batch data workflows leveraging BigQuery/Databricks, Apache Airflow, and dbt
- Establish and advocate for clean, modular code standards with a focus on reusability and automating manual data engineering tasks
- Collaborate actively with cross-functional teams to translate complex business requirements into scalable technical solutions, with an emphasis on automation and operational excellence
- Design and implement advanced tools for monitoring, logging, and alerting to enhance the reliability and scalability of data infrastructure
- Work closely with application development teams to align backend system workflows with broader business logic and software components
- Lead discussions and decision-making processes regarding architecture, pipelines, and cloud infrastructure in data engineering initiatives
- Mentor and guide junior and senior engineers, fostering a culture of technical growth, knowledge sharing, and continual improvement within the team
- Identify and resolve bottlenecks in data workflows while proactively improving system performance and scalability
Requirements
- BS/MS in Computer Science, Software Engineering, or a related field
- 5+ years of experience in production-grade data engineering, with a focus on full-stack development and automation
- At least 1 year of relevant leadership experience
- Advanced proficiency in Python, SQL, and data processing frameworks such as Spark/PySpark for large-scale data systems
- Deep expertise in modern Cloud Data Warehousing tools like BigQuery or Databricks, coupled with a strong understanding of cloud-native architectures (AWS/GCP/Azure)
- Proven hands-on experience with CI/CD pipelines, version control (Git), and advanced testing frameworks
- Advanced proficiency with containerization (Docker) and orchestration technologies (Kubernetes) for scaling data applications in distributed environments
- Extensive experience automating complex workflows with orchestration and transformation tools such as Apache Airflow and dbt
- In-depth knowledge of event-driven architectures and streaming systems (e.g., Kafka, Kinesis) to support real-time data applications
- Strong background in Agile, DevOps, or DataOps methodologies, including hands-on use of infrastructure-as-code tools like Terraform or Pulumi
- Exceptional communication, collaboration, and leadership skills, with English proficiency at B2+ level or higher
- Proficiency in MySQL and experience with visualization platforms such as Looker or Tableau, or large-scale analytics tools such as Amplitude, Snowplow, or Segment
- Proven cloud DevOps experience in managing infrastructure and deployments on platforms like AWS, GCP, or Azure
- Fundamental Linux/Unix system administration and shell scripting skills
- Practical knowledge of machine learning pipelines, MLOps techniques, and the deployment of ML models into production
- Experience delivering real-time analytics solutions using streaming technologies like Apache Flink or Spark Streaming
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn