Lead Data Software Engineer

Hybrid in Portugal
Data Software Engineering

We are seeking a dedicated and experienced Lead Data Software Engineer to join our team and focus on building a next-generation framework for centralized platform infrastructure.

You will oversee the development of core infrastructure components, facilitate parser pipeline integration, and uphold operational excellence through strong monitoring, cost management, and event-driven architecture practices.

Responsibilities
  • Design scalable, centralized platform architecture tailored for data parser pipelines
  • Integrate solutions for processing tabular and unstructured datasets using frameworks compatible with Unity Catalog tables and volumes
  • Implement robust observability methods to ensure consistent logging and monitoring capabilities
  • Collaborate with teams to enhance framework efficiency while reducing operational costs
  • Support ingestion workflows from the raw data layer using cloud-native technologies
  • Integrate modern infrastructure tools to extend the functionality of the framework
  • Provide mentorship and establish coding standards aligned with best practices
  • Coordinate with stakeholders on onboarding procedures across diverse parser pipelines
  • Address performance bottlenecks and implement optimization strategies for data pipelines
  • Maintain awareness of emerging technologies to ensure alignment with future framework requirements
Requirements
  • 5+ years of industry experience in software engineering with a focus on cloud systems or data infrastructure
  • 1+ years of leadership experience in relevant roles
  • Proficiency in Python and knowledge of Azure Databricks
  • Expertise in Azure Kubernetes Service, Azure Data Explorer, and Azure Data Lake Storage
  • Strong understanding of core cloud computing principles, especially Azure-related capabilities
  • Hands-on experience with Terraform for Infrastructure-as-Code and Azure DevOps for repositories, pipelines, and artifact management
  • Background in leveraging observability tools for centralized logging and monitoring
  • Skills in PySpark for processing and transforming large-scale datasets
Nice to have
  • Competency in cost management strategies for scalable cloud infrastructure
  • Familiarity with Unity Catalog for organizing tabular and unstructured data assets
  • Ability to apply event-driven designs in file ingestion workflows
We offer
  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library with 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn