Lead Data Scientist Jobs
EPAM is looking for Lead Data Scientists.
Data Software Engineering
Apache Spark, Microsoft Azure, Python
40 hrs/week
12+ months
- Collaborate with cross-functional teams to build and maintain the data integration solution
- Design, build, and optimize data pipelines using Apache Spark, Microsoft Azure, and Python (a minimal sketch follows this list)
- Ensure data pipelines are scalable, maintainable, and reliable
- Take data science models and make them production ready
- Develop and maintain forecasting models to support business decisions
- Ensure data quality and consistency across all data sources
- Monitor and optimize data pipelines to ensure efficient and effective data processing
- Collaborate with data scientists to develop and implement machine learning models
- Work with the team to continuously improve and optimize the data integration solution for FedEx
- Stay current with emerging technologies and trends in data software engineering
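For orientation, a pipeline in this stack might look roughly like the sketch below. This is a minimal illustration, not a prescribed design; the storage account, container, and column names are hypothetical placeholders.

```python
# Minimal PySpark batch pipeline sketch: read raw CSV from ADLS Gen2,
# clean it, and publish a partitioned Delta table. All paths and
# column names below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-ingest").getOrCreate()

raw = (
    spark.read.option("header", True)
    .csv("abfss://raw@examplestorage.dfs.core.windows.net/orders/")
)

cleaned = (
    raw.dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_date"))
    .filter(F.col("order_id").isNotNull())
)

(
    cleaned.write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("abfss://curated@examplestorage.dfs.core.windows.net/orders/")
)
```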
- At least 3 years of experience in data software engineering or similar roles
- Expertise in Apache Spark, Microsoft Azure, and Python
- Strong knowledge of forecasting models, data science, and MLOps
- Experience working with Databricks to build, maintain, and optimize data pipelines
- Ability to take data science models and make them production ready
- Experience with Git for version control
- Understanding of basic Azure concepts, including cloud environments, regions, ADLS, and compute
- Strong analytical and problem-solving skills with the ability to think critically and creatively
- Experience working in Agile development environments
- Excellent English communication skills, both written and verbal (B2+ level)
- Experience with pandas for data manipulation and analysis
- Strong knowledge of SQL and relational tables
- Understanding of statistical models and the ability to develop models utilizing Python, Spark, etc.
- Knowledge of the ML tooling space
Java
Google Cloud Dataflow, Spring, Google Cloud Platform
40 hrs/week
12+ months
- Design and create robust data pipelines using Java and Apache Beam in Google's Dataflow environment (see the sketch after this list)
- Employ the Spring framework to support our pipelines
- Use Google Cloud Platform services to deploy and manage data pipelines
- Collaborate with data scientists and analysts to implement solutions for data modeling, mining, and extraction processes
- Optimize data pipelines for performance and scalability, identify bottlenecks and implement improvements
- Develop and maintain documentation for data pipeline architectures and operational procedures
- Ensure data integrity and compliance with all governance and security policies throughout the data processing lifecycle
- Monitor pipeline performance, implementing logging and alerting mechanisms to preemptively detect and address issues
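The pipeline shape this role calls for is sketched below. For consistency with the other examples on this page it uses Beam's Python SDK; the role itself uses the Java SDK, where the same transforms apply. Project and bucket names are hypothetical.

```python
# Minimal Apache Beam pipeline sketch targeting Google Cloud Dataflow.
# Project, region, and bucket names are hypothetical placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",  # executes on Google Cloud Dataflow
    project="example-project",
    region="us-central1",
    temp_location="gs://example-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("gs://example-bucket/input/*.csv")
        | "Parse" >> beam.Map(lambda line: line.split(","))
        | "KeepValid" >> beam.Filter(lambda row: len(row) == 3)
        | "Format" >> beam.Map(",".join)
        | "Write" >> beam.io.WriteToText("gs://example-bucket/output/result")
    )
```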
- A Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field
- At least 3 years of experience in data pipeline development, backed by a strong Java and cloud-based data processing technology background
- A proven track record of designing, implementing, and optimizing data pipelines capable of processing large data sets in a cloud environment
- A thorough understanding of the Spring Framework, including Spring Boot, for building high-performance applications
- Hands-on experience with Google Cloud Dataflow and Apache Beam for building and managing data pipelines
- Experience with GCP services, particularly those related to data storage, processing, and analytics
- Proficiency in using version control systems such as Git for code management and collaboration
- B2+ English level proficiency
PHP
PHP Components and Frameworks, REST API
40 hrs/week
12+ months
- Develop and maintain web applications using PHP and React
- Ensure seamless integration of AI models into web applications to enhance functionality and user experience (see the serving-side sketch after this list)
- Collaborate with data scientists and engineers to integrate AI models
- Deploy AI solutions effectively, meeting performance, scalability, and security requirements
- Write clean, maintainable, and efficient code following best practices
- Troubleshoot and debug applications, providing timely and effective solutions to issues
- Collaborate with cross-functional teams, including data scientists, engineers, and product managers, to support AI and analytics initiatives
- Participate in code reviews and provide constructive feedback to peers
- Stay updated with emerging technologies and industry trends in AI and web development
- Proactively implement new tools and practices to enhance application performance and functionality
- Develop and maintain comprehensive documentation for software and AI integration processes
- Implement best practices for data management, ensuring data security and compliance with relevant regulations
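Integrating an AI model into a web application usually means the PHP/React front end calls the model over a REST endpoint. The sketch below shows a hypothetical serving side in Python (kept in Python for consistency with the other examples here); the route, field names, and scoring logic are illustrative only.

```python
# Hypothetical model-serving endpoint a PHP/React app would call over REST.
from flask import Flask, jsonify, request

app = Flask(__name__)

def score(text: str) -> float:
    # Stand-in for a real model; replace with an actual inference call.
    return min(1.0, len(text) / 280)

@app.post("/api/v1/score")
def predict():
    payload = request.get_json(force=True)
    return jsonify({"score": score(payload.get("text", ""))})

if __name__ == "__main__":
    app.run(port=8000)
```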
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
- A minimum of 5 years of experience in full-stack development
- 1+ years of relevant leadership experience
- Demonstrated experience in integrating AI models into web applications
- Strong proficiency in PHP, React, AWS, and MySQL
- Knowledge of data engineering tools and practices, including data warehousing and ETL processes is a plus
- Familiarity with DevOps practices and cloud platforms (AWS, Azure, or GCP) is desirable
- Excellent collaboration and communication skills, with the ability to work effectively in a team environment
- Strong problem-solving abilities and attention to detail
- Commitment to continuous learning and professional development
- Experience with machine learning frameworks and libraries
- Previous experience in a similar role in the tech industry
- Knowledge of additional programming languages such as Python or Java
- Certification in cloud technologies or data management
Data Software Engineering
Databricks, Python, Amazon Web Services
40 hrs/week
12+ months
- Build an enterprise-grade data platform
- Establish data governance and data quality
- Implement reusable Databricks components for data ingestion and analytics (a minimal sketch follows this list)
- Work collaboratively with architects, technical leads and key individuals within other functional groups
- Actively participate in code review and test solutions to ensure they meet best practice specifications
- Write project documentation
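A reusable ingestion component of the kind mentioned above can be a single parameterized function shared across sources, roughly as sketched below; the bucket, schema, and table names are hypothetical.

```python
# Sketch of a reusable Databricks ingestion component: one parameterized
# function reused for every raw source. Paths and table names are
# hypothetical placeholders, not a prescribed design.
from pyspark.sql import SparkSession, functions as F

def ingest_to_bronze(spark: SparkSession, source_path: str, table: str,
                     fmt: str = "json") -> None:
    """Load a raw source into a bronze Delta table, stamping load time."""
    df = (
        spark.read.format(fmt).load(source_path)
        .withColumn("_ingested_at", F.current_timestamp())
    )
    df.write.format("delta").mode("append").saveAsTable(table)

# The same component then covers every source landing in S3.
spark = SparkSession.builder.getOrCreate()
ingest_to_bronze(spark, "s3://example-bucket/raw/clicks/", "bronze.clicks")
ingest_to_bronze(spark, "s3://example-bucket/raw/orders/", "bronze.orders", fmt="csv")
```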
- At least 3 years of experience as a Python Developer
- Expertise in data software engineering (DSE), Python, and Azure Databricks
- Experience with data modeling and hands-on Big Data development experience
- Cloud experience in designing and administering scalable, fault-tolerant systems
- Strong data-oriented personality and compliance awareness
- Experience with Amazon Web Services and SQL
- Upper-intermediate English level (B2+)
Data Software Engineering
Databricks, Python, Microsoft Azure
40 hrs/week
12+ months
- Design, develop and maintain data processing pipelines and data products to support business needs and objectives
- Collaborate with cross-functional teams to ensure high-quality deliverables
- Provide technical leadership and guidance to other developers in the team
- Participate in code review and ensure code quality and adherence to standards
- Work with stakeholders to define requirements and develop solutions that meet their needs
- Develop, test and deploy scalable and performant data solutions
- Ensure the reliability, availability, and scalability of data systems
- Stay up-to-date with the latest Big Data technologies and trends and apply them to solve business problems
- Drive continuous improvement and optimization of data processing pipelines
- At least 3 years of experience in building data platforms
- A record of successful implementations of high-performance Big Data solutions
- Expertise in Python and Databricks for developing and implementing data solutions
- Experience with Microsoft Azure Big Data Services for data processing and storage
- Experience with event-driven architecture for building scalable and high-performance applications (a streaming sketch follows this list)
- Solid knowledge of Data Architecture and Data Modeling
- Familiarity with Agile methodologies for software development
- Excellent communication skills in spoken and written English, at an upper-intermediate level or higher
- Experience with Machine Learning and AI technologies
- Experience with Spark and Hadoop
- Experience with NoSQL databases
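An event-driven pipeline in this stack might pair Spark Structured Streaming with a Kafka-compatible broker (Azure Event Hubs exposes one), roughly as sketched below. Broker, topic, and path names are hypothetical, and authentication options are omitted for brevity.

```python
# Sketch of an event-driven pipeline on Databricks: consume events from
# a Kafka-compatible endpoint and append them to a Delta table.
# Names are hypothetical; SASL/auth options are omitted.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "example-ns.servicebus.windows.net:9093")
    .option("subscribe", "orders")
    .load()
    .select(F.col("value").cast("string").alias("payload"), "timestamp")
)

(
    events.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders")
    .outputMode("append")
    .start("/mnt/delta/orders_events")
)
```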
Data Software Engineering
Databricks, Microsoft Azure, PySpark
40 hrs/week
12+ months
- Design and implement scalable data pipelines to support our cutting-edge applications
- Ensure data quality and accuracy across all stages of data processing (a quality-gate sketch follows this list)
- Collaborate with cross-functional teams to understand business requirements and develop solutions that meet their needs
- Develop and maintain codebase in accordance with industry best practices and standards
- Troubleshoot and resolve issues in a timely and effective manner
- Optimize data processing algorithms and improve application performance
- Ensure compliance with data security and data privacy regulations
- Conduct code reviews and ensure high code quality and compliance with standards and guidelines
- Participate in architectural and technical discussions to help shape the product roadmap
- Stay up-to-date with emerging trends and technologies in data engineering and analytics
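A data-quality gate of the kind described above can be as simple as hard assertions on a DataFrame before it is published downstream; the sketch below assumes hypothetical column names.

```python
# Minimal data-quality gate sketch for a PySpark pipeline: fail fast on
# null keys or duplicates before publishing. Column names are illustrative.
from pyspark.sql import DataFrame, functions as F

def check_quality(df: DataFrame) -> None:
    total = df.count()
    null_keys = df.filter(F.col("customer_id").isNull()).count()
    dupes = total - df.dropDuplicates(["customer_id", "event_date"]).count()
    if null_keys or dupes:
        raise ValueError(
            f"quality gate failed: {null_keys} null keys, {dupes} duplicates"
        )
```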
- At least 3 years of experience as a Data Software Engineer or in similar roles
- Expertise in at least one of Python, Spark, PySpark, or SQL for building scalable and high-performance applications
- Experience with Microsoft Azure for cloud-based infrastructure and application management
- Experience using Databricks for building robust data pipelines
- Experience using Azure DevOps, GitHub, or other version control systems
- Familiarity with developing end-to-end production solutions
- Ability to tie loose ends together for solutions across systems
- Excellent communication skills in spoken and written English, at an upper-intermediate level or higher
- Experience with GCP and AWS cloud platforms
- Experience with Apache Kafka and Apache Beam for building data pipelines
- Experience with machine learning and data science tools and frameworks
Data Software Engineering
Databricks, Python, PySpark
40 hrs/week
12+ months
- Conduct data analysis and troubleshooting (a profiling sketch follows this list)
- Plan and implement new requirements/data entities on the EDL
- Provide support for integration testing
- Make sure data pipelines are scalable and efficient
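Troubleshooting here typically starts with quick profiling of the suspect table, roughly as below; the table name is a hypothetical placeholder.

```python
# Quick profiling sketch for troubleshooting a suspect table: row count,
# per-column null rates, and basic numeric statistics.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.table("edl.orders")  # hypothetical table name

print("rows:", df.count())
df.select([
    F.round(F.avg(F.col(c).isNull().cast("int")), 3).alias(f"{c}_null_rate")
    for c in df.columns
]).show()
df.describe().show()  # min/max/mean/stddev for numeric columns
```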
- 3+ years of relevant work experience
- Must-have skills: DSE, Python, and Azure Databricks
- Familiarity with EDL changes in DB Views/Stored procedures and integration testing support
- Advanced knowledge of PySpark, Azure Data Factory, and SQL
- Ability to collaborate effectively with the team
- Excellent communication skills with an upper-intermediate level of English
- Experience with HDInsight, Azure Data Lake, Data API, Spark, Scala, and Kafka will be an added advantage
Data Software Engineering
Databricks, Python, PySpark
40 hrs/week
12+ months
- Design and develop new features using the Agile development process (Scrum)
- Prioritize and ensure high-quality standards at every stage of development
- Guarantee reliability, availability, performance, and scalability of systems
- Maintain and troubleshoot code in large-scale, complex environments
- Collaborate with Developers, Product and Program Management, and senior technical staff to deliver customer-centric solutions
- Provide technical input for new feature requirements, partnering with business owners and architects
- Ensure continuous improvement by staying abreast of industry trends and emerging technologies
- Drive the implementation of solutions aligned with business objectives
- Mentor and guide less experienced team members, helping them enhance their skills and grow their careers
- Participate in code reviews, ensuring code quality and adherence to standards
- Collaborate with cross-functional teams to achieve project goals
- Actively contribute to architectural and technical discussions
- At least 3 years of production experience in Data Software Engineering
- Hands-on, with deep expertise in server-side development in Python and PySpark
- Deep expertise in Azure Data Factory for building scalable and high-performance applications
- Experience with Advanced SQL for designing and managing database schema, including procedures, triggers, and views
- Experience in Data analysis and troubleshooting
- Knowledge of integration testing in support of version control, integration, and deployment
- Experience supporting applications and systems in a production environment, ensuring timely resolution of issues
- Experience reviewing requirements and translating them into a documented technical design for implementation
- Exposure to Databricks, HDInsight, Azure Data Lake, Data API, Spark, Scala, and Kafka for application packaging and deployment
- Strong Big Data skills and a data background for designing and building scalable applications
- Excellent communication skills in spoken and written English, at an upper-intermediate level or higher
- Experience with EDL changes in DB views/stored procedures is a plus (a small sketch of such a change follows)
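A view change of the kind mentioned in the last item, applied from a Databricks job, might look like the sketch below; CREATE OR REPLACE keeps repeated deployments idempotent. Schema, view, and column names are hypothetical.

```python
# Sketch of applying an EDL-style view change idempotently from Databricks.
# All object names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE OR REPLACE VIEW edl.v_active_orders AS
    SELECT order_id, customer_id, order_date, amount
    FROM edl.orders
    WHERE status = 'ACTIVE'
""")

# Lightweight check that can back the integration-testing support above.
cols = set(spark.table("edl.v_active_orders").columns)
assert {"order_id", "customer_id"} <= cols, "view is missing key columns"
```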
Data Software Engineering
Databricks, Microsoft Azure, PySpark
40 hrs/week
12+ months
- Design and develop new features using the Agile development process (Scrum)
- Prioritize and ensure high-quality standards at every stage of development
- Guarantee reliability, availability, performance, and scalability of systems
- Maintain and troubleshoot code in large-scale, complex environments
- Collaborate with Developers, Product and Program Management, and senior technical staff to deliver customer-centric solutions
- Provide technical input for new feature requirements, partnering with business owners and architects
- Ensure continuous improvement by staying abreast of industry trends and emerging technologies
- Drive the implementation of solutions aligned with business objectives
- Mentor and guide less experienced team members, helping them enhance their skills and grow their careers
- Participate in code reviews, ensuring code quality and adherence to standards
- Collaborate with cross-functional teams to achieve project goals
- Actively contribute to architectural and technical discussions
- At least 3 years of production experience in Data Software Engineering
- Expertise in Databricks, Microsoft Azure, PySpark, Python, and SQL for building solutions in development and deploying them to production
- Experience with version control using Azure DevOps, GitHub, or similar platforms for effective project management
- Ability to develop end-to-end production solutions
- Strong experience working on one or more cloud platforms such as Azure, GCP, or AWS
- Experience in building out robust data pipelines
- Ability to tie loose ends together for solutions across systems
- Excellent communication skills in spoken and written English, at an upper-intermediate level or higher
- Experience with REST APIs and Power BI would be a plus
Data Software Engineering
Databricks, Microsoft Azure, PySpark
40 hrs/week
12+ months
- Engage in the Agile development process (Scrum) to conceive and implement innovative features
- Prioritize and uphold high-quality standards throughout each developmental phase
- Ensure the dependability, accessibility, performance, and scalability of systems
- Troubleshoot and maintain code within expansive, intricate environments
- Work in tandem with Developers, Product and Program Management, and seasoned technical professionals to furnish customer-centric solutions
- Provide technical insights for new feature requirements in collaboration with business owners and architects
- Stay abreast of industry trends and emerging technologies for continuous improvement
- Champion the execution of solutions aligned with business objectives
- Guide and mentor less seasoned team members, fostering skill enhancement and career growth
- Participate in code reviews, ensuring adherence to standards and code quality
- Collaborate seamlessly with cross-functional teams to achieve project objectives
- Actively contribute to architectural and technical discourse
- A minimum of 3 years of hands-on experience in Data Software Engineering
- Proficiency in Databricks, Microsoft Azure, PySpark, Python, and SQL for development and deployment in production
- Familiarity with Azure DevOps, GitHub (or alternative platforms), and version control for efficient project management
- Capability to develop comprehensive end-to-end production solutions
- Robust experience on one or more cloud platforms such as Azure, GCP, or AWS
- Proven track record in constructing resilient data pipelines
- Capacity to integrate disparate elements for solutions spanning multiple systems
- Exceptional communication skills in both spoken and written English, at an upper-intermediate level or higher
- Experience with REST APIs and Power BI would be an advantage
Other skills
- Full Stack Java Developer
- Full Stack Software Engineer
- Lead Software Engineer
- Software Automation Engineer
- Java Software Developer
- Python Software Developer
- Full Stack Python Engineer
- Lead Automation Tester
- Lead Data Analyst
- Lead Java Developer
- Software Engineer
- Front End Software Developer
- Lead DevOps Engineer
- Lead Cloud Engineer
- .NET Full Stack Developer
- .NET Software Engineer
- Data Software Engineer