We are seeking a remote Lead Java Developer with expertise in Apache Beam and Google Cloud Platform to join our team. In this role, you will design and implement robust data pipelines with Java and Apache Beam on Google Dataflow to process large volumes of data efficiently, build Spring-based microservices that support those pipelines, and deploy and operate them on Google Cloud Platform with high availability and reliability.
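For context, below is a minimal sketch of the kind of Beam pipeline this role works with. It is illustrative only: the class name, bucket paths, and runner flags are placeholders, not details from this posting.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.FlatMapElements;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.TypeDescriptors;

import java.util.Arrays;

public class WordCountPipeline {
  public static void main(String[] args) {
    // Options are parsed from the command line; with the Dataflow runner,
    // flags such as --runner=DataflowRunner, --project and --region select the environment.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline p = Pipeline.create(options);

    p.apply("ReadLines", TextIO.read().from("gs://example-bucket/input/*.txt"))   // placeholder path
     .apply("SplitWords", FlatMapElements
         .into(TypeDescriptors.strings())
         .via(line -> Arrays.asList(line.split("\\W+"))))
     .apply("CountWords", Count.perElement())
     .apply("FormatResults", MapElements
         .into(TypeDescriptors.strings())
         .via((KV<String, Long> kv) -> kv.getKey() + "," + kv.getValue()))
     .apply("WriteCounts", TextIO.write().to("gs://example-bucket/output/counts")); // placeholder path

    p.run();
  }
}
```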
Responsibilities
- Design and implement robust data pipelines using Java and Apache Beam in the Google Dataflow environment to process large volumes of data efficiently
- Lead and manage a team of experienced engineers
- Use the Spring Framework to develop microservices that support our pipelines
- Utilize Google Cloud Platform services to deploy and manage data pipelines, ensuring high availability and reliability
- Collaborate with data scientists and analysts to understand data requirements and implement solutions that support data modeling, mining, and extraction processes
- Optimize existing data pipelines for performance and scalability, identify bottlenecks, and implement improvements
- Develop and maintain documentation for data pipeline architectures, design decisions, and operational procedures
- Ensure data integrity and compliance with data governance and security policies throughout the data processing lifecycle
- Monitor pipeline performance and implement logging and alerting mechanisms to detect and address issues proactively
- Stay updated with the latest data processing technologies and framework advancements, exploring new tools and practices to enhance pipeline efficiency and functionality
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field
- At least 5 years of experience in data pipeline development, with a strong background in Java and cloud-based data processing technologies
- 1+ year of relevant leadership experience
- Experience with GCP services, particularly those related to data storage, processing, and analytics. Knowledge of GCP's infrastructure and security best practices
- Proven track record of designing, implementing, and optimizing data pipelines for processing large data sets in a cloud environment
- Hands-on experience with Google Dataflow (must have) and Apache Beam for building and managing data pipelines. Familiarity with the principles of parallel processing and distributed computing as applied to data processing
- Solid understanding of the Spring Framework, including Spring Boot, for building high-performance applications
- Proficiency in using version control systems, such as Git, for code management and collaboration
- Excellent analytical, problem-solving, and communication skills
- Ability to work in a fast-paced, collaborative environment
- B2+ English level proficiency
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn