Senior Python Software Engineer for Big Data Retraining Program
Are you ready to elevate your Python engineering skills and move into the exciting field of Big Data? EPAM is offering a unique opportunity: secure a position after a single technical interview and gain Big Data expertise without affecting your title or compensation.
This 8-week retraining program is designed to transition Python engineers into the role of Big Data engineer. The curriculum is divided into three key phases: theoretical coursework led by production specialists, hands-on projects or practical tasks, and a comprehensive knowledge assessment with feedback. Covering essential topics such as data management, distributed systems, Spark, Kafka, NoSQL databases, and cloud-native services, the program builds a robust foundation in data processing platforms. Training is fully online, so you can participate from anywhere. During the program, you will:
- Explore Big Data and Hadoop: Gain a solid understanding of Big Data concepts, delve into Hadoop’s infrastructure and real-world applications, and learn about data characteristics and deployment trends
- Understand DevOps Practices: Familiarize yourself with the basics of DevOps, including continuous integration and continuous deployment (CI/CD), and how these practices streamline software development and operations
- Master Data Modeling and Architectures: Learn the essential techniques and levels of data modeling, crucial for Data Engineers and Architects, to effectively manage and interpret complex data structures
- Dive into Apache Spark: Deepen your knowledge of Spark with detailed explorations of its architecture, components, and core functionality, including Spark SQL, Spark ML, and Spark Streaming (see the short sketch after this list)
- Harness the Power of Kafka: Understand the fundamentals of Kafka, and explore Kafka Connect and Kafka Streams to manage real-time data feeds and perform stream processing
- Leverage Elastic Stack and NoSQL: Get hands-on experience with Elastic Stack for searching, analyzing, and visualizing data in real time, and explore NoSQL databases to manage large volumes of structured, semi-structured, and unstructured data
- Implement Data Flow and Pipelining: Learn data movement essentials and tools like Apache NiFi and StreamSets for effective data collection, flow, and processing
- Navigate Orchestration and Scheduling: Understand the role of orchestration in managing complex workflows using tools like Airflow and Jenkins
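To give a flavor of the hands-on work, here is a minimal PySpark sketch in the spirit of the Spark module above. It is illustrative only, not official program material: it assumes a local pyspark installation, and the input file and column names (events.csv, user_id, amount) are hypothetical.

```python
# Minimal PySpark sketch: read a CSV of events, aggregate per user,
# and write the result as Parquet. Paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("retraining-demo")
    .master("local[*]")  # run locally; on a cluster this is set via spark-submit
    .getOrCreate()
)

# Hypothetical input: events.csv with columns user_id, event_type, amount
events = spark.read.csv("events.csv", header=True, inferSchema=True)

# DataFrame-API aggregation: event count and amount totals per user
summary = (
    events.groupBy("user_id")
    .agg(
        F.count("*").alias("n_events"),
        F.sum("amount").alias("total_amount"),
        F.avg("amount").alias("avg_amount"),
    )
    .orderBy(F.desc("total_amount"))
)

summary.show(10)  # inspect the top users
summary.write.mode("overwrite").parquet("user_summary.parquet")

spark.stop()
```

The program itself goes well beyond a snippet like this, covering Spark's architecture, Spark ML, and Spark Streaming in depth.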
Requirements:
- 4+ years of production experience in IT
- Proficiency in Python, SQL, and cloud platforms (AWS, GCP, Azure)
- Experience with tools like Databricks, Spark, Docker, and Kubernetes is a plus
- English proficiency at B1+ or above
Find a vacancy that works for you. Send us your CV to receive a personalized offer.