Senior AI Machine Learning Engineer (Agentic Systems & LLM Evaluation)


We are seeking a Senior AI Machine Learning Engineer to lead the development of agentic systems and drive the evaluation of large language models (LLMs). In this role, you will collaborate with stakeholders to design and implement agents, build robust evaluation frameworks, and improve the effectiveness and reliability of AI systems.

Responsibilities

Build agentic systems tailored to various Software Development Lifecycle (SDLC) personas, translating requirements into actionable implementations
Partner closely with cross-functional teams to understand workflows and ensure alignment with stakeholder needs
Design and implement robust evaluation test cases for LLMs, using both LLM-as-a-judge and deterministic methods
Evaluate model performance and ensure reliability using tools such as LangSmith, Langfuse, or MLflow
Develop and fine-tune prompt frameworks to align LLMs with diverse agent requirements
Use Python and AI/ML frameworks to extend the capabilities of agentic systems
Monitor, debug, and optimize AI systems’ performance using advanced testing and automation practices
Leverage modern SDKs like LangGraph or similar frameworks to efficiently build intelligent workflows
Foster continuous improvement by incorporating feedback, testing outcomes, and insights into future iterations
Act as a thought leader on AI testing standards, governance, and best practices in agentic system development

Requirements

3+ years of experience in AI engineering, with emphasis on LLMs and agentic systems
Knowledge of modern testing standards and practices applied to agentic systems
Strong programming and development skills in Python, with hands-on experience using AI/ML frameworks and libraries
Familiarity with evaluation and monitoring tools such as LangSmith, Langfuse, or MLflow
Competency in using SDKs like LangGraph or similar frameworks for agentic system development
Understanding of LLM concepts, including prompt engineering and fine-tuning methodologies
Knowledge of testing frameworks and automation tools for validating AI systems
Analytical and problem-solving skills, with a focus on delivering reliable and effective solutions
Strong communication and collaboration skills for effective teamwork across roles
Capability to work independently, delivering high-quality AI solutions in a dynamic environment
Excellent written and spoken English (B2+ level)

Nice to have

Experience writing agentic systems using SDKs like LangGraph or similar technologies
Familiarity with evaluation tools such as LangSmith, Langfuse, or MLflow
Understanding of AI governance, safety, and ethical considerations in agentic systems
A portfolio of enterprise-grade AI solutions with multiple integrations and workflows
