Lead AI Engineer, Agentic and RAG Systems

Remote in Georgia, & 4 others
AI Engineering

We are looking for a seasoned Lead AI Engineer who architects, builds, and operates production GenAI platforms – agentic workflows, RAG pipelines, and LLM-backed services with real users and real SLAs – while leading engineers and setting the technical direction across multiple workstreams.

This is an engineering leadership role, not a research role. The bar is reliability, latency, cost, observability, and safe deployment at scale, with end-to-end ownership from architecture through on-call, and accountability for the technical quality and delivery of the team. Typical workloads include enterprise knowledge platforms, conversational analytics, agentic automation, and LLM-augmented data products.

Responsibilities
  • Own the end-to-end architecture of GenAI platforms across multiple services and teams, defining standards, patterns, and reference implementations
  • Lead the design of agent orchestration (graph/state, conditional routing, tool calling, memory, checkpointing) in LangGraph / LangChain or equivalent, and set best practices for the team
  • Architect production RAG end-to-end: chunking, embeddings, vector stores, hybrid retrieval, reranking, caching, and grounded synthesis – and mentor engineers in building it
  • Drive the design and delivery of Python / FastAPI services – async, SSE streaming, session handling, and structured error contracts – establishing service templates and conventions
  • Define the observability and evaluation strategy (MLflow, OpenTelemetry, or equivalent) for accuracy, cost, and regression across the platform
  • Own the deployment platform on Docker + Kubernetes (EKS/AKS/GKE) with CI/CD, test, eval, and canary gates – setting release standards for AI systems
  • Lead LLM cost engineering strategy – model routing, prompt optimization, caching, token accounting, and build-vs-buy decisions at the portfolio level
  • Establish GenAI safety & governance practices: hallucination control, prompt-injection defense, PII handling, and HITL where required
  • Partner with data engineering leadership on semantic layers and pipelines (PySpark / SQL where applicable), and align roadmaps across teams
  • Mentor and grow senior and mid-level engineers through design reviews, pairing, and technical coaching; conduct hiring and technical interviews
  • Represent engineering in conversations with clients, product, and executive stakeholders; translate business goals into technical strategy and delivery plans
Requirements
  • 6+ years in software engineering, with 3+ years shipping production LLM / agentic systems (not POCs or research)
  • 1+ year of experience leading engineers or technical workstreams
  • Proven track record of owning architecture for multi-service GenAI or distributed systems in production
  • Expert-level proficiency in Python and FastAPI (async, REST, SSE)
  • Deep production expertise in LangChain and LangGraph (or equivalent production experience with LlamaIndex, AutoGen, or MCP stacks)
  • Strong background in production RAG: embeddings, chunking, and hybrid retrieval with reranking and caching – with the ability to define standards across teams
  • Advanced skills in vector databases such as Pinecone, Weaviate, pgvector, OpenSearch, or Databricks Vector Search
  • Hands-on production experience with at least one major LLM provider – AWS Bedrock (preferred), OpenAI / Azure OpenAI, or Anthropic – including model selection, routing trade-offs, and multi-provider strategy
  • Strong competency in Kubernetes and Docker in real production environments (EKS/AKS/GKE), including platform-level decisions
  • Deep expertise in cloud engineering on AWS, including cost, security, and scalability trade-offs
  • Solid command of observability and tracing tools (MLflow, LangSmith, OpenTelemetry), evaluation harnesses, and latency/cost ownership at platform scale
  • Experience designing and owning CI/CD for AI systems (GitHub Actions, Jenkins, or equivalent) with test/eval gates
  • Demonstrated experience mentoring engineers, leading design reviews, and driving technical decisions across teams
  • Strong written and spoken English (B2+ level); able to lead design discussions, present to senior stakeholders, and influence technical direction with clients and executives
Nice to have
  • Databricks depth – MLflow (tracking & serving), Vector Search, Unity Catalog / Metric Views, PySpark / SQL
  • Experience with LLM fine-tuning – PEFT, LoRA, QLoRA – and the ability to guide build-vs-fine-tune-vs-prompt decisions
  • Strong understanding of MCP servers and tool integration patterns
  • Expertise in GenAI governance & FinOps – auditability, prompt-injection hardening, PII, and token cost in regulated environments
  • Background in classical ML / DL – NLP, BERT-family, time-series, and CV