Lead DevOps Engineer

Remote in Argentina & 4 others
DevOps & 15 others

We are seeking a Lead DevOps Engineer to design, operate, and continuously improve the AWS platform that powers a custom VDI platform and a cloud playtesting/streaming platform. This is primarily an individual contributor role that requires strong ownership and the ability to work independently while collaborating with one other team member and customer stakeholders. You will be responsible for infrastructure-as-code, container platforms, automation, CI/CD standardization, cost/performance optimization (including GPU instances), and leading troubleshooting during platform-wide degradations.

Responsibilities
  • Design, build, and maintain AWS infrastructure using Terraform
  • Manage Terraform workflows and remote state using HashiCorp Cloud Platform (HCP)
  • Own the infrastructure lifecycle, including provisioning, upgrades, decommissioning and operational hygiene
  • Operate ECS clusters that deploy and run the microservices supporting the platforms
  • Operate EKS clusters that host GitHub Actions runners, including required platform customizations
  • Right-size and tune GPU-enabled EC2 capacity to balance user experience with strict cloud cost controls
  • Continuously assess scaling behavior, utilization and performance bottlenecks
  • Implement and maintain AWS Lambda functions for automation such as cleanup tasks, on-demand provisioning and operational workflows
  • Standardize and optimize GitHub Actions pipelines for Terraform plan/apply workflows, infrastructure releases and container image build/publish/deploy processes
  • Lead troubleshooting and restoration efforts for platform-wide issues such as VDI session drops, authentication issues and machine/storage failures
  • Coordinate incident resolution across teams through investigation, mitigation and follow-up actions
  • Create and maintain runbooks, operational documentation and onboarding materials
Requirements
  • 5+ years of experience in DevOps or platform engineering roles
  • Expertise in AWS infrastructure design, provisioning and lifecycle management
  • Proficiency in Terraform and HashiCorp Cloud Platform (HCP)
  • Skills in container orchestration with ECS and EKS
  • Knowledge of GPU-enabled EC2 capacity right-sizing, cost management and performance tuning
  • Competency in AWS Lambda for event-driven automation
  • Background in CI/CD standardization with GitHub Actions pipelines
  • Capability to lead reliability engineering, troubleshooting and incident resolution
  • High ownership and accountability with the ability to work independently and deliver without close supervision
  • Strong troubleshooting and systems thinking, remaining calm and structured during incidents
  • Clear communication with both technical and non-technical stakeholders
  • Practical prioritization in a Kanban environment balancing planned work and urgent interruptions
  • English proficiency at B2 level or higher
Nice to have
  • Familiarity with Amazon GameLift Streams
  • Understanding of streaming and playtesting platform needs
  • Skills in triaging urgent ad-hoc requests outside the standard Kanban flow