Lead Site Reliability Engineer (SRE)/DevOps

Remote in Argentina, Mexico
Site Reliability Engineering & 4 others

We are building a resilient cloud platform and need a Lead Site Reliability Engineer (SRE)/DevOps to drive stability, scale, and operational excellence. You will blend software engineering with systems expertise to run large, distributed, fault-tolerant services and strengthen reliability practices across teams. Apply now to help raise availability, performance, and automation across production.

Responsibilities
  • Design, build and maintain infrastructure and tooling that enable fast software development and reliable releases
  • Ensure continuous availability, performance and scalability of production systems and services
  • Implement automation tools to streamline operations and improve response to alerts and incidents
  • Collaborate with the development team to enhance system reliability and optimize performance
  • Create and maintain operational documentation and specifications for system builds and operating procedures
  • Monitor and report on service level objectives (SLOs) for application services
  • Define key performance indicators in cooperation with business and product owners
  • Promote a culture of continuous improvement, testing and automation

Requirements
  • Bachelor's or Master's degree in Computer Science, Information Technology or related field
  • Proven track record with 5+ years of experience in an SRE/DevOps role scaling and automating large-scale systems
  • Solid understanding of cloud computing services, preferably AWS, Azure or GCP
  • Hands-on experience with scripting languages such as Python and Bash, and with infrastructure-as-code tools such as Terraform and CloudFormation
  • Strong skills with containerization and orchestration tools such as Docker and Kubernetes
  • Working knowledge of CI/CD pipelines and tools such as Jenkins and GitLab CI
  • Practical familiarity with monitoring and alerting tools such as Prometheus, Grafana and New Relic
  • Excellent leadership and communication skills
  • English proficiency at B2 level or higher