Senior Site Reliability Engineer - DevOps

Hybrid in Portugal: Lisbon
Site Reliability Engineering

We are seeking a Senior Site Reliability Engineer to support a global execution platform and deliver high-quality solutions to trading desks and clients.

You will work closely with top specialists, developing your skills in system management, monitoring, and low-latency technology. Apply now to be part of a team driving innovation in financial technology.

Please note that working from the customer's office in Lisbon is required 2-3 days per week.

Responsibilities
  • Develop and implement monitoring, alerting, and incident response strategies
  • Automate routine tasks and processes to improve efficiency
  • Collaborate with software engineering teams to design and deploy reliable, scalable systems
  • Deploy production changes with precision to maintain platform integrity
  • Manage incidents, including detailed analysis and reporting, to ensure high service levels
  • Participate in on-call rotations to support critical systems and services
  • Communicate effectively with team members to resolve issues promptly
  • Maintain documentation for operational procedures and system configurations
  • Continuously improve system reliability and performance through proactive measures
Requirements
  • Strong knowledge of Unix/Linux systems and networking, with 3+ years of experience
  • Proficiency in Unix/Linux shell scripting and programming languages such as Python, Perl, C, C++, or Java
  • Experience with monitoring and observability tools like ITRS Geneos, Dynatrace, Prometheus, and Grafana
  • Ability to troubleshoot complex systems and resolve issues efficiently
  • Experience working in high-availability, high-traffic environments
  • Bachelor’s or Master’s degree in IT engineering or a related field
  • Ability to work effectively in a team and adapt to new environments
  • Self-motivated with strong problem-solving and issue follow-up skills
  • Excellent written and verbal communication skills, with English at B2 level or higher
Nice to have
  • Experience with log management tools such as Splunk, ELK, Graylog, or Loki
  • Knowledge of network monitoring tools like Corvil
  • Familiarity with databases including Oracle, PostgreSQL, MySQL/MariaDB, or KDB/q
  • Experience with messaging systems such as IBM MQ, Tibco, Solace, LBM, or Kafka
  • Familiarity with Infrastructure as Code tools like Ansible or Terraform