Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

Nexus Cloud Infrastructure
San Francisco
Estimated Salary
USD 175,000 – USD 225,000
Live Update
2 June 2026
Deadline
2 June 2027

Job Description

Are you obsessed with system uptime, latency, and building highly scalable distributed systems? Nexus Cloud Infrastructure is seeking a Senior Site Reliability Engineer to join our core platform team in San Francisco. You will be the bridge between development and operations, ensuring our global cloud infrastructure remains resilient, performant, and secure.

We operate at a massive scale, and we value engineers who think in terms of automation, observability, and systematic problem solving. If you are passionate about pushing the boundaries of what is possible in cloud-native environments, we want to meet you.

Responsibilities

  • Design and maintain highly available, fault-tolerant infrastructure on AWS and Kubernetes.
  • Automate operational tasks using Go, Python, or Terraform to reduce manual toil.
  • Lead incident response, root cause analysis, and post-mortem discussions for production outages.
  • Improve system observability through advanced logging, distributed tracing, and real-time monitoring.
  • Develop and manage CI/CD pipelines to streamline code deployment and infrastructure provisioning.
  • Collaborate with software engineering teams to optimize application performance and architecture.
  • Define and implement Service Level Objectives (SLOs) and Error Budgets to balance reliability with velocity.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 5+ years of experience in SRE, DevOps, or Systems Engineering roles.
  • Expert-level proficiency with AWS, Docker, and container orchestration with Kubernetes.
  • Deep understanding of IaC tools like Terraform or CloudFormation.
  • Strong coding skills in Python, Go, or Ruby for automation and tool development.
  • Experience with monitoring stacks such as Prometheus, Grafana, Datadog, or ELK.
  • Deep knowledge of networking protocols (TCP/IP, HTTP/S, DNS) and load balancing strategies.
  • Excellent communication skills with the ability to lead cross-functional technical initiatives.

Required Skills

AWS, Kubernetes, Terraform, Go, Python, Observability, CI/CD, Distributed Systems, GCP

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.