Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

NexusCloud Systems
San Francisco
Estimated Salary
USD 175,000 – USD 225,000
Live Update
2 June 2026
Deadline
2 June 2027

Job Description

Are you obsessed with system uptime, scalability, and performance optimization? NexusCloud Systems is seeking a world-class Senior Site Reliability Engineer to join our high-impact engineering team in San Francisco. You will bridge the gap between development and operations, ensuring our global infrastructure remains resilient and performant at massive scale.

We operate a cutting-edge cloud-native stack and believe in automating everything. If you thrive in an environment that values code over manual toil, we want to hear from you.

Responsibilities

  • Design, build, and maintain highly available, scalable cloud infrastructure on AWS.
  • Automate manual operational tasks using Python, Go, or Bash to eliminate toil.
  • Lead incident response and perform blameless post-mortems to improve system reliability.
  • Optimize CI/CD pipelines to ensure rapid, safe, and automated deployment cycles.
  • Monitor system performance using Prometheus, Grafana, and the ELK stack to proactively identify bottlenecks.
  • Collaborate with cross-functional software engineering teams to design resilient application architectures.
  • Define and implement Service Level Objectives (SLOs) and Error Budgets to balance reliability with velocity.

Qualifications

  • 5+ years of experience in SRE, DevOps, or Systems Engineering roles.
  • Deep expertise in AWS cloud services (EKS, RDS, Lambda, VPC).
  • Proficiency in Infrastructure as Code (IaC) tools such as Terraform or Pulumi.
  • Strong experience with Kubernetes orchestration and containerization (Docker).
  • Solid programming skills in Go, Python, or Ruby for automation and tool building.
  • Strong understanding of Linux internals, networking, and security best practices.
  • Proven ability to troubleshoot complex, distributed system failures in production.

Required Skills

AWS, Kubernetes, Terraform, Python, Go, CI/CD, Prometheus, Grafana, Linux, Distributed Systems

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
