Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

Nexus Cloud Systems
San Francisco
Estimated Salary
USD 175,000 – USD 220,000
Live Update
June 2, 2026
Deadline
June 2, 2027

Job Description

Are you obsessed with uptime, performance, and building resilient systems at scale? Nexus Cloud Systems is seeking a visionary Senior Site Reliability Engineer to join our core infrastructure team in San Francisco. You will play a pivotal role in bridging the gap between development and operations, ensuring our high-traffic microservices architecture remains bulletproof, scalable, and efficient.

You will work alongside elite engineers to shape the future of our cloud-native infrastructure, leveraging cutting-edge tools to automate, monitor, and optimize our distributed systems.

Responsibilities

  • Architect and maintain highly available, scalable cloud infrastructure on AWS and Kubernetes.
  • Automate operational tasks and infrastructure provisioning using Terraform and CI/CD pipelines.
  • Lead incident response efforts, conduct blameless post-mortems, and implement long-term fixes.
  • Implement observability solutions to gain deep insights into system performance and capacity.
  • Collaborate with development teams to embed reliability practices into the software development lifecycle.
  • Optimize cloud resource utilization to balance performance with cost-efficiency.
  • Define and track Service Level Objectives (SLOs) and Error Budgets to ensure a superior user experience.
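For candidates less familiar with the last item, the arithmetic behind an error budget can be sketched as follows. This is a minimal illustration, not Nexus Cloud Systems' actual tooling: the function names, the 30-day window, and the 99.9% target are assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime (in minutes) that an availability SLO permits over a window.

    slo_target is a fraction such as 0.999; window_days is an assumed
    rolling window, not a value taken from the job posting.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative means overspent)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failures = (1.0 - slo_target) * total_events
    actual_failures = total_events - good_events
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 500 failed requests out of 1,000,000 uses about half the budget.
    print(round(budget_remaining(0.999, 999_500, 1_000_000), 2))
```

A request-based budget like `budget_remaining` is one common way to decide when to pause feature releases in favor of reliability work; a time-based budget like `error_budget_minutes` is easier to explain to non-engineers.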

Qualifications

  • Bachelor’s or Master’s degree in Computer Science or Engineering, or equivalent practical experience.
  • 5+ years of experience in SRE, DevOps, or Software Engineering roles.
  • Deep expertise in Kubernetes, Docker, and container orchestration at scale.
  • Proficiency in Go, Python, or Ruby for automation and tool development.
  • Strong background in cloud platforms (AWS preferred) and Infrastructure-as-Code (Terraform/CloudFormation).
  • Experience with monitoring and observability stacks like Prometheus, Grafana, or Datadog.
  • Solid understanding of Linux internals, networking protocols, and distributed system architectures.

Required Skills

Kubernetes, AWS, Terraform, Go, Python, SRE, Prometheus, CI/CD, Distributed Systems, Observability

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
