Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

CloudScale Dynamics
San Francisco
Estimated Salary
USD 175,000 – USD 220,000
Live Update
June 2, 2026
Deadline
June 2, 2027

Job Description

Are you obsessed with system uptime, performance at scale, and the elegance of automated infrastructure? CloudScale Dynamics is looking for a Senior Site Reliability Engineer to join our high-impact engineering team in San Francisco. You will be the architect of our reliability strategy, bridging the gap between development and operations to ensure our global platform remains resilient, performant, and secure.

You will work alongside elite software engineers to define error budgets, lead incident responses, and implement cutting-edge observability solutions. If you thrive in high-stakes environments and love solving complex distributed systems problems, we want to hear from you.

Responsibilities

  • Design, build, and maintain scalable, high-availability cloud infrastructure on AWS/GCP.
  • Drive capacity planning and performance tuning for high-traffic microservices.
  • Lead post-mortem analysis and implement long-term fixes to prevent recurrence of system incidents.
  • Develop automation tools to manage infrastructure-as-code (Terraform) and CI/CD pipelines.
  • Implement advanced monitoring, logging, and tracing solutions (Prometheus, Grafana, ELK).
  • Champion SRE best practices across engineering squads, including code reviews and architectural audits.
  • Participate in a collaborative on-call rotation to ensure 99.99% service availability.

Qualifications

  • Bachelor’s degree in Computer Science or equivalent practical experience.
  • 5+ years of experience in SRE, DevOps, or Systems Engineering roles.
  • Expertise in Linux system internals and networking (TCP/IP, DNS, HTTP, TLS).
  • Advanced proficiency in at least one language: Go, Python, or Java.
  • Deep understanding of container orchestration platforms, specifically Kubernetes.
  • Proven experience with IaC tools such as Terraform or Pulumi.
  • Strong problem-solving skills and the ability to remain calm under pressure during outages.

Required Skills

AWS, Kubernetes, Terraform, Go, Python, Observability, Distributed Systems, Linux

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
