Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

CloudScale Innovations
San Francisco
Estimated Salary
USD 175,000 – USD 225,000
Live Update
2 June 2026
Deadline
2 Jun 2027

Job Description

Are you obsessed with uptime, system performance, and building resilient infrastructure? CloudScale Innovations is seeking a Senior Site Reliability Engineer to join our core platform team. You will be the architect of our reliability strategy, ensuring that our globally distributed cloud services remain stable, scalable, and secure. We are looking for an engineer who thrives at the intersection of software development and systems operations.

You will play a pivotal role in evolving our infrastructure-as-code practices and automating the operational lifecycle of our mission-critical applications.

Responsibilities

  • Architect and maintain highly available, scalable cloud infrastructure using Terraform and Kubernetes.
  • Lead incident response efforts and conduct blameless post-mortems to improve system resilience.
  • Implement observability and monitoring solutions to ensure deep visibility into system performance (Prometheus, Grafana, ELK).
  • Automate manual operational tasks through robust CI/CD pipelines and scripting (Python/Go).
  • Collaborate with product and engineering teams to define and meet ambitious Service Level Objectives (SLOs).
  • Mentor junior engineers and champion SRE best practices across the organization.
  • Optimize cloud resource utilization to balance performance with cost efficiency.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 5+ years of experience in SRE, DevOps, or Systems Engineering roles.
  • Deep expertise in managing large-scale Kubernetes clusters in production (EKS/GKE).
  • Proficiency in at least one modern programming language (Go, Python, or Java).
  • Expert-level knowledge of AWS or GCP cloud services and networking fundamentals.
  • Strong background in IaC (Terraform, CloudFormation, or Pulumi).
  • Proven ability to troubleshoot complex distributed systems in high-traffic environments.

Required Skills

Kubernetes, Terraform, AWS, Go, Python, Observability, CI/CD, Distributed Systems, GCP

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
