Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

NexusScale Cloud Infrastructure
San Francisco
Estimated Salary
USD 175,000 – USD 225,000
Live Update
2 June 2026
Deadline
2 June 2027

Job Description

Are you obsessed with uptime, scalability, and performance? NexusScale is looking for a Senior SRE to help build the next generation of our global cloud infrastructure. You will work alongside world-class engineers to automate our systems, optimize our Kubernetes clusters, and ensure our services remain rock-solid for millions of users worldwide.

We value pragmatic engineering, blameless post-mortems, and a proactive approach to technical debt. If you are an automation-first engineer who thrives on solving complex distributed systems problems, we want to hear from you.

Responsibilities

  • Design, implement, and maintain highly available distributed systems on GCP and AWS.
  • Automate infrastructure provisioning and configuration management using Terraform and Ansible.
  • Drive capacity planning, performance analysis, and tuning of our production microservices.
  • Lead incident response efforts and conduct blameless post-mortems to improve system reliability.
  • Implement observability tooling and alerting strategies (Prometheus, Grafana, ELK stack).
  • Mentor junior engineers and promote DevOps best practices across the engineering organization.
  • Optimize cloud spend and resource utilization without compromising system performance.

Qualifications

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering.
  • Expert-level proficiency with Kubernetes, Docker, and container orchestration at scale.
  • Strong programming skills in Python, Go, or Ruby for automation and tool development.
  • In-depth knowledge of cloud architecture (AWS or GCP) and networking (TCP/IP, DNS, Load Balancing).
  • Experience with Infrastructure as Code (IaC) using Terraform or similar tools.
  • Strong problem-solving skills with the ability to troubleshoot complex production issues under pressure.
  • Excellent communication skills and the ability to collaborate effectively in a remote-friendly environment.

Required Skills

Kubernetes, Go, Terraform, AWS, GCP, Prometheus, SRE, Distributed Systems, Automation

Ready to Take On This Challenge?

Make sure your resume is ready, and submit your application before the deadline.
