Senior Site Reliability Engineer – Platform Squad

Flip · Germany

New
Senior · 🇬🇧 English
Kubernetes · Azure · Go · Python · Pulumi · Infrastructure as Code · Observability · Loki · Grafana · Tempo · Mimir · Prometheus · ELK · SLI · SLO · Error budgets

Job description

About the role

As a Senior Site Reliability Engineer in Flip’s Platform Squad, you will own critical reliability domains end‑to‑end and shape the technical direction of the platform. You will lead architectural decisions, mentor teammates, and continuously raise the reliability bar for a high‑throughput, globally scaled SaaS product.

Key responsibilities

  • Co‑own the architecture of Azure cloud infrastructure and Kubernetes clusters, ensuring high throughput and availability.
  • Define and implement a resilience strategy covering global scaling, zero‑downtime deployments, rollbacks, and disaster recovery.
  • Evolve the observability stack (Loki, Grafana, Tempo, Mimir, Prometheus, ELK) into a trusted foundation for engineers.
  • Improve the Infrastructure‑as‑Code platform with Pulumi, reducing toil and enabling self‑service infrastructure.
  • Lead major platform incidents, conduct blameless post‑mortems, and drive systemic improvements.
  • Mentor squad members, run RFCs and design reviews, and help engineers grow into stronger SREs.
  • Partner with the squad to shape the platform roadmap and future direction.

Required profile

  • 5+ years of hands‑on experience as an SRE, Platform Engineer, or DevOps Engineer, or in a similar role.
  • Proven track record building and operating high‑throughput, highly available production systems.
  • Deep production‑level experience with Kubernetes on any hyperscaler (Azure preferred).
  • Strong expertise in modern observability stacks and a clear understanding of SLIs, SLOs, and error budgets.
  • Solid software development skills in Go (preferred) or Python.
  • Hands‑on experience with Infrastructure as Code, preferably Pulumi.

Required skills

  • Kubernetes
  • Azure cloud
  • Go
  • Python
  • Pulumi (IaC)
  • Observability tools: Loki, Grafana, Tempo, Mimir, Prometheus, ELK
  • SLI/SLO management and error budgeting
  • High‑throughput system design
  • Incident management and post‑mortem analysis

Frequently asked questions

The recruiter has not publicly disclosed the salary. You can apply and negotiate directly with Flip.

Posted 1 week ago

Expires in 1 month