Description
NMI is building a mature, product-oriented Reliability Engineering function, and we're looking for a Staff Software Engineer to play a key role in that evolution.

This role sits on the Reliability Engineering team, which focuses on improving the reliability, performance, and operational maturity of critical platform services. The team's mission is to move the engineering organization from reactive incident response toward intentional, engineered reliability through strong systems, tooling, and standards.

As a Staff Engineer, you will operate beyond a single service or codebase, designing and building reliability frameworks, platform capabilities, and guardrails that improve uptime, observability, and operational confidence. This is a highly hands-on role with strong expectations around technical leadership, ownership, and delivery.

Key responsibilities:
- Design and build reliability-focused frameworks, tooling, and standards that improve platform uptime, performance, and operational confidence.
- Drive initiatives that move reliability from reactive response to proactive engineering, emphasizing prevention, early detection, and fast recovery.
- Partner with engineering teams to embed reliability into system design, development practices, and deployment workflows.
- Establish and evolve observability practices, including metrics, logging, alerting, and dashboards that enable clear operational insight.
- Identify systemic risks and failure patterns, and lead efforts to address them through automation, architectural improvements, and process refinement.
- Contribute hands-on to production codebases, internal tools, and platform services with a focus on long-term maintainability.
- Influence technical direction across teams through design reviews, technical proposals, and clear written communication.
- Improve operational maturity through better incident practices, post-incident learning, and continuous improvement loops.
- Mentor engineers by modeling strong ownership, technical judgment, and disciplined delivery.
- Participate in on-call rotations, with a clear mandate to reduce operational load over time through engineering.
Skills and experience:
- 8+ years of experience building and operating production-grade software systems in complex environments.
- Strong experience
