Site Reliability Engineer (SRE) - Based in London
We’re currently looking for Site Reliability Engineers (SREs) to join our Platform team.
We’re looking for SREs who are software engineers at heart - you are as comfortable writing software to solve problems as you are operating AWS or Kubernetes.
If you’re a software engineer who has some good cloud infrastructure experience already, or you’re eager to get really familiar with systems, tooling and libraries, this could be the role for you.
As a team, we’re responsible for designing, building, and operating the services we consume from AWS, along with the software we run on top, such as Kubernetes, Kafka, Redis and PostgreSQL. We’re also responsible for operating our network, and for being on-call for the things we own and run.
To achieve this, we’re organised into three squads within the Platform Universe: Platform Engineering, Data Engineering, and Operations. Each squad is responsible for solving a specific set of problems for our customers and our engineers.
We’re looking for engineers who are interested in joining our Operations SRE squad.
We’re investing a lot in modernising our platform and moving to a more sustainable architecture. Help us build a state-of-the-art microservices platform to power the future’s biggest brands.
- Design and implement automation tools and frameworks to streamline our operations and deployment processes. This will involve creating new tools as well as improving existing ones.
- Participate in architecture and design reviews to ensure that our systems are scalable, reliable, and secure. You will be working with other engineers to make sure that what we build lasts for the long term.
- Build, maintain and continuously improve our monitoring, alerting, and logging systems. This includes setting up new tools and constantly finding ways to improve our existing ones.
- Identify and troubleshoot production issues, and work with other teams to ensure that they are resolved quickly.
- Collaborate with development teams to ensure that the services they build are robust, reliable and scalable from the outset.
- Monitor and report on system performance and availability, so that our systems perform well and stay available to our users.
At MangoPay you will get to work with a lot of exciting new technology.
We rely heavily on the following tools and technologies:
- AWS Cloud
- Kubernetes (EKS)
- Git (GitLab)
- TeamCity, Octopus Deploy
- .NET Core, .NET Framework 4.8 (migrating to .NET 6)
- Entity Framework
- GraalVM (native compilation)
- RabbitMQ (MassTransit)
- SQL Server
- Kafka (MSK)