Our products are a set of tools that scan public GitHub activity and private Git repositories.
They are used by different teams: Software Development and Ops, Application Security, and Threat Response. The buying decision comes from CISOs, CTOs, and Directors of Security.
By design, GitGuardian is a data-driven company. Both co-founders are former Data Scientists, and GitGuardian's first product processes all new GitHub events in real time. Our secret detection engine has been battle-tested against huge amounts of data.
That’s why building data products that provide useful insights into the business is a key responsibility within our organization: your work will matter and will be taken seriously!
Design, build and maintain the company’s central Data Warehouse: infrastructure deployment, source integration, pipeline development and optimisation, data documentation, data quality monitoring
Enrich the Enterprise Data Model by modeling business entities and events, so that reporting and analytics reach the highest levels of accuracy and quality
Stay up-to-date with the latest industry trends, technologies, and best practices in Data Engineering and contribute to the overall Data strategy and roadmap
Provide technical leadership, mentorship, and guidance to junior data engineers, including code reviews, best practices, and knowledge sharing
Implement data security and privacy best practices, including data encryption, data masking, and access controls, to protect sensitive data
You will create data features that bring high value to the business
You will work with cutting-edge Cloud Data Warehouse technology
The data ecosystem is very diverse (Amazon RDS for PostgreSQL, Elasticsearch, MongoDB, various SaaS providers)
The Data Team builds and maintains its own infrastructure, with high standards of automation and Infrastructure as Code (IaC), thanks to close collaboration with the DevOps team
You will be part of a scale-up adventure with a strong engineering culture
Our technical stack
PostgreSQL, Elasticsearch, MongoDB
AWS, Terraform, Docker, Kubernetes