Wrocław, Lower Silesian Voivodeship, Poland
OVH Sp. z o.o.
10. 1. 2026
About the position
About the project:
As a Public Cloud SRE focused on OpenStack, you will play a central role in ensuring the reliability, performance, and scalability of our cloud infrastructure. You will be responsible for maintaining high service availability, implementing monitoring, and driving continuous improvements in our production environment.
Responsibilities:
System resilience: develop and maintain key components of our OpenStack environment (including compute, networking, and storage) to guarantee high availability.
Monitoring & Alerting: implement and refine comprehensive monitoring, logging, and alerting systems to rapidly detect and address production issues.
Incident response: manage outages or service degradations by leading incident management processes and coordinating with cross-functional teams.
Performance tuning: analyze system trends to optimize performance and ensure the infrastructure scales.
Continuous improvement: identify opportunities for process automation and system enhancements, integrating best practices and innovative solutions into daily operations.
Documentation & Standards: maintain detailed documentation of processes, incident responses, and system architecture to support transparency and continuous learning.
Expected requirements:
SRE Methodologies: expertise in applying SRE practices, including service level objectives (SLOs), error budgets, and incident management.