Wrocław, Lower Silesian Voivodeship, Poland
OVH Sp. z o.o.
25. 12. 2025
Job information
about-project :
As a Public Cloud SRE specializing in OpenStack, you will play a central role in ensuring the reliability, performance, and scalability of our cloud infrastructure. You will be responsible for maintaining high service availability, implementing rigorous monitoring, and driving continuous improvements in our production environment.
responsibilities :
System Resilience: Architect and maintain key components of our OpenStack environment (compute, networking, and storage) to guarantee high availability.
Monitoring & Alerting: Implement and refine comprehensive monitoring, logging, and alerting systems to rapidly detect and address production issues.
Incident Response: Take charge during outages or service degradations by leading incident management processes and coordinating with cross-functional teams.
Performance Tuning: Analyze system metrics and trends to optimize performance and ensure the infrastructure scales.
Continuous Improvement: Identify opportunities for process automation and system enhancements, integrating best practices and innovative solutions into daily operations.
Documentation & Standards: Maintain detailed documentation of processes, incident responses, and system architecture to uphold transparency and continuous learning.
requirements-expected :
OpenStack Expertise: In-depth knowledge of OpenStack architecture and hands-on experience managing its core components (Neutron, Nova, Glance, Cinder, Keystone...).
Complex Infrastructure Management: Hands-on experience in managing and optimizing complex IT infrastructures.
Collaborative Mindset: Strong communication skills and the ability to work effectively within cross-functional and remote teams.
SRE Methodologies: Proven expertise in applying SRE practices, including service level objectives (SLOs), error budgets, and incident management.
Advanced Monitoring & Automation: Experience with modern monitoring, logging, and alerting systems as well as proficiency in automating repetitive tasks.
Performance Tuning: Strong analytical skills to interpret system metrics and optimize infrastructure performance.
Language Proficiency: Fluent in English
offered :
4 extra days off
language courses
sport card
meal card
public transport refund
partially remote work possible
private health care, life insurance
office in the Wrocław city center; kitchen stocked with a variety of teas and coffees; fresh fruit