Wrocław, Lower Silesian Voivodeship, Poland
DCG
2. 12. 2024
Job information
technologies-expected :
Google Cloud Platform
Prometheus
Datadog
New Relic
Ansible
Puppet
Chef
responsibilities :
Team Leadership:
- Recruit, hire, and develop a high-performing SRE team
- Provide mentorship and coaching to junior FTE SRE engineers within client teams
- Set clear goals and expectations for the team
- Develop and track Objectives and Key Results (OKRs) for the team as a whole and individual team members
Technical Strategy:
- Define and implement SRE best practices, standards, and processes
- Own end-to-end availability and performance of key services and build automation to prevent problem recurrence
- Deliver end-to-end automation using Terraform within Google Cloud for tasks such as creating a new project, adding a user to an existing project, requesting access to a new service, or enabling a new Google service in an existing project
- Expert knowledge of IAM and roles and permissions within Google Cloud
- Design various user roles considering both security and the user experience
- Design and build monitoring systems to identify potential issues proactively
- Establish Service Level Objectives (SLOs) for all offered services
- Manage cloud expenses within budget
Collaboration:
- Bring a customer-obsessed attitude and create a seamless user experience for any team requesting infrastructure services
- Work closely with development teams to identify potential reliability issues early in the development cycle
- Collaborate with security teams to maintain system security and compliance
- Excellent written and verbal communication skills
Performance Optimization:
- Analyze system metrics to identify performance bottlenecks and opportunities for improvement
- Implement capacity planning strategies to ensure system resilience under high load
- Continuously monitor and optimize system performance
requirements-expected :
8 years of experience with data structures or algorithms
5 years of experience with software development in one or more programming languages
3 years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems
Strong understanding of software development lifecycle (SDLC) and DevOps principles
Deep technical expertise in cloud computing platforms (GCP is a leading platform, but some services are hosted in Azure)
Proven experience with monitoring tools (Prometheus, Datadog, New Relic)
Experience with automation frameworks (Ansible, Puppet, Chef)
Bachelor’s degree in Computer Science, a related field, or equivalent practical experience