As a Site Reliability Engineer, you will be working for our client, a leading financial institution heavily investing in Agile culture, DevOps processes, and Cloud Technologies. The new development team in Krakow, part of a long-term strategy to support a European platform, offers an exciting opportunity to contribute to the foundational stages of a critical project. This role involves ensuring system reliability, availability, and performance while supporting a dynamic, high-impact environment.
Join us, and ensure seamless system performance every day!
Krakow-based opportunity with the possibility to work 80% remotely!
Internal number #5627
responsibilities :
Managing application support operations, focusing on resiliency, availability, and monitoring system health and performance
Coordinating the resolution of production incidents and conducting post-mortems and root cause analysis (RCA) to improve processes
Investigating, triaging, and resolving production incidents with a focus on technical signals and root cause analysis
Documenting post-incident recovery steps, contributing to process improvements, identifying deviations, and creating a Knowledge Base
Actively participating in the service management community, engaging in Incident Management, Problem Management, and Service Delivery
Defining and delivering tactical and strategic service improvements across the technical and process landscape
Applying SRE principles to continuously improve platform reliability, capacity, and performance, reducing toil and enhancing observability
Developing observability tools and techniques for monitoring, alerting, incident detection, response, capacity management, and release safety
requirements-expected :
4+ years of experience in developing and supporting distributed systems written in Java
Experience with Disaster Recovery methods and processes
A methodical approach to troubleshooting and problem solving
Experience implementing and managing logging, monitoring, and alerting frameworks for hybrid cloud using tools such as Geneos, Grafana, InfluxDB, Splunk, Loki, or similar
Understanding of relational databases (RDBMS), cloud technology, Unix/Linux, and job scheduling tools such as Control-M or AutoSys
Ability to lead technical conversations with various technical support groups
Excellent communication skills and experience working in Agile methodology
offered :
Stable and long-term cooperation with very good conditions
Enhance your skills and develop your expertise in the financial industry
Work on the most strategic projects available in the market
Define your career roadmap and develop quickly by delivering strategic projects for different ITDS clients over several years
Participate in social events and training, and work in an international environment
Access to an attractive medical package
Access to the Multisport program
Access to Pluralsight
Flexible hours & remote work
benefits :
sharing the costs of sports activities
private medical care
flexible working time
fruit
integration events
corporate gym
mobile phone available for private use
computer available for private use
saving & investment scheme
no dress code
coffee / tea
drinks
Christmas gifts
birthday celebration
sharing the costs of a streaming platform subscription