Senior Site Reliability Engineer
  • Kraków
Kraków, Lesser Poland Voivodeship, Poland
AVSystem
13.12.2025
About the position

As a Senior Site Reliability Engineer, you’ll be at the heart of our most critical operations, building the fault-tolerant systems that serve our most important clients: Communications Service Providers (CSP).

We build, test, launch, and operate the complex, high-stakes systems for our global telco customers. Your mission is to ensure the reliability, efficiency, and performance of our core products (like UMP, CEM, BSAP, and DHCP) across both our cloud and complex on-premise deployments.

This is no small task. Our products handle hundreds of millions of devices across 100+ large deployments worldwide, run by industry giants like T-Mobile, Play, and Vodafone, as well as through our Cloud offering.

As we embrace more cloud-native and Kubernetes-based deployments, we're facing new architectural challenges. We aren't just looking for someone to maintain systems; we are looking for an experienced engineer who loves to solve complex operational problems and is passionate about building the automation that will lead us forward.

Requirements

  • 5+ years of professional experience in Site Reliability Engineering, DevOps, or a related role such as Systems or Software Engineering.
  • Advanced proficiency in a high-level language such as Python or Go, with the ability to design and build complex, maintainable automation services and frameworks.
  • Deep expertise with cloud infrastructure (e.g., GCP, AWS, or Azure) and container orchestration (Kubernetes).
  • Proven experience with Infrastructure as Code and configuration management (e.g., Terraform, Ansible).
  • Expert-level understanding of networking (TCP/IP, OSI) and Unix/Linux systems (Ubuntu, RHEL).
  • Expertise in designing, implementing, and managing monitoring and observability tools (e.g., Prometheus, Grafana, Zabbix).
  • A strong sense of ownership, and the ability to lead technical discussions and mentor other engineers.
  • Proficiency in English (B2+).

A huge plus if you have experience with:

  • A formal SRE role in a previous company.
  • Database setup and administration (e.g., MongoDB, Redis).
  • Performance tuning and debugging of JVM-based applications.
  • Building and scaling distributed systems.

Responsibilities

  • Design, build, and maintain complex, maintainable automation services and frameworks to eliminate toil and scale our operations.
  • Proactively identify, debug, and resolve complex performance and reliability issues within our core product codebases.
  • Communicate directly with technical customer teams to troubleshoot, manage, and resolve complex production issues.
  • Lead blameless postmortems and Root Cause Analyses (RCAs) for complex incidents, driving preventative measures.
  • Establish and monitor Service Level Indicators (SLIs) to align the team with availability and latency objectives.
  • Participate in an additionally paid 24/7 on-call rotation, responding to and resolving critical system issues.
  • Mentor junior and mid-level engineers through code reviews, design discussions, and pair programming.
  • Collaborate with development teams on feature design and architecture to ensure reliability, scalability, and operability from the start.
  • Set up and configure software, networks, and operating systems across bare metal, VMs, and cloud/Kubernetes infrastructure.
  • Drive improvements to our monitoring and observability stack (Prometheus, Grafana, Loki) to provide a comprehensive view of system health.

What we offer

  • Freedom and responsibility. Our goal is to inspire people more than manage them. We want our teams to do what is best for our products. This, in turn, generates a sense of responsibility which drives us to do great work.
  • Technical challenges: our customers rely on the reliability of our products to generate revenue in their business. The telco industry is ever-growing and needs us to support that growth.
  • Open-source contribution opportunities.
  • A team of highly skilled and humorous colleagues.
  • Access to the best tools and equipment available in the market.
  • A MacBook Pro / ThinkPad with 2 monitors.
  • Company events and team building activities.
  • Multiple career paths and employee development options – we want you to develop into a tech lead in the future, but we’ll support you in getting another dream role in site reliability, management, product development or sales.
  • Flexible working hours/remote work when you need it
  • Trainings and conferences
  • Multisport card
  • Kitchen full of snacks and treats (including Good Lood ice cream)
  • Car parking area and bike room
  • A relaxed work atmosphere – no dress code, no open space

Come join the best!

Thank you for your interest in the Senior Site Reliability Engineer position.

Haven't found a perfect match? Send your CV anyway! Email us at jobs@avsystem.com or just click "Apply."
