Machine Learning Engineer
  • Warsaw
Warsaw, Masovian Voivodeship, Poland
ACAISOFT POLAND Sp. z o.o.
28. 11. 2025
Job details

technologies-expected :


  • Python
  • Playwright
  • Selenium

about-project :


  • You will be cooperating with a leading provider of AI evaluation and optimization solutions, trusted by multinational companies to optimize AI agents and detect performance issues in large language models.
  • In this role, you’ll help develop advanced reinforcement learning (RL) environments and scalable evaluation systems that guide and shape the behavior of cutting-edge AI models. The company’s mission is to enable safe, verifiable, and aligned AGI through rigorous, real-world agent evaluation.
  • Due to the client’s time zone, we would appreciate a candidate who can work until 5:00 p.m., or occasionally until 6:00 p.m. If you prefer working slightly later hours, that’s perfectly okay with the client - but it’s not a requirement.
  • This is a 100% remote position, but if you enjoy working from an office, you’re warmly welcome to join us there too.

responsibilities :


  • Design and implement RL environments that support large-scale agent evaluation and reinforcement learning experiments.
  • Build task generation pipelines, dynamic datasets, and scripted environments with controlled complexity and stochasticity.
  • Develop verifiers and reward models to automatically score trajectories and evaluate model reasoning.
  • Collaborate with infrastructure and systems engineers to ensure environments are scalable, reproducible, and instrumented for detailed telemetry.
  • Design APIs and orchestration frameworks for running, resetting, and evaluating agents across environments.
  • Optimize environment performance, logging, and reward reproducibility across distributed setups.

requirements-expected :


  • 5+ years of experience in software engineering, simulation systems, data science, or ML infrastructure.
  • Strong command of Python and systems-level programming.
  • Experience designing scalable task pipelines, browser or API simulations (e.g. Playwright, Selenium), or distributed compute frameworks.
  • Understanding of RL concepts - reward modeling, environment dynamics, verifiability, evaluation, and agent interaction loops.
  • Familiarity with instrumentation, metrics, and data pipelines for RL evaluation.
  • Curiosity and conviction around building environments that steer AGI.

offered :


  • Great atmosphere - we value a friendly, informal atmosphere, and direct contact with everyone in the company.
  • Outstanding people - we understand that great teams are about personalities, not just skills. Our team therefore brings together a fantastic blend of individuals and a management that removes roadblocks.
  • Modern technologies - we use proven, up-to-date technologies. Even if you have not used all of them, you can learn them with us!
  • Unlimited possibilities - you’ll get the opportunity to develop your qualifications thanks to sponsorship for industry meetups and conferences and working on challenging international projects with the latest technologies.
  • Private medical care and Multisport - we care about your health and wellbeing so you’ll get access to private medical care for you and your family, and partial funding for a sports card.

benefits :


  • sharing the costs of sports activities
  • private medical care
  • sharing the costs of professional training & courses
  • remote work opportunities
  • flexible working time
  • integration events
  • corporate sports team
  • no dress code
  • video games at work
  • coffee / tea
  • drinks
  • parking space for employees
  • leisure zone
  • extra social benefits
  • baby layette
  • school layette
  • employee referral program
  • charity initiatives
  • company sports team
  • Gift vouchers for kids (birthdays, Christmas, Children's Day)
