Machine Learning Engineer @ Acaisoft
  • Warsaw
Warsaw, Masovian Voivodeship, Poland
Acaisoft
21. 10. 2025
About the position

You will work with a leading provider of AI evaluation and optimization solutions, trusted by multinational companies to optimize AI agents and detect performance issues in large language models.
In this role, you’ll help develop advanced reinforcement learning (RL) environments and scalable evaluation systems that guide and shape the behavior of cutting-edge AI models. The company’s mission is to enable safe, verifiable, and aligned AGI through rigorous, real-world agent evaluation.

Due to the client’s time zone, we would appreciate a candidate who can work until 5:00 p.m., or occasionally until 6:00 p.m. If you prefer working slightly later hours, that’s perfectly okay with the client - but it’s not a requirement.

This is a 100% remote position, but if you enjoy working from an office, you’re warmly welcome to join us there too.


 Join us and make a real impact! 

If you’re ready to broaden your horizons and work with an innovative company at the forefront of AI, we’d love to hear from you. You’ll help build the environments that shape how future AI systems are trained, evaluated, and aligned - and collaborate with world-class engineers and researchers on one of the most important technical challenges of our time. 
                                                    



  • 5+ years of experience in software engineering, simulation systems, data science, or ML infrastructure.
  • Strong command of Python and systems-level programming.
  • Experience designing scalable task pipelines, browser or API simulations (e.g. Playwright, Selenium), or distributed compute frameworks.
  • Understanding of RL concepts - reward modeling, environment dynamics, verifiability, evaluation, and agent interaction loops.
  • Familiarity with instrumentation, metrics, and data pipelines for RL evaluation.
  • Curiosity and conviction around building environments that steer AGI.

Responsibilities:
  • Design and implement RL environments that support large-scale agent evaluation and reinforcement learning experiments.
  • Build task generation pipelines, dynamic datasets, and scripted environments with controlled complexity and stochasticity.
  • Develop verifiers and reward models to automatically score trajectories and evaluate model reasoning.
  • Collaborate with infrastructure and systems engineers to ensure environments are scalable, reproducible, and instrumented for detailed telemetry.
  • Design APIs and orchestration frameworks for running, resetting, and evaluating agents across environments.
  • Optimize environment performance, logging, and reward reproducibility across distributed setups.

Requirements: Python, Machine learning, Reinforcement Learning

Additionally: Sport subscription, Private healthcare, Flat structure, Small teams, International projects, Free coffee, Bike parking, Free snacks, Free beverages, Free parking, In-house trainings, Modern office, Startup atmosphere, No dress code.
