We’re looking for an MLOps Engineer for our US-based client developing a platform that supports users’ emotional well-being. The solution connects people in real-time, moderated group chats based on natural language input and AI-driven matching. Using a multi-patented approach, it identifies user needs and delivers hyper-relevant health resources.
Responsibilities:
Deploy and manage machine learning models on Kubernetes clusters.
Develop robust data pipelines for training, inference, and analytics purposes.
Monitor and manage cluster infrastructure using Prometheus, Thanos, and other observability tools.
Implement and maintain model observability frameworks using tools like Arize.
Implement A/B testing strategies, using service mesh software such as Istio, to evaluate model performance and reliability.
Develop and maintain dashboards and alerting systems to track model performance and data quality metrics.
Collaborate with data scientists and ML engineers to ensure models are robustly monitored and issues are quickly identified and resolved.
Ensure the scalability and reliability of model deployment pipelines using Kubernetes.
Stay up to date with the latest advancements in model observability and infrastructure monitoring technologies.
Requirements:
Proven experience in deploying and monitoring machine learning models in production environments.
5+ years of experience working with Docker, Kubernetes, Helm, and CI/CD pipelines, including their best practices.
5+ years of experience with observability tools such as Prometheus, Thanos, and Grafana.
Familiarity with model monitoring tools such as Arize, Evidently AI, and Alibi Detect.
Experience with A/B testing and service mesh software such as Istio.
Proficiency in using platforms like Kubeflow and OpenDataHub for model deployment and management.
Strong understanding of infrastructure monitoring and observability best practices.
Excellent problem-solving skills and the ability to troubleshoot complex issues.
Experience with cloud platforms such as AWS, Google Cloud Platform (GCP), or Azure.
Knowledge of scripting and automation tools (e.g., Bash, Python).
What we offer:
Work environment with zero micromanagement – we cherish autonomy.
100% remote work, recruitment & onboarding (unless you want to work from our HQ in Gdynia).
Really cool seaside apartment available for free for both leisure & work.
Unique memes channel.
Private medical insurance and Multisport.
We want you to join our team. We are neither an agency that gives you projects from time to time nor a huge corporation where you are just a “dev XYZ”. At Idego, you matter!