Manage releases of new model versions, including rollback capabilities and consistent integration across components.
Implement performance optimizations such as load management, dynamic batching, and caching.
Develop observability systems, including structured logging, tracing, health checks, alerting, and actionable metric dashboards.
Requirements
Must Have:
5+ years of experience in ML engineering, with a focus on audio and voice technologies.
Strong background in Generative AI (LLMs, TTS) and Voice-based Conversational AI.
Proven experience in service and API engineering; streaming implementation; hardware/software compatibility validation; stress testing and latency optimization; model release management and rollback strategies; and assessing the impact of integrations on end-to-end behavior.
Nice to Have:
Experience with performance optimization techniques (load management, batching, caching).
Familiarity with observability tools and practices (logging, tracing, alerting, dashboards).