We are looking for a Data Scientist to join our AI team and support clients in building end-to-end Talk-To-Data (TTD) solutions powered by LLMs and GenAI models. The AI team works on cutting-edge areas of Generative AI and LLMs, with applications such as RAG, summarization, multi-agent workflows, and model fine-tuning. The team brings deep, PhD-level expertise in GenAI, NLP, and Computer Vision, making it an excellent environment for professionals eager to push their skills to new heights. We are seeking candidates for long-term engagement in GenAI and related domains.
Responsibilities:
Develop end-to-end GenAI applications such as chatbots, voicebots, and Talk-to-Data systems, including data ingestion, retrieval layers, orchestration (e.g., LangChain, LlamaIndex, LangGraph), API/backend, and a simple UI where needed.
Design and implement RAG pipelines with vector databases, hybrid search, rerankers, query transformation, and evaluation frameworks for relevance and robustness.
Perform model selection, prompting strategies, and fine-tuning (LoRA/QLoRA/SFT) for text, code, and multimodal models, including guardrails, output evaluation, and A/B testing.
Design, integrate, and optimize LLM interactions with external tools, APIs, and data sources using Model Context Protocol (MCP) connectors.
Understand business requirements and translate them into technical goals; define success metrics, audit data feasibility, and align stakeholder expectations.
Support project delivery and pre-sales initiatives.
Expected requirements:
Solid understanding of deep learning concepts.
Experience in Machine Learning, particularly in Generative AI (LLM/LMM), with a focus on NLP or multimodal models.
Experience gathering business requirements and translating them into technical plans, as well as hands-on experience with data processing, feature engineering, model evaluation, hypothesis testing, and model deployment.
Strong Python and object-oriented programming skills; working knowledge of SQL and vector databases.
Experience with Azure or GCP cloud platforms.
Knowledge of Deep Learning and GenAI libraries: NumPy, PyTorch, HuggingFace, LangChain, LangGraph, and GenAI APIs (OpenAI, Gemini).
Hands-on experience designing or operating MCP servers/clients for LLM agents.