LLM Data Scientist

Warsaw

Position: LLM Data Scientist

Location: Warsaw, Masovian Voivodeship, Poland

Company: Link Group

Posted: 10 October 2024

Job Description

Project Overview:

The GenAI CDD Support project focuses on augmenting the Customer Due Diligence (CDD) review process using generative AI solutions. The customer is currently developing risk summarization capabilities at the end of a CDD review. The project aims to scale up this solution by implementing development pipelines for new segments or countries, refining risk summaries, and delivering a fully functional end-to-end pipeline. The solution will leverage Google Cloud Platform (GCP) services, including Vertex AI, BigQuery, Cloud Run, Cloud Build, and more.

Role Overview:

We are seeking an experienced LLM Data Scientist to design, develop, and deploy next-generation agent-based software systems as part of this GenAI project. The ideal candidate will have strong expertise in prompt engineering, LLM fine-tuning, and inference tuning, and will work closely with cross-functional teams to address risks, ethical concerns, and the technical challenges of deploying AI solutions in production environments.

Key Responsibilities:

Prompt Engineering: Design and optimize prompts to generate qualitative risk summaries for various CDD segments and countries.
ML Pipeline Adjustments: Modify machine learning pipelines to suit new segments and improve performance in risk summarization.
LLM Evaluation: Conduct LLM evaluations focused on knowledge, alignment, safety, and applicability.
Information Retrieval & RAG: Work on retrieval-augmented generation (RAG) techniques to enhance information retrieval.
LLM Fine-Tuning: Perform fine-tuning of large language models using efficient strategies like LoRA, QLoRA, and other techniques.
Cloud Integration: Implement the solution on GCP, using tools like BigQuery, Vertex AI, Cloud Run, Cloud Build, and Artifact Registry for data processing and model deployment.
Collaboration: Work with cross-functional teams to ensure that AI models are ethically sound and meet the desired performance standards.

Must-Have Skills:

Prompt Engineering: Expertise in designing prompts to optimize AI outcomes.
AI Performance Analysis: Strong ability to analyze AI performance and improve model outputs.
LLM Evaluation: Knowledge and experience in evaluating large language models for alignment, safety, and applicability.
Natural Language Understanding (NLU): Experience with NLU libraries and models, such as NLTK or GPT-3.
LLM Inference Tuning: Experience tuning LLM inference for efficiency.
LLM Fine-Tuning: Proficiency in fine-tuning LLMs for specific use cases, using parameter-efficient strategies such as LoRA and QLoRA.
Cloud Experience: Hands-on experience with at least one major cloud platform (GCP, Azure, or AWS).

Nice-to-Have Skills:

Experience with GCP tools such as BigQuery, Dataflow, Dataproc, Vertex AI, and Pub/Sub for data processing and analysis.
Knowledge of CI/CD pipelines and tools such as Cloud Build and Artifact Registry.
Experience working in highly regulated environments and addressing ethical concerns in AI.
