Senior Data Scientist
Reference number: PL24/RL/SeniorDataScientist/Warsaw
At Cyclad, we work with top international IT companies to boost their potential in delivering outstanding, cutting-edge technologies that shape the world of the future. We are seeking an experienced Data Scientist to join one of our customers as part of their AI Research and Engineering Team. In this role, you'll drive high-impact NLP projects that augment the expertise of their lawyers. Your expertise in state-of-the-art NLP libraries and advanced prompting techniques will be instrumental in developing innovative solutions such as automating first-draft agreements using custom-built, fine-tuned Large Language Models, enhancing document analytics through metadata extraction and categorization, and generating tailored risk factors for M&A prospectuses using LLMs.
Project information:
- Industry: Legal services
- Office Location: Warsaw, Poland
- Remote work: Yes, with 1-2 days in the office per month
- Contract type: Employment contract only (no B2B option)
- Seniority level: Senior
- Methodology: Agile/Scrum
- Contract length: Indefinite
Project scope:
- Research and develop state-of-the-art AI systems to solve real-world challenges within LegalTech.
- Explore new AI methods through independent study, attending industry-leading conferences, and developing frameworks for scientific experimentation.
- Collaborate with practice teams, data scientists, and data engineers to design, develop, and deploy data science solutions that leverage NLP and large language models to address complex legal challenges, ensuring ethical AI practices and legal compliance.
- Apply NLP techniques for extracting predefined categories of information from large-scale, unstructured data across multi-domain and multilingual legal documents, with a focus on data privacy and security.
- Develop and optimize LLM-based agentic workflows (e.g., LangChain, AutoGen), ensuring model interpretability and explainability for legal applications.
- Develop high-quality, production-ready code for internal use cases, including document drafting, data extraction, and labeling, ensuring scalability, maintainability, and adherence to best practices.
- Design and execute rigorous research frameworks and experimentation plans to evaluate the performance and impact of NLP and Generative AI solutions in the legal context.
- Implement and maintain model monitoring systems to ensure ongoing compliance with legal and ethical standards.
Requirements:
- 5+ years in Data Science with a strong track record in NLP projects, especially document analysis
- Advanced Python skills and expertise in NLP libraries (e.g., LangChain, Hugging Face)
- Proficient in deep learning frameworks (e.g., PyTorch) and large language models (LLMs), with advanced prompt engineering and fine-tuning experience (e.g., LoRA)
- Experience with Azure cloud (AI, ML, Cognitive Services); GCP Vertex AI is a plus
- Skilled in Azure DevOps for version control, with experience in model interpretability and explainability in legal AI
- Familiar with LLMOps for deployment, monitoring, and maintenance
- Relevant conference publications (ACL, EMNLP, NeurIPS, etc.)
- Strong communicator with data-driven collaboration skills
- PhD in Computer Science, Data Science, or AI/ML, or a Master's degree with a comparable level of experience, is a plus
We offer:
- Full-time position based on an employment contract
- Private medical care
- Benefit Multisport club card or subsidized public transportation card
- Life insurance
- Worldwide travel insurance for business and non-business travel
- Wide range of training opportunities
- Free in-house fitness center