This project focuses on leveraging cutting-edge AI and generative AI (GenAI) technologies to enhance tax operations, including consolidation, reconciliation, reporting, and interpretation of unstructured tax documentation. You will work closely with engineers, data scientists, and business analysts to integrate large language models (LLMs) and advanced AI techniques into core tax processes, while ensuring compliance with regulatory standards.
Senior Data Scientist
Your responsibilities
- Collaborate with engineers, data scientists, and business analysts to understand requirements, refine models, and integrate LLMs into AI solutions.
- Apply Reinforcement Learning from Human Feedback (RLHF) and other advanced AI techniques to produce tax-specific outputs.
- Embed generative AI solutions into consolidation, reconciliation, and reporting processes.
- Leverage LLMs to process and interpret unstructured tax documentation.
- Develop and implement deep learning algorithms for AI solutions.
- Preprocess raw data (text normalization, tokenization, etc.) for NLP model use.
- Set up, train, and fine-tune large language models and other state-of-the-art neural networks.
- Conduct testing, validation, and statistical analysis to ensure accuracy and optimize performance.
- Explore and propose innovative AI use cases to enhance tax functions.
- Partner with tax, finance, IT, and legal teams to integrate AI workflows and ensure they comply with regulatory standards.
- Perform model audits to identify and mitigate risks.
- Monitor and optimize generative models for performance and scalability.
Our requirements
- Minimum of 6 years of hands-on experience in data science.
- Strong programming skills in Python, PyTorch, TensorFlow, and related libraries.
- Solid understanding of object-oriented design patterns, concurrency/multithreading, and scalable AI/GenAI model deployment.
- Proficiency in NLP techniques and tools: regular expressions, spaCy, NLTK, text representation, and semantic extraction.
- Hands-on experience developing, training, and fine-tuning LLMs and AI models.
- Practical knowledge of CNNs, RNNs, GANs, RAG, LangChain, and Transformers.
- Expertise in prompt engineering techniques and vector databases.
- Familiarity with the Azure cloud computing platform.
- Experience with Docker, Kubernetes, and CI/CD pipelines.
- Experience with deep learning and computer vision, including CNN, RNN, and LSTM architectures.
- Knowledge of vector databases (Milvus, Postgres, etc.) and other database technologies.