The project enables finding, accessing, processing, publishing, and sharing biomedical data to generate insights for secondary use. It also includes EDIS, an end-to-end engine for secondary use and primary exploration, and integrates externally generated real-world data (RWD) from both clinical and non-clinical sources.
responsibilities:
Design, develop, and deploy solutions based on large language models (LLMs), including agent design and tool use for problem-solving.
Fine-tune LLMs and optimize their performance to meet specific business requirements.
Develop and optimize data pipelines and deployment pipelines for LLM-based applications.
Integrate clinical, non-clinical, and external real-world data (RWD) from various sources.
Work with ML/AI tools including AWS SageMaker, PyTorch, TensorFlow, and Vertex AI, and implement MLOps solutions using tools like Kubeflow.
Create scripts and automate processes using tools like Git, Bash, Docker, and Kubernetes.
Develop scalable applications in cloud environments (AWS, Azure, GCP).
Implement Continuous Integration / Continuous Deployment (CI/CD) practices using tools like Jenkins or GitLab CI.
Collaborate with teams across different locations and cultures to deliver customer-oriented solutions.
Test and optimize ML models, manage training and testing datasets, and mitigate overfitting.
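The agentic tool-use work described above can be sketched as a minimal loop: the model either requests a tool call or returns a final answer, and tool results are fed back into the conversation. All names here (`call_llm`, the `TOOLS` registry, the message shapes) are illustrative placeholders, not a specific framework's API; `call_llm` is a deterministic stub standing in for a real LLM endpoint.

```python
import json

# Hypothetical tool registry: maps tool names to plain Python callables.
TOOLS = {
    "add": lambda a, b: a + b,
    "lookup_patient_count": lambda cohort: {"diabetes": 1284}.get(cohort, 0),
}

def call_llm(messages):
    """Stand-in for a real LLM call. It deterministically requests a
    tool first, then answers once a tool result is in the transcript."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"Result: {last['content']}"}
    return {
        "role": "assistant",
        "tool_call": {"name": "lookup_patient_count", "args": {"cohort": "diabetes"}},
    }

def run_agent(question, max_steps=5):
    """Minimal agent loop: ask the model, execute any requested tool,
    feed the result back, and stop when the model returns plain content."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        messages.append(reply)
        tool_call = reply.get("tool_call")
        if tool_call is None:
            return reply["content"]
        result = TOOLS[tool_call["name"]](**tool_call["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("agent did not produce a final answer")

print(run_agent("How many diabetes patients are in the cohort?"))  # Result: 1284
```

In a production version, `call_llm` would wrap a hosted model endpoint and the loop would validate tool arguments before execution; the control flow stays the same.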
requirements-expected:
Experience developing LLM applications, in particular agentic design such as tool use and reasoning.
Experience in building data pipelines and deployment pipelines for LLM applications.
Recent experience with ML/AI toolkits such as AWS SageMaker (other toolkits like PyTorch, TensorFlow, Keras, MXNet, H2O, etc. are nice to have).
Experience with MLOps technologies (SageMaker, Vertex AI, Kubeflow).
Experience with cloud solutions (AWS / Azure / GCP) and Docker.
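The deployment-pipeline and Docker experience asked for above typically looks like the following GitLab CI sketch: test, build and push a Docker image, then roll it out to a cluster. Stage names, image tags, the `llm-app` deployment, and file paths are placeholders (only `CI_REGISTRY_IMAGE` and `CI_COMMIT_SHORT_SHA` are standard GitLab predefined variables); this is an illustrative fragment, not a ready-to-use pipeline.

```yaml
stages:
  - test
  - build
  - deploy

test:
  stage: test
  image: python:3.11-slim
  script:
    - pip install -r requirements.txt
    - pytest tests/

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"

deploy:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl set image deployment/llm-app llm-app="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
  environment: production
```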