Collaborate with AI engineers, data scientists, product owners, and other developers in Agile teams to integrate LLMs into scalable, robust, fair and ethical end-user applications, focusing on user experience, relevance, and real-time performance
Design and implement scalable, high-performance data pipelines for AI/GenAI applications, ensuring efficient data ingestion, transformation, storage, and retrieval; integrate multiple databases, which requires an understanding of data architectures and the domain data ecosystem
Work with vector databases (e.g., AWS OpenSearch or Azure AI Search) to store and retrieve high-dimensional data for Generative AI workloads
Build and maintain cloud-based data solutions using AWS (OpenSearch, S3) or Azure (Azure AI Search, Azure Blob Storage)
Design and optimize data storage and processing using Snowflake for scalable, cloud-native analytics solutions
Develop ETL/ELT pipelines to enable real-time and batch data processing
Collaborate with AI/ML engineers and data scientists to ensure seamless integration of data pipelines with AI training, fine-tuning, and inference workflows
Optimize data storage, retrieval, and processing strategies for efficiency, scalability, and cost-effectiveness
Implement data governance, security best practices, and compliance measures aligned with applicable standards
Set up monitoring, alerting, and logging for data pipelines, ensuring high availability and reliability
Requirements
3+ years of experience in data engineering, preferably supporting AI/ML applications
Proficiency in Python, SQL, and the native query languages of vector databases
Experience with relational, NoSQL, and vector databases; Snowflake experience preferred
Hands-on experience with AWS (OpenSearch, S3, Lambda) or Azure (Azure AI Search, Azure Blob Storage, Azure Automation)
Experience building scalable ETL/ELT workflows using dbt and Apache Airflow
Understanding of encryption and access controls
Familiarity with Git, CI/CD, Docker, Kubernetes, Terraform and CloudFormation
Experience working with AI-specific data needs (embeddings, RAG) and LLM fine-tuning data preparation
Excellent analytical skills and the ability to handle complex challenges with innovative solutions
B.Sc., B.Eng. (or higher) in Computer Science, Data Engineering, or a related field
Team-oriented, proactive, and collaborative
Excellent verbal and written communication skills (in English and Polish)