Design, build, and maintain EL(T) data pipelines using SQL,
Perform data modeling, creating conceptual, logical, and physical data models,
Ensure effective data processing and maintain data architecture (including ETL/ELT, real-time streaming, data quality, distributed systems),
Manage data pipelines and data storage across multiple platforms (filesystem, relational, NoSQL, etc.),
Work with cloud technologies and DevOps/DataOps tools for efficient data handling and workflow automation,
Use Git and related DevOps tools for version control and continuous integration,
Collaborate with other teams on troubleshooting and performance analysis as needed,
Apply knowledge of distributed data warehousing to optimize data processes (e.g., Redshift).
Requirements (expected):
4+ years of experience working with programming languages focused on data pipelines (e.g., Python or R),
4+ years of experience working with SQL,
3+ years of experience maintaining data pipelines,
3+ years of experience with different types of storage (filesystem, relational, MPP, NoSQL) and working with various kinds of data (structured, unstructured, metrics, logs, etc.),
3+ years of experience working with data architecture concepts (e.g., data modeling, metadata management, workflow management, ETL/ELT, real-time streaming, data quality, distributed systems),
3+ years of cloud technology experience with a focus on data pipelines (Airflow, Glue, Dataflow, etc.),
Excellent knowledge of Git (must-have), including Gitflow, and DevOps tools (e.g., Docker, Bamboo, Jenkins, Terraform),
DBT certification,
Skilled in data processing,
Experience with data modeling and designing conceptual, logical, and physical data models,
Experience designing EL(T) data pipelines using SQL,
Familiar with traditional DW relational concepts (Dimensions, Facts, star schema, snowflake, SCDs),
Understanding of distributed data warehousing fundamentals (e.g., Redshift),
Very good knowledge of data serialization formats such as JSON, XML, and YAML.