Data Engineer
Warsaw, Masovian Voivodeship, Poland
7N Sp. z o.o.
5. 5. 2026
Job information

93_8892

Responsibilities

  • Design, build and maintain production ETL pipelines in Databricks/Delta Lake that ingest real-world data (RWD: registries, claims, EHR extracts) and transform it into standard models.
  • Implement harmonisation workflows that map incoming RWD to OMOP and to the internal CDISC SDTM canonical model; handle vocabulary mapping, unit normalisation and provenance.
  • Extend the medallion architecture patterns (bronze/silver/gold) with robust validation, lineage, partitioning and performance tuning.
  • Develop configurable, input‑driven transformation frameworks so clinical experts can drive mapping rules via config files and catalogs.
  • Integrate AI/automation components (e.g., model‑assisted mapping, NLP for free text) with human‑in‑the‑loop review and confidence scoring.
  • Establish testing, CI/CD, monitoring and alerting for ETL jobs and automations; ensure reproducibility, versioning and governance.
  • Collaborate with clinical data scientists, data stewards and stakeholders to define requirements, data contracts and success metrics.

Requirements

  • Proven experience designing and implementing ETL pipelines in Databricks/Spark and Delta Lake.
  • Strong knowledge of OMOP CDM and experience mapping datasets to OMOP; familiarity with CDISC SDTM is a plus.
  • Expertise in data modelling, partitioning, performance tuning, and best practices for large clinical/RWD datasets.
  • Experience with vocabulary services and terminology mapping (OHDSI/Athena, UMLS, or similar).
  • Experience integrating AI/NLP components into data pipelines (entity extraction, mapping suggestions) is desirable.
  • Familiarity with data testing frameworks (Great Expectations, Deequ), CI/CD, infrastructure as code, and orchestration tools (Databricks Jobs, Airflow).
  • Good communication skills and experience working with domain experts to capture requirements.
  • Fluent English.

Nice to have

  • Prior experience in pharma or clinical research environments.
  • Knowledge of data governance, privacy regulations and secure handling of patient data.
  • Experience with Unity Catalog, Databricks Delta Sharing, and cloud infrastructure (Azure/AWS).

Source: 7n/Praca

 
