Data Engineer (Databricks, Spark, Python)

Warszawa

Job title Data Engineer (Databricks, Spark, Python)

Location Warszawa, Warszawa, mazowieckie, Poland

Company 7N Sp. z o.o.

Posted 29 January 2026

Job details

93_8789

Responsibilities

Design and build reliable data pipelines to efficiently ingest, combine, and transform data from multiple sources, including internal systems, IoT, fleet data, and external providers
Guarantee smooth and consistent integration of data into the Data Lake while meeting data quality and integrity standards
Apply advanced knowledge of Databricks and Apache Spark to develop, enhance, and optimise data processing workflows within the Data Lake
Use Spark to handle large-scale data processing, execute complex transformations, and support data aggregation for analytical purposes
Take responsibility for data storage architecture and management in AWS, including organising data across appropriate buckets and zones
Partner with IT administrators to maintain proper access management and ensure strong data security controls
Implement data tokenisation solutions to protect sensitive information and comply with data protection regulations
Regularly monitor Data Lake performance and proactively identify areas for optimisation
Improve pipeline efficiency and storage design to increase data access speed and overall system performance
Work closely with the Product Group (data owners) to understand business needs, offer technical expertise, and maintain data quality across the entire data lifecycle
Collaborate with data scientists to ensure data accessibility and readiness for analytical and business use cases
Provide advanced technical support to Data Lake users and the engineering team, resolving issues related to ingestion, processing, and access
Investigate, troubleshoot, and fix data-related incidents promptly and effectively
Advocate and enforce best practices in data governance, ensuring compliance with data standards and proper documentation
Keep thorough and up-to-date documentation of data pipelines, processes, and data lineage for transparency and future reference

Requirements

Minimum 5 years of professional experience as a Data Engineer, with strong emphasis on designing and maintaining Data Lake architectures
Extensive knowledge of AWS services, particularly S3 for data storage, together with data processing on Databricks and Spark
Proven hands-on experience with Databricks, Apache Spark, and Python for large-scale data processing and performance optimisation
Strong proficiency in big data file and table formats such as Parquet, Iceberg, and Delta
Good understanding of data protection regulations and best practices for handling personally identifiable information (PII)
Familiarity with data visualisation and reporting tools (e.g., Qlik Sense) is an advantage
Experience with containerisation technologies and Infrastructure as Code, particularly Terraform
Practical knowledge of workflow orchestration tools such as Airflow
Proficiency in English