Building Data Pipelines: Develop and maintain efficient ETL/ELT processes to collect and transform large-scale data from vessel-based systems, including sensors, auto-logs, operational tools, and fleet management platforms
Cloud-Based Data Processing: Optimize and manage data workflows in cloud environments such as AWS and Databricks, ensuring high performance, reliability, and data integrity
Data Modeling & Architecture: Design and implement structured data models and architectures that align with business objectives, enabling effective data-driven strategies
Cross-Team Collaboration: Partner with teams across fleet management, network operations, and sustainability to identify data needs and deliver suitable technical solutions
Automation & Deployment: Establish CI/CD pipelines for automated testing, deployment, and seamless integration of data solutions to improve operational efficiency
DevOps Integration: Apply DevOps best practices and tools like Jenkins, GitLab CI/CD, Docker, and Kubernetes to enhance the development and deployment of data services
Data Analysis Support: Assist Data Scientists and BI teams by providing well-structured datasets and improving query performance for advanced analytics
Innovation & Optimization: Keep up with the latest advancements in Data Engineering, DevOps, and Cloud Computing to refine workflows and adopt industry-leading practices
Requirements
Min. 4 years of experience in a similar position
Extensive experience with AWS cloud services, data lakes, and big data technologies, with preference for expertise in Databricks, Redshift, or Snowflake
Strong command of Python, PySpark, and SQL, particularly in the context of building scalable and efficient data pipelines
Hands-on knowledge of DevOps tools like Docker and Kubernetes, as well as CI/CD pipelines such as Jenkins or GitLab CI/CD to support continuous integration and deployment
Familiarity with orchestration tools like Apache Airflow for managing and automating complex data workflows
A solid grasp of data governance best practices, security protocols, and regulatory compliance requirements
Demonstrated capability to diagnose and resolve complex issues, optimize performance, and apply industry best practices in data engineering
Proficiency in business English (min. B2), both written and spoken
Nice to have: prior experience in application management or operational procedures within the logistics or shipping industry