Machine Learning Data Engineer
Kraków, Kraków County, Lesser Poland Voivodeship, Poland
ITDS Business Consultants
25.02.2025
About the position

Join us, and enhance data solutions with the latest technologies and tools!

Krakow-based opportunity with the possibility to work 80% remote.

As a Machine Learning Data Engineer, you will be working for our client, a leading global financial institution known for building innovative digital solutions and transforming the banking industry. You will play a key role in supporting their data and digital transformation initiatives by developing and optimizing data engineering processes. Working with cutting-edge technologies, you'll contribute to the development of robust, scalable data solutions for critical financial services, handling everything from data pipelines to cloud integrations. You'll be part of a dynamic team working on both greenfield projects and established banking applications.

Your main responsibilities:

  • Developing and optimizing data engineering processes
  • Building robust, fault-tolerant data solutions for both cloud and on-premise environments
  • Automating data pipelines to ensure seamless data flow from ingestion to serving
  • Creating well-tested, clean code in line with modern software engineering principles
  • Working with cloud technologies (AWS, Azure, GCP) to support large-scale data operations
  • Supporting data transformation and migration efforts from on-premise to cloud ecosystems
  • Designing and implementing scalable data models and schemas
  • Maintaining and enhancing big data technologies such as Hadoop, HDFS, Spark, and Cloudera
  • Collaborating with cross-functional teams to solve complex technical problems
  • Contributing to the development of CI/CD pipelines and version control practices

You’re ideal for this role if you have:

  • Strong experience in the Data Engineering Lifecycle, especially in building data pipelines
  • Proficiency in Python, PySpark, and the Python ecosystem
  • Experience with cloud platforms such as AWS, Azure, or GCP (preferably GCP)
  • Expertise in Hadoop on-premise distributions, particularly Cloudera
  • Experience with big data tools such as Spark, HDFS, Hive, and Databricks
  • Knowledge of data lake formation, data warehousing, and schema design
  • Strong understanding of SQL and NoSQL databases
  • Ability to work with data formats like Parquet, ORC, and Avro
  • Familiarity with CI/CD pipelines and version control tools like Git
  • Strong communication skills to collaborate with diverse teams

It is a strong plus if you have:

  • Experience with ML models and MLOps
  • Exposure to building real-time event streaming pipelines with tools like Kafka or Apache Flink
  • Familiarity with containerization and DevOps practices
  • Experience in data modeling and handling semi-structured data
  • Knowledge of modern ETL and ELT processes
  • Understanding of the trade-offs between different data storage technologies

#GETREADY to meet with us!

We would like to meet you. If you are interested, please apply and attach your CV in English or Polish, including a statement that you agree to our processing and storing of your personal data. You can also apply by sending us an email at

