We are seeking a skilled Big Data Engineer to join our data engineering team. The ideal candidate will have extensive experience in building and managing large-scale data processing systems. This role involves designing, implementing, and optimizing data pipelines and infrastructure to support analytics, machine learning, and business intelligence efforts.
This role offers an opportunity to work on complex, large-scale data projects and help shape the future of data-driven decision-making. If you are passionate about big data technologies and thrive in a fast-paced, innovative environment, we encourage you to apply.
MD (man-day) rate: 200–240 EUR
Responsibilities:
Design, develop, and maintain big data pipelines to process and analyze large datasets.
Implement data ingestion, processing, and storage solutions using big data frameworks such as Apache Spark, Hadoop, and Kafka.
Optimize data pipelines for performance, scalability, and fault tolerance.
Collaborate with data scientists, analysts, and other stakeholders to ensure data availability and usability.
Develop and maintain data storage solutions such as HDFS, Amazon S3, Google Cloud Storage, or Azure Data Lake.
Ensure data quality and integrity through automated testing and validation processes.
Monitor and troubleshoot big data infrastructure to ensure optimal performance and reliability.
Document technical solutions, workflows, and best practices.
Requirements:
Proficiency in big data technologies such as Apache Spark, Hadoop, Kafka, or Flink.
Strong programming skills in languages like Python, Scala, or Java.
Experience with SQL and NoSQL databases such as PostgreSQL, MongoDB, or Cassandra.
Familiarity with cloud platforms such as AWS, Azure, or Google Cloud, including their big data services (e.g., Amazon EMR, Google BigQuery, Azure Databricks).
Knowledge of data modeling, ETL processes, and data pipeline orchestration tools like Apache Airflow, Luigi, or Dagster.
Strong understanding of distributed computing principles and parallel processing.
Experience with containerization tools such as Docker and orchestration tools like Kubernetes.
Strong problem-solving skills and ability to troubleshoot large-scale data systems.