Java/Scala Developer
Warszawa, Województwo mazowieckie, Polska
Link Group
15.10.2024

Job Overview:

We are seeking a highly skilled Mid to Senior Java/Scala Developer with strong foundational knowledge in computer science and expertise in distributed systems, particularly Apache Spark. This role will involve working on large-scale data processing, optimizing data pipelines, and implementing open-source table formats like Apache Iceberg, Delta Lake, and Apache Hudi. You will also work with open Lakehouse architecture tools such as Unity Catalog, Polaris Catalog, and MLflow. While we prefer candidates with mid to senior-level experience, fresh graduates with a Master’s degree in a relevant field are also welcome to apply.


Key Responsibilities:

  • Software Development: Design, develop, and optimize distributed applications using Java and Scala to support large-scale data processing and analytics.
  • Apache Spark Expertise: Leverage deep knowledge of Apache Spark internals to build efficient data pipelines, focusing on performance and scalability.
  • Table Formats: Work with open-source table formats like Apache Iceberg, Delta Lake, Apache Hudi, and Apache XTable to manage large datasets effectively (a minimal sketch follows this list).
  • Lakehouse Architecture: Implement solutions for open Lakehouse architectures, including cataloging systems like Unity Catalog and Polaris Catalog, and manage machine learning workflows using MLflow.
  • Computer Science Fundamentals: Apply a strong understanding of data structures, caching mechanisms, networking, and database management to build efficient systems.
  • Collaboration: Work closely with data engineers, machine learning engineers, and other stakeholders to deliver solutions that meet business requirements.
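
To give candidates a feel for the day-to-day work, here is a minimal Scala sketch of a Spark batch job writing to an Apache Iceberg table. It is illustrative only: the catalog name (demo), the warehouse path, and the table name are placeholder assumptions, not details of the actual stack.

  import org.apache.spark.sql.SparkSession

  object IcebergPipelineSketch {
    def main(args: Array[String]): Unit = {
      // Spark session wired to an Iceberg catalog; the catalog name "demo"
      // and the warehouse path are placeholders for this sketch.
      val spark = SparkSession.builder()
        .appName("iceberg-pipeline-sketch")
        .config("spark.sql.extensions",
          "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.demo.type", "hadoop")
        .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
        .getOrCreate()
      import spark.implicits._

      // Tiny in-memory batch standing in for a real source (Kafka, S3, ...).
      val events = Seq((1L, "click"), (2L, "view"), (3L, "click"))
        .toDF("user_id", "event_type")

      // Create (or replace) the Iceberg table and load the batch.
      events.writeTo("demo.analytics.events").createOrReplace()

      // Read back and aggregate; Spark plans this as a distributed job.
      spark.table("demo.analytics.events")
        .groupBy("event_type").count().show()

      spark.stop()
    }
  }

The writeTo call uses Spark's DataFrameWriterV2 API, which resolves the table through the configured Iceberg catalog rather than a raw file path.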


Key Requirements:

  • Experience:
      • Mid to senior-level (3+ years) experience in Java or Scala development.
      • Fresh graduates with a Master’s degree in Computer Science or a related field are also encouraged to apply.
  • Computer Science Knowledge: Strong understanding of core concepts such as data structures, caching strategies, networking, and database systems.
  • Apache Spark: In-depth knowledge of Apache Spark internals, including job execution, query optimization, and distributed data processing (a short plan-inspection example follows this list).
  • Open Source Table Formats: Hands-on experience with Apache Iceberg, Delta Lake, Apache Hudi, or Apache XTable.
  • Lakehouse & MLflow: Familiarity with open Lakehouse architecture tools such as Unity Catalog, Polaris Catalog, and MLflow for data management and machine learning integration.
  • Problem-Solving: Ability to identify, troubleshoot, and solve complex issues in large-scale distributed systems.
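
To make the Spark-internals expectation concrete, the sketch below shows one way to inspect the plans Catalyst produces for a query; all names and data are made up for illustration.

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions.sum

  object PlanInspectionSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("plan-inspection-sketch")
        .master("local[*]") // local mode, for the sketch only
        .getOrCreate()
      import spark.implicits._

      val orders = Seq((1, 101, 25.0), (2, 102, 40.0))
        .toDF("order_id", "customer_id", "amount")
      val customers = Seq((101, "PL"), (102, "DE"))
        .toDF("customer_id", "country")

      // Join + filter + aggregate: Catalyst pushes the country filter below
      // the join and typically broadcasts the small customers side.
      val query = orders.join(customers, "customer_id")
        .filter($"country" === "PL")
        .groupBy("country")
        .agg(sum("amount").as("total"))

      // "formatted" prints the physical plan with a per-operator breakdown;
      // "extended" would also include the analyzed and optimized logical plans.
      query.explain("formatted")

      spark.stop()
    }
  }

In the formatted output you can check, for example, whether the country filter was pushed below the join and whether Spark chose a broadcast hash join for the small side.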


Preferred Qualifications:

  • Cloud Platforms: Experience with cloud platforms like AWS, GCP, or Azure.
  • CI/CD & Version Control: Familiarity with CI/CD pipelines and version control systems such as Git and GitHub.
  • Machine Learning Integration: Understanding of machine learning workflows and their integration into data pipelines (see the sketch below).
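
As one example of such integration, a pipeline can report runs, parameters, and metrics to an MLflow tracking server through MLflow's Java client, which is usable from Scala. A minimal sketch, assuming a tracking server at localhost:5000; the experiment name, parameter, and metric are hypothetical.

  import org.mlflow.tracking.MlflowClient

  object MlflowTrackingSketch {
    def main(args: Array[String]): Unit = {
      // The tracking URI, experiment name, parameter, and metric below are
      // all assumed values for this sketch.
      val client = new MlflowClient("http://localhost:5000")
      val experimentId = client.createExperiment("pipeline-quality-sketch")
      val runId = client.createRun(experimentId).getRunId

      // Record metadata produced by an upstream Spark job.
      client.logParam(runId, "table_format", "iceberg")
      client.logMetric(runId, "rows_processed", 1234567.0)

      client.setTerminated(runId)
    }
  }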
