We are looking for an experienced Data Engineer with strong PySpark skills to join a large-scale Azure data platform project. You will work in a distributed environment, focusing on building and optimizing data pipelines.
Responsibilities:
Develop and maintain data pipelines using PySpark and SQL
Work with Azure Databricks for large-scale data processing
Build and maintain ingestion and data transformation workflows
Perform data wrangling and integrate multiple data sources
Optimize Spark jobs (DataFrame operations, partitioning, clustering, Spark SQL)
Work with data formats such as Delta, Parquet, and CSV
Collaborate with stakeholders across multiple time zones