About the position
As a Data Engineer on our Development Team, you will design, develop, and optimize data pipelines within an AWS ecosystem for a US-based B2B marketplace company. Your expertise in PySpark will be instrumental in processing large-scale datasets, ensuring the reliability and performance of our data systems. You will collaborate with cross-functional teams, including data scientists and analysts, to deliver high-impact solutions that support business objectives.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or a related field
- 4+ years of experience in data engineering, with a focus on building and optimizing data pipelines using PySpark
- Strong experience with AWS services, including S3, Glue, Lambda, EMR, and Redshift
- Proficiency in Python programming and familiarity with related frameworks and libraries
- Solid understanding of distributed computing and experience with Apache Spark
- Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) is a plus
- Strong analytical and problem-solving skills, with attention to detail and a proactive approach to troubleshooting
- Excellent communication and collaboration skills, with the ability to work in a dynamic, team-oriented environment
- Upper-Intermediate level of English
Responsibilities:
- Design, develop, and implement data pipelines using PySpark within AWS environments
- Leverage AWS services such as S3, Glue, EMR, Lambda, and Redshift to build scalable data solutions
- Optimize PySpark workflows for performance, reliability, and cost-efficiency
- Collaborate with stakeholders to understand data requirements and translate them into technical solutions
- Ensure data quality and integrity through robust testing and monitoring processes
- Implement data governance, security, and compliance best practices in all development activities
- Document technical designs, processes, and workflows to support ongoing maintenance and team knowledge sharing

Tech stack: Python, SQL, AWS, Spark, Glue, Redshift
Methodology: Agile, Scrum

We offer:
- Flexible working hours and remote work possibility
- Mentoring program
- Training budget
- English lessons
- Certification cost reimbursement
- Active tech community
- International team
- Referral program
- Modern office
- Free coffee
- Kitchen
- Friendly atmosphere