Solvd is looking for a highly skilled Data Engineer to help design and build modern data platforms that support large-scale risk and compliance initiatives for enterprise clients. In this role, you will architect and implement scalable data systems, real-time pipelines, and analytics capabilities that transform complex datasets into actionable insights.
You will partner with cross-functional teams—including data scientists, software engineers, and product stakeholders—to enable reliable, high-quality data solutions. This role requires strong technical depth in big data technologies, AWS services, and distributed systems, along with the ability to understand business needs and contribute to strategic decision-making.
Responsibilities:
Design, build, and optimize scalable data infrastructure for risk and compliance workloads
Develop batch and streaming pipelines using modern big data technologies
Implement robust ETL and ELT workflows across diverse structured and unstructured data sources
Work extensively with AWS services including EMR, Redshift, S3, Glue, Lambda, Kinesis, and related data tools
Build high-throughput, low-latency systems that support real-time or near-real-time decision making
Partner with data scientists, analysts, and engineering teams to deliver reliable and well-documented datasets
Translate business requirements into technical specifications and scalable data solutions
Mentor team members and contribute to best practices and reusable frameworks within Solvd
Streamline reporting, analytics, and data preparation processes
Replace manual workflows with automated, repeatable systems
Support ongoing performance tuning, monitoring, and platform optimization
Requirements:
Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field
3 or more years of hands-on experience in data engineering or related fields
Strong proficiency in SQL and data modeling for analytics and warehousing
Experience building ETL or ELT pipelines at scale
Experience with big data technologies such as Hadoop, Hive, Spark, HBase, or EMR
Knowledge of distributed systems and data storage principles