Building distributed, highly parallelized big data processing pipelines that process massive amounts of structured and unstructured data in near real time
Leveraging Spark to enrich and transform corporate data to enable searching, data visualization, and advanced analytics
Working closely with analysts and business stakeholders to develop analytics models
Delivering continuously on Hadoop and other big data platforms
Automating processes wherever possible so that they are repeatable and reliable