We’re looking for a Web Scraping Specialist to join a focused team. Your role will involve leading data-gathering initiatives, refining collection methodologies, and analyzing results. The ideal candidate will help us advance our mission to reshape how publicly available online information is accessed and utilized.
The contract is signed through an Employer of Record (EOR).
Responsibilities:
Script Development: Build, debug, and enhance reliable code to systematically collect data from the web.
Complex Navigation: Solve technical challenges like pagination and AJAX/JavaScript-driven content loading during data retrieval.
Data Processing: Clean, validate, and structure raw data into usable formats.
Storage & Management: Implement and optimize database solutions for storing large volumes of collected data.
System Maintenance: Monitor automated processes, perform root-cause analysis on failures, and ensure pipeline health.
Requirements:
Able to independently collect data from sophisticated, dynamic web interfaces, with a track record demonstrated through previous work samples.
Strong programming skills in Python and JavaScript, with extensive experience using common data extraction toolkits and browser automation frameworks.
Understanding of concurrent execution models (async/threading) and strategies for scalable, distributed data collection.
Solid foundation in core web technologies: HTML structure, CSS selectors, client-side JavaScript, and DOM manipulation.
Hands-on experience with schema-flexible databases (e.g., MongoDB) for designing optimal data storage and ensuring consistency.
Experience applying ML techniques for tasks like data normalization, classification, or forecasting.
Familiarity with major cloud platforms (AWS, GCP, Azure) for deploying and orchestrating large-scale data collection tasks.
Demonstrated contributions to community-driven projects in data acquisition or processing are a plus.