Data Collection Tool for OAB CNA Directory (cna.oab.org.br)

Upwork

Remote

7 hours ago

No applications

About

Objective

Develop a crawler to systematically query https://cna.oab.org.br/ and collect attorney data by iterating through OAB registration numbers.

Scope

For each UF (27 Seccionais), perform sequential queries for OAB numbers in a defined range (e.g., 1 to 600,000). For every query, collect the data returned on the first results page:

- Attorney name
- OAB number
- UF

Extended Scope (Preferred)

When a result is found, click the result link and extract all available details displayed inside the result page, including data rendered as images (OCR may be required).

Key Constraints

The website blocks access after ~15 rapid/manual queries. The solution must include:

- Strict rate limiting
- Backoff and pause handling
- Session and error control
- Resume capability (UF + number)

Deliverables

- Runnable crawler with configuration for number ranges and UFs
- Structured output (database or CSV/JSON)
- Logging and resume support
- Clear documentation

Notes

Focus on stability and controlled execution, not speed. Experience with crawling under heavy rate limits and OCR is required.
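
The rate limiting, backoff, and resume (UF + number) requirements could be structured roughly as in the sketch below. It is illustrative only: every name in it (`Blocked`, `backoff_delay`, `pending_queries`, `crawl`, the checkpoint file layout) is hypothetical, the 10-second interval and 30-second backoff base are placeholder assumptions rather than measured safe values, and the actual HTTP request and parsing are left to a caller-supplied `fetch` function because the posting does not describe the site's request format.

```python
import json
import os
import time
from typing import Callable, Iterator, Optional, Sequence, Tuple


class Blocked(Exception):
    """Raised by the caller-supplied fetch function when the site refuses a query."""


def backoff_delay(attempt: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def load_checkpoint(path: str) -> Optional[Tuple[str, int]]:
    """Return the last completed (uf, number), or None if no checkpoint exists."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["uf"], data["number"]


def save_checkpoint(path: str, uf: str, number: int) -> None:
    """Write the checkpoint to a temp file, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"uf": uf, "number": number}, f)
    os.replace(tmp, path)


def pending_queries(
    ufs: Sequence[str], first: int, last: int, done: Optional[Tuple[str, int]] = None
) -> Iterator[Tuple[str, int]]:
    """Yield (uf, number) in order, skipping everything up to and including `done`.

    If `done` names a UF that is not in `ufs`, nothing is yielded.
    """
    skipping = done is not None
    for uf in ufs:
        start = first
        if skipping:
            if uf != done[0]:
                continue  # this UF was finished before the checkpoint
            skipping = False
            start = max(first, done[1] + 1)
        for number in range(start, last + 1):
            yield uf, number


def crawl(
    ufs: Sequence[str],
    first: int,
    last: int,
    fetch: Callable[[str, int], object],
    on_result: Callable[[str, int, object], None],
    checkpoint_path: str,
    min_interval: float = 10.0,  # assumed pause between queries, not a known safe rate
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Query each pending (uf, number) once, pausing between queries and
    backing off exponentially whenever `fetch` raises Blocked."""
    done = load_checkpoint(checkpoint_path)
    for uf, number in pending_queries(ufs, first, last, done):
        for attempt in range(max_retries + 1):
            try:
                result = fetch(uf, number)
                break
            except Blocked:
                if attempt == max_retries:
                    raise  # give up; the checkpoint still marks the last success
                sleep(backoff_delay(attempt))
        on_result(uf, number, result)
        save_checkpoint(checkpoint_path, uf, number)
        sleep(min_interval)
```

Because the checkpoint is written only after `on_result` returns, a run that is interrupted or blocked can be restarted with the same arguments and will continue from the query after the last one recorded. Injecting `sleep` and `fetch` also lets the pacing logic be exercised without touching the network.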