Location
pajo
Job Type
Full-time
Posted
June 03, 2026
Job Description
Role Overview
The Data Extraction Engineer designs extraction systems, not just scripts. The role builds and maintains a next‑generation data acquisition platform that treats web scraping as a declarative, specification‑driven discipline. Instead of hard‑coding XPaths for every site, the engineer defines what data is needed (using schemas, natural language descriptions, or visual blueprints) and lets intelligent pipelines figure out how to get it.
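To illustrate what "declarative" means here, the following is a minimal sketch using only the Python standard library. The names `FieldSpec`, `ExtractionSpec`, `validate_record`, and `PRODUCT_SPEC` are hypothetical and are not part of the platform; in practice a Pydantic model or JSON schema would likely play this role. The specification states which fields are wanted and how to validate them, and it contains no selectors, so any backend that returns a raw record can be checked against it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class FieldSpec:
    """One field to capture: its name, target type, and an optional validation rule."""
    name: str
    type: type
    required: bool = True
    check: Optional[Callable[[Any], bool]] = None


@dataclass(frozen=True)
class ExtractionSpec:
    """Declares what to extract; says nothing about selectors or backends."""
    name: str
    fields: tuple


def validate_record(spec: ExtractionSpec, raw: dict) -> dict:
    """Coerce and validate a raw record (from any backend) against the spec."""
    clean = {}
    for f in spec.fields:
        if raw.get(f.name) is None:
            if f.required:
                raise ValueError(f"missing required field: {f.name}")
            clean[f.name] = None
            continue
        value = f.type(raw[f.name])  # coercion may itself raise ValueError
        if f.check is not None and not f.check(value):
            raise ValueError(f"validation failed for field: {f.name}")
        clean[f.name] = value
    return clean


# Hypothetical specification for a product record.
PRODUCT_SPEC = ExtractionSpec(
    name="product",
    fields=(
        FieldSpec("title", str, check=lambda s: len(s) > 0),
        FieldSpec("price", float, check=lambda p: p >= 0),
        FieldSpec("sku", str, required=False),
    ),
)

# Stand-in for whatever a scraping or LLM backend returned for one page.
raw = {"title": "Desk lamp", "price": "19.99"}
record = validate_record(PRODUCT_SPEC, raw)
print(record)
```

Because the specification is plain data, the same `PRODUCT_SPEC` could be reused across sites while only the backend that produces `raw` changes.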
Key Responsibilities
- Design and maintain declarative extraction specifications—using Pydantic models, JSON schemas, or domain‑specific languages—that describe exactly which fields to capture, their types, and validation rules.
- Implement pipelines that translate these specifications into executable extraction plans, leveraging both classical (Scrapy, Playwright) and AI‑augmented (LLM‑based semantic parsing) backends.
- Build reusable specification libraries for recurring data types (product p...