Overview
Every scrape run flows through a series of pipeline hooks you can define in your Lua script. Each hook receives a request, response, or items table plus the shared ctx, and can modify data, skip stages, or inject side effects.
Key Rules
- Return the value (modified or not) to continue the pipeline.
- Return nil or false to drop/skip (behavior varies per hook).
- Hook errors are logged and skipped; they never crash the job.
- Only define the hooks you actually need. SpyWeb pre-detects which functions exist at startup and skips the processing logic entirely for any that are missing.
Every hook receives a per-cycle ctx table as its second argument. See Context for the full reference.
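The rules above can be sketched in plain Lua. This is only an illustration of the contract, not SpyWeb's actual dispatch code: run_hook, tag_item, drop_untitled, broken_hook, and the ctx.job_name field are all hypothetical names invented for this example. The choice to pass the original value through after a hook error is also an assumption, since the exact recovery behavior may vary per hook.

```lua
-- Hypothetical stand-in for how a host could call a hook (not SpyWeb internals).
local function run_hook(hook, value, ctx)
  if hook == nil then
    return value -- hook not defined: nothing to run, value passes through
  end
  local ok, result = pcall(hook, value, ctx)
  if not ok then
    -- Errors are logged and the hook is skipped; the job keeps running.
    print("hook error (skipped): " .. tostring(result))
    return value -- assumption: the unmodified value continues
  end
  if result == nil or result == false then
    return nil -- drop/skip
  end
  return result -- continue with the (possibly modified) value
end

-- Example hooks; each takes the value first and ctx second.
local function tag_item(item, ctx)
  item.source = ctx.job_name -- ctx.job_name is a made-up field
  return item
end

local function drop_untitled(item, ctx)
  if item.title == nil or item.title == "" then
    return nil -- drop items without a title
  end
  return item
end

local function broken_hook(item, ctx)
  error("boom") -- simulates a bug inside a hook
end
```

Calling run_hook(drop_untitled, { title = "" }, { job_name = "demo" }) yields nil, while run_hook(broken_hook, item, ctx) prints the error and hands back the same item.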
Pipeline Stages
The hooks execute in this exact order during each scrape cycle:
Modify URL, add headers, or return nil to skip this run entirely.
Custom fetching phase (bypasses built-in HTTP client if defined).
Automatic - uses request config from previous stage.
Inspect success or failure, mutate body on success, or return a synthetic response to recover from fetch errors.
Custom extraction phase for JSON/XML (bypasses built-in CSS extraction if defined).
Automatic - raw parser with DOM fallback.
Batch filter/modify all items. nil or empty = no items.
Per-item filter. Replaces built-in keyword filter if defined.
Last chance before DB insert. nil = skip store + notify.
Automatic - atomic check-and-insert.
Reshape or silence notifications. Items already stored.
Reshape or silence webhook POSTs. Full JSON payload.
Automatic - desktop notification + webhook POST.
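As a rough sketch of what three of these stages might look like in a script, the functions below cover the request stage, the batch item stage, and the per-item filter. The function names (before_request, filter_items, filter_item) and the fields request.headers, ctx.paused, ctx.max_price, item.url, and item.price are placeholders chosen for illustration; the real hook names and table shapes should be taken from the hook reference.

```lua
-- Request stage: add a header, or return nil to skip this run entirely.
local function before_request(request, ctx)
  if ctx.paused then
    return nil -- hypothetical flag: skip the whole run
  end
  request.headers = request.headers or {}
  request.headers["Accept-Language"] = "en"
  return request
end

-- Batch stage: receives every extracted item at once.
-- Here it drops items without a URL and removes duplicate URLs.
local function filter_items(items, ctx)
  local seen, out = {}, {}
  for _, item in ipairs(items) do
    if item.url and not seen[item.url] then
      seen[item.url] = true
      out[#out + 1] = item
    end
  end
  return out -- an empty table means "no items"
end

-- Per-item stage: runs after the batch stage, once per item.
local function filter_item(item, ctx)
  local limit = ctx.max_price or math.huge -- hypothetical ctx field
  if item.price and item.price > limit then
    return nil -- drop items above the price limit
  end
  return item
end
```

Splitting the work this way keeps cross-item logic such as deduplication in the batch stage and leaves the per-item stage as a simple keep-or-drop decision.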