override_extract
override_extract(response, ctx) completely replaces SpyWeb’s internal HTML/CSS scraper. Use this to parse non-HTML data (like JSON or XML) or apply entirely custom extraction logic.
Signature

```lua
function override_extract(response, ctx) -> table | nil | false
```

Parameters

| Field | Type | Description |
|---|---|---|
| response | table | The response object from the fetch phase (after after_fetch has run) |
Returns

| Return | Effect |
|---|---|
| Array of structured items | Each item must use the standard item structure ({ fields = { ... } }), and the keys within fields must match the fields defined in jobs.toml. The output behaves exactly like native extraction and still passes through after_extract if defined. |
| nil or false | Treated as zero items (no extraction) |
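
The nil return path can be useful as a fallback when the body is not what the hook expects. The following is a minimal sketch, not SpyWeb's own implementation: it assumes json_decode is the same helper used in the example on this page, and because this page does not say how that helper reports malformed input, the call is wrapped in Lua's standard pcall so that both a raised error and a nil result end up as "zero items". The posts, user, and content keys are taken from the example payload and are otherwise hypothetical.

```lua
-- Sketch: defensive override_extract that falls back to "zero items".
-- Assumption: json_decode is provided by the host (as in the example on this
-- page). Its failure behavior is not documented here, so pcall covers both
-- a raised error and a nil result.
function override_extract(response, ctx)
  local ok, data = pcall(json_decode, response.body)
  if not ok or type(data) ~= "table" or type(data.posts) ~= "table" then
    return nil -- treated as zero items
  end

  local items = {}
  for _, post in ipairs(data.posts) do
    -- Skip entries without a nested user table instead of raising an error.
    if type(post) == "table" and type(post.user) == "table" then
      table.insert(items, {
        fields = {
          author = post.user.name,
          message = post.content,
        },
      })
    end
  end
  return items
end
```

Returning nil here rather than raising keeps a single malformed response from producing partial or invalid items; whether a job should instead fail loudly is a design choice this sketch does not settle.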
Example

```lua
function override_extract(response, ctx)
  local data = json_decode(response.body)
  local items = {}
  for i, post in ipairs(data.posts) do
    table.insert(items, {
      fields = {
        author = post.user.name,
        message = post.content
      }
    })
  end
  return items
end
```

See Also

- after_extract - Batch filter or modify all extracted items