Extraction rules

Pass URL-encoded JSON in the `extract_rules` parameter to get structured output from the target page. The JSON object maps each output key to a rule, and each rule has a CSS selector, an optional output type, and an optional flag to return all matches. Extraction rules work with or without JavaScript rendering.

Endpoint

GET https://scrape.shifter.io/v1?api_key=YOUR_API_KEY&url=<TARGET_URL>&extract_rules=<URL_ENCODED_JSON>

Parameters

Parameter	Type	Required	Description
`api_key`	string	yes	Your Web Scraping API key
`url`	string	yes	Target URL to fetch
`extract_rules`	string	yes	URL-encoded JSON describing the extraction rules.
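
The endpoint and parameters above can be combined using only Python's standard library. The following is a minimal sketch rather than an official client: `build_request_url` is a hypothetical helper name, and it sends only the three parameters listed in the table.

```python
import json
from urllib.parse import quote, urlencode

ENDPOINT = "https://scrape.shifter.io/v1"  # endpoint shown on this page


def build_request_url(api_key: str, target_url: str, rules: dict) -> str:
    """Return the full GET URL with extract_rules serialized and percent-encoded.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    query = urlencode(
        {
            "api_key": api_key,
            "url": target_url,
            "extract_rules": json.dumps(rules),  # JSON text, encoded below
        },
        quote_via=quote,  # encode spaces as %20 rather than "+"
    )
    return f"{ENDPOINT}?{query}"


if __name__ == "__main__":
    rules = {"title": {"selector": "h1", "output": "text"}}
    print(build_request_url("YOUR_API_KEY", "https://example.com", rules))
```

Serializing with `json.dumps` first and percent-encoding afterwards keeps the two steps separate, so the rules can be inspected as plain JSON before they are sent.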

Rule object

Field	Type	Required	Description
`selector`	string	yes	CSS selector.
`output`	string	no	`html` (default), `text`, or `@attr` to pull an attribute.
`all`	string	no	`"1"` to return all matches as an array, `"0"` (default) for the first match.
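
The fields in the table can be checked locally before a request is sent. The sketch below is a hypothetical client-side validator, not part of the API: `validate_rules` is an illustrative name, and the `@href` value assumes that `@attr` means `@` followed by an attribute name.

```python
def validate_rules(rules: dict) -> None:
    """Raise ValueError if any rule does not match the Rule object table."""
    for key, rule in rules.items():
        # selector: required, non-empty string
        selector = rule.get("selector")
        if not isinstance(selector, str) or not selector:
            raise ValueError(f"{key}: 'selector' is required")

        # output: "html" (default), "text", or "@" plus a name (assumed form)
        output = rule.get("output", "html")
        is_attr = isinstance(output, str) and output.startswith("@") and len(output) > 1
        if output not in ("html", "text") and not is_attr:
            raise ValueError(f"{key}: unsupported 'output' {output!r}")

        # all: the strings "0" (default) or "1", not numbers or booleans
        if rule.get("all", "0") not in ("0", "1"):
            raise ValueError(f"{key}: 'all' must be \"0\" or \"1\"")


if __name__ == "__main__":
    rules = {
        "title": {"selector": "h1", "output": "text"},
        # Assumed attribute syntax: "@href" returns each link's href value.
        "links": {"selector": "a", "output": "@href", "all": "1"},
    }
    validate_rules(rules)  # passes silently when every rule is well formed
```

Note that `all` is typed as a string in the table, so the check rejects the integer `1` as well as `True`.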

Example request

Extract the first `h1` heading as text and return it under the key `title`:

{"title": {"selector": "h1", "output": "text"}}

curl "https://scrape.shifter.io/v1?api_key=YOUR_API_KEY&url=https%3A%2F%2Fexample.com&extract_rules=%7B%22title%22%3A%20%7B%22selector%22%3A%20%22h1%22%2C%20%22output%22%3A%20%22text%22%7D%7D"

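import json
import requests

# Each key ("title") names a field in the response; its value is the rule.
rules = {"title": {"selector": "h1", "output": "text"}}

# requests percent-encodes every value in params, including the JSON string.
r = requests.get("https://scrape.shifter.io/v1", params={
    "api_key": "YOUR_API_KEY",
    "url": "https://example.com",
    "extract_rules": json.dumps(rules),
}, timeout=30)
r.raise_for_status()  # fail early on an HTTP error status
print(r.json())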

// Run as an ES module (.mjs or "type": "module") so top-level await works.
import fetch from 'node-fetch';

const rules = { title: { selector: 'h1', output: 'text' } };
// URLSearchParams percent-encodes each value, including the JSON string.
const url = 'https://scrape.shifter.io/v1?' + new URLSearchParams({
  api_key: 'YOUR_API_KEY',
  url: 'https://example.com',
  extract_rules: JSON.stringify(rules),
});
const res = await fetch(url);
console.log(await res.json());

Example response

{ "title": "Example Domain" }
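
Because a rule returns a single value by default and an array when `all` is `"1"`, client code may want one uniform shape. The sketch below is a hypothetical post-processing helper (`as_lists` is an illustrative name). It assumes the response is a flat JSON object keyed like the rules, as in the example above, and it treats a missing or null key as no match, which is an assumption rather than documented behavior.

```python
def as_lists(rules: dict, payload: dict) -> dict:
    """Return {key: list of values} for every key defined in rules."""
    result = {}
    for key in rules:
        value = payload.get(key)
        if value is None:
            result[key] = []        # assumed: absent or null means no match
        elif isinstance(value, list):
            result[key] = value     # rule used "all": "1"
        else:
            result[key] = [value]   # single first match, wrapped in a list
    return result


if __name__ == "__main__":
    rules = {"title": {"selector": "h1", "output": "text"}}
    payload = {"title": "Example Domain"}  # example response from this page
    print(as_lists(rules, payload))  # {'title': ['Example Domain']}
```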
