POST /crawl

Submit a URL for crawling
curl --request POST \
  --url https://api.open.cx/crawl \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "url": "https://example.com",
  "limit": 100,
  "excludePaths": [
    "/blog/*",
    "/private/*"
  ],
  "allowExternalLinks": true,
  "includePaths": [
    "/blog/*",
    "/private/*"
  ]
}'
{
  "id": "<string>",
  "org_id": "<string>",
  "url": "<string>",
  "status": "cancelled",
  "created_at": "<string>",
  "updated_at": "<string>",
  "completed_at": "<string>",
  "completed_pages": 123,
  "error_message": "<string>",
  "total_pages": 123
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
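As a minimal sketch, the Bearer header described above can be assembled like this in Python (the helper name `auth_headers` is illustrative, not part of the API; supply your own token):

```python
def auth_headers(token: str) -> dict:
    """Build the headers the crawl endpoint expects:
    a Bearer Authorization header plus a JSON content type."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Pass the returned dict as the headers of the POST request to https://api.open.cx/crawl.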

Body

application/json
url
string
required

The URL to crawl

Example:

"https://example.com"

limit
number
default:100

Maximum number of pages to crawl (1-1000)

Required range: 1 <= x <= 1000
Example:

100

excludePaths
string[]

Paths to exclude from crawling

Example:
["/blog/*", "/private/*"]

allowExternalLinks
boolean

Whether to allow external links

Example:

true

includePaths
string[]

Paths to include in crawling

Example:
["/blog/*", "/private/*"]

Response

Crawl job has been created successfully.

id
string
required
org_id
string
required
url
string
required
status
enum<string>
required
Available options:
cancelled,
completed,
failed,
scraping
created_at
string
required
updated_at
string
required
completed_at
string | null
required
completed_pages
number | null
required
error_message
string | null
required
total_pages
number | null
required
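A caller typically inspects the `status` field to decide whether a job is still running. The sketch below assumes, based on the enum above, that `scraping` is the only in-progress state and the other three statuses are terminal:

```python
import json

# Statuses from the response enum; "scraping" is treated as in-progress.
TERMINAL_STATUSES = {"cancelled", "completed", "failed"}

def is_done(job: dict) -> bool:
    """Return True once a crawl job has reached a terminal status."""
    return job["status"] in TERMINAL_STATUSES

# Parsing a response body: JSON null (e.g. total_pages before completion)
# becomes None in Python.
job = json.loads(
    '{"id": "job_1", "status": "scraping",'
    ' "completed_pages": 10, "total_pages": null}'
)
```

Here `is_done(job)` is False, since the job is still `scraping`; a nullable field like `total_pages` parses to `None` until the crawl finishes.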