curl --request POST \
  --url https://api.open.cx/crawl \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
  {
    "url": "https://example.com",
    "limit": 100,
    "excludePaths": [
      "/blog/*",
      "/private/*"
    ],
    "allowExternalLinks": true,
    "includePaths": [
      "/blog/*",
      "/private/*"
    ]
  }
'

{
  "id": "<string>",
  "org_id": "<string>",
  "url": "<string>",
  "status": "cancelled",
  "created_at": "<string>",
  "updated_at": "<string>",
  "completed_at": "<string>",
  "completed_pages": 123,
  "error_message": "<string>",
  "total_pages": 123
}

Submit a website URL to be crawled and indexed into your knowledge base. The crawler will extract content from the specified URL and any linked pages within the same domain.
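The curl request above can also be issued from Python. The sketch below builds the documented request body, validates the `limit` range (1-1000) client-side, and POSTs it with the Bearer token; the helper names (`build_crawl_payload`, `submit_crawl`) are illustrative, not part of the API.

```python
import json
import urllib.request

API_URL = "https://api.open.cx/crawl"  # endpoint from the docs above


def build_crawl_payload(url, limit=100, exclude_paths=None, include_paths=None,
                        allow_external_links=False):
    """Build and validate the JSON body for a crawl request (client-side sketch)."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    payload = {
        "url": url,
        "limit": limit,
        "allowExternalLinks": allow_external_links,
    }
    if exclude_paths:
        payload["excludePaths"] = exclude_paths
    if include_paths:
        payload["includePaths"] = include_paths
    return payload


def submit_crawl(token, payload):
    """POST the payload and return the parsed crawl-job object."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Validating `limit` before the request fails fast instead of waiting for the server to reject an out-of-range value.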
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
url: The URL to crawl. Example: "https://example.com"

limit: Maximum number of pages to crawl. Required range: 1 <= x <= 1000. Example: 100

excludePaths: Paths to exclude from crawling. Example: ["/blog/*", "/private/*"]

allowExternalLinks: Whether to allow external links. Example: true

includePaths: Paths to include in crawling. Example: ["/blog/*", "/private/*"]

Response 200: Crawl job has been created successfully.

status: One of cancelled, completed, failed, scraping
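Of the status values above, scraping is the only non-terminal state, so a client typically polls until the job reaches cancelled, completed, or failed. The sketch below takes a caller-supplied `fetch_job` callable because a job-status lookup endpoint is an assumption here, not documented above.

```python
import time

# Terminal states, taken from the status enum documented above.
TERMINAL_STATUSES = {"cancelled", "completed", "failed"}


def wait_for_crawl(fetch_job, poll_seconds=5, max_polls=120):
    """Poll until the crawl job leaves the non-terminal 'scraping' state.

    fetch_job: zero-argument callable returning the crawl-job dict
    (e.g. a GET against a job-status endpoint; the lookup mechanism
    is an assumption, not part of the documented API).
    """
    for _ in range(max_polls):
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError("crawl did not finish within the polling budget")
```

Injecting `fetch_job` keeps the polling loop testable without network access and lets the caller decide how the job record is retrieved.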