The Crawl API allows you to programmatically crawl websites and index their content into your knowledge base.
This enables your AI agents to access and reference content from your website, documentation,
or any other web-based resources when responding to customer inquiries.
## Overview
Website crawling in our system enables:
- Automated Content Indexing - Automatically extract and index content from websites
- Knowledge Base Integration - Crawled content is added directly to your knowledge base
- Real-time Status Tracking - Monitor crawl progress and completion status
- Flexible Configuration - Control crawl depth, include/exclude paths, and set limits
- Batch Processing - Handle large websites efficiently with paginated results (see the sketch after this list)
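As a rough illustration of the batch-processing behaviour, crawled pages are typically retrieved a batch at a time rather than in one response. The sketch below is illustrative only: the base URL, the `/crawls/{id}/pages` path, the `page`/`pageSize` query parameters, and the response shape are assumptions, not the documented API; see the Available Endpoints section below for the actual routes.

```python
import requests

API_BASE = "https://api.example.com/v1"              # hypothetical base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder credentials


def iter_crawled_pages(crawl_id: str, page_size: int = 100):
    """Yield crawled pages one batch at a time (assumed pagination scheme)."""
    page = 1
    while True:
        resp = requests.get(
            f"{API_BASE}/crawls/{crawl_id}/pages",            # assumed endpoint path
            headers=HEADERS,
            params={"page": page, "pageSize": page_size},     # assumed query parameters
        )
        resp.raise_for_status()
        batch = resp.json().get("data", [])                   # assumed response field
        if not batch:
            break
        yield from batch
        page += 1
```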
## How It Works
1. Submit URL - Provide a website URL to start crawling
2. Configure Options - Set crawl limits, include/exclude paths, and other parameters
3. Monitor Progress - Check the crawl status and track completion
4. Access Content - Crawled content is automatically indexed and made available to your AI agents (see the example after these steps)
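A minimal end-to-end sketch of these steps might look like the following. The endpoint path, request fields (`url`, `limit`, `includePaths`), and response fields are assumptions for illustration, not the documented contract; the Available Endpoints section below lists the real routes and parameters.

```python
import requests

API_BASE = "https://api.example.com/v1"              # hypothetical base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder credentials

# Step 1: submit a URL, and Step 2: configure crawl options (field names are assumed)
start = requests.post(
    f"{API_BASE}/crawls",
    headers=HEADERS,
    json={
        "url": "https://docs.example.com",
        "limit": 200,                    # stop after roughly 200 pages
        "includePaths": ["/docs/*"],     # only crawl documentation pages
    },
)
start.raise_for_status()
crawl_id = start.json()["id"]            # assumed response field

# Step 3: monitor progress by polling the crawl's status
status = requests.get(f"{API_BASE}/crawls/{crawl_id}", headers=HEADERS)
status.raise_for_status()
print(status.json()["status"])           # e.g. "scraping" or "completed"

# Step 4: once the crawl completes, the content is indexed and available to
# your AI agents; no additional call is needed to use it in the knowledge base.
```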
## Crawl Status Types
- `scraping` - The crawl is actively running and extracting content
- `completed` - The crawl finished successfully and its content has been indexed
- `failed` - The crawl encountered an error and could not complete
- `cancelled` - The crawl was manually cancelled before completion
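One way to consume these status values is to poll until the crawl reaches a terminal state. The helper below assumes a `GET /crawls/{id}` endpoint returning a `status` field; both are illustrative assumptions rather than the documented API.

```python
import time

import requests

API_BASE = "https://api.example.com/v1"              # hypothetical base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder credentials

# The three terminal states from the table above; "scraping" means still running.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_crawl(crawl_id: str, poll_seconds: int = 10) -> str:
    """Poll the assumed status endpoint until the crawl stops running."""
    while True:
        resp = requests.get(f"{API_BASE}/crawls/{crawl_id}", headers=HEADERS)
        resp.raise_for_status()
        status = resp.json()["status"]   # assumed response field
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
```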
Crawling large websites can take significant time and resources. Consider using include/exclude
paths to focus the crawl on relevant content and setting appropriate limits to avoid excessive crawling.
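For example, a crawl of a large site could be scoped to its documentation and help sections while skipping blog and changelog pages. The payload below uses assumed field names (`includePaths`, `excludePaths`, `limit`, `maxDepth`); consult the endpoint reference below for the exact parameters.

```python
# Assumed request body: restrict the crawl to relevant paths and cap its size.
crawl_options = {
    "url": "https://www.example.com",
    "includePaths": ["/docs/*", "/help/*"],       # only index support content
    "excludePaths": ["/blog/*", "/changelog/*"],  # skip non-support sections
    "limit": 500,                                 # hard cap on pages crawled
    "maxDepth": 3,                                # don't follow links deeper than 3 levels
}
```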
## Available Endpoints