Overview
Website crawling in our system enables:
- Automated Content Indexing - Automatically extract and index content from websites
- Knowledge Base Integration - Crawled content is added directly to your knowledge base
- Real-time Status Tracking - Monitor crawl progress and completion status
- Flexible Configuration - Control crawl depth, include/exclude paths, and set limits
- Batch Processing - Handle large websites efficiently with paginated results
How It Works
- Submit URL - Provide a website URL to start crawling
- Configure Options - Set crawl limits, include/exclude paths, and other parameters
- Monitor Progress - Check crawl status and track completion
- Access Content - Crawled content is automatically indexed and available to your AI agents
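The steps above amount to a submit-then-poll workflow, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual API: the `CrawlOptions` field names, the `wait_for_crawl` helper, and the `fetch_status` callable are all hypothetical, and a canned sequence of statuses stands in for real status requests. Only the status strings come from this page.

```python
import time
from dataclasses import dataclass, field, asdict
from typing import Callable, List


@dataclass
class CrawlOptions:
    """Hypothetical crawl configuration; real parameter names may differ."""
    url: str
    limit: int = 100          # assumed: maximum number of pages to crawl
    max_depth: int = 2        # assumed: maximum link depth from the start URL
    include_paths: List[str] = field(default_factory=list)
    exclude_paths: List[str] = field(default_factory=list)


def wait_for_crawl(
    fetch_status: Callable[[], str],
    poll_interval: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll until the crawl is no longer 'scraping' or max_polls is reached."""
    status = fetch_status()
    polls = 1
    while status == "scraping" and polls < max_polls:
        time.sleep(poll_interval)
        status = fetch_status()
        polls += 1
    return status


# Steps 1 and 2: choose a URL and configure options.
options = CrawlOptions(
    url="https://example.com",
    limit=50,
    include_paths=["/docs/*"],
    exclude_paths=["/blog/*"],
)
payload = asdict(options)  # the body a submit request might carry

# Step 3: monitor progress. A canned sequence replaces real status calls here.
fake_statuses = iter(["scraping", "scraping", "completed"])
final_status = wait_for_crawl(lambda: next(fake_statuses), poll_interval=0.0)
print(final_status)  # completed
```

Passing the status lookup in as a callable keeps the polling logic independent of whichever HTTP client or SDK is actually used, and `max_polls` bounds the wait so a stalled crawl does not block the caller indefinitely.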
Crawl Status Types
- scraping - Crawl is actively running and extracting content
- completed - Crawl finished successfully and content has been indexed
- failed - Crawl encountered an error and could not complete
- cancelled - Crawl was manually cancelled before completion
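Client code often needs to distinguish the in-progress status from the three final ones. The sketch below models the four statuses listed above as an enum. The `CrawlStatus` class, its `is_terminal` property, and the `next_action` helper are hypothetical names introduced for illustration; only the four status strings are taken from this page.

```python
from enum import Enum


class CrawlStatus(str, Enum):
    """The four crawl statuses listed above (class name is hypothetical)."""
    SCRAPING = "scraping"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"

    @property
    def is_terminal(self) -> bool:
        # Every status except 'scraping' means the crawl has stopped.
        return self is not CrawlStatus.SCRAPING


def next_action(raw_status: str) -> str:
    """Map a raw status string to a suggested client-side action."""
    status = CrawlStatus(raw_status)  # raises ValueError for unknown strings
    if not status.is_terminal:
        return "keep polling"
    if status is CrawlStatus.COMPLETED:
        return "content is indexed and ready to use"
    return "stop polling and decide whether to resubmit"


print(next_action("scraping"))   # keep polling
print(next_action("completed"))  # content is indexed and ready to use
```

Parsing the raw string through the enum makes an unexpected status fail loudly with a `ValueError` instead of being silently treated as still running.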
Crawling large websites can take significant time and resources. Consider using include/exclude
paths to focus on relevant content and setting appropriate limits to avoid excessive crawling.
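To make the effect of include/exclude paths and a page limit concrete, the sketch below filters a list of discovered URLs locally. It assumes glob-style path patterns and a rule that exclusions win over inclusions; the crawler's real matching rules may differ, and `select_urls` is a hypothetical helper rather than part of the system.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List
from urllib.parse import urlparse


def select_urls(
    urls: Iterable[str],
    include_paths: List[str],
    exclude_paths: List[str],
    limit: int,
) -> List[str]:
    """Keep URLs whose path matches an include pattern and no exclude pattern.

    An empty include list is treated as "include everything" (an assumption).
    """
    selected: List[str] = []
    for url in urls:
        if len(selected) >= limit:
            break  # the page limit caps the total amount of crawling
        path = urlparse(url).path or "/"
        if include_paths and not any(fnmatchcase(path, p) for p in include_paths):
            continue
        if any(fnmatchcase(path, p) for p in exclude_paths):
            continue
        selected.append(url)
    return selected


discovered = [
    "https://example.com/docs/intro",
    "https://example.com/docs/internal/notes",
    "https://example.com/blog/post",
    "https://example.com/docs/api",
    "https://example.com/docs/faq",
]
chosen = select_urls(
    discovered,
    include_paths=["/docs/*"],
    exclude_paths=["/docs/internal/*"],
    limit=2,
)
print(chosen)
# ['https://example.com/docs/intro', 'https://example.com/docs/api']
```

In this example the blog post is dropped by the include pattern, the internal notes page by the exclude pattern, and the FAQ page by the limit of two.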