description
Load data from Apify Website Content Crawler.

Apify Website Content Crawler

Apify is a web scraping and data extraction platform that provides an app store with more than a thousand ready-made cloud tools called Actors.

The Website Content Crawler Actor can deeply crawl websites, clean their HTML by removing cookie modals, footers, and navigation, and then transform the HTML into Markdown. This Markdown can then be stored in a vector database for semantic search or Retrieval-Augmented Generation (RAG).

Apify Website Content Crawler Node

Crawl Entire Website

(Optional) Connect Text Splitter.
Connect Apify API (create a new credential with your Apify API token).
Input one or more URLs (separated by commas) where the crawler will start, e.g. https://docs.flowiseai.com/.
Select the crawler type. Refer to the Website Content Crawler documentation for more information.
(Optional) Specify additional parameters such as maximum crawling depth and the maximum number of pages to crawl.
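
To make the settings above concrete, here is a minimal sketch of how the node's fields might be assembled into an input object for the Actor. The helper names (`parse_start_urls`, `build_crawler_input`) are hypothetical, and the field names (`startUrls`, `crawlerType`, `maxCrawlDepth`, `maxCrawlPages`) and the default `"cheerio"` value are assumptions about the Actor's input schema; check the Website Content Crawler documentation before relying on them.

```python
from typing import Any, Dict, List, Optional


def parse_start_urls(raw: str) -> List[str]:
    """Split a comma-separated string of URLs and drop empty entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def build_crawler_input(
    raw_urls: str,
    crawler_type: str = "cheerio",  # assumed value; see the Actor docs
    max_crawl_depth: Optional[int] = None,
    max_crawl_pages: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a dict shaped like the Actor input (field names are assumed)."""
    urls = parse_start_urls(raw_urls)
    if not urls:
        raise ValueError("at least one start URL is required")

    actor_input: Dict[str, Any] = {
        "startUrls": [{"url": url} for url in urls],
        "crawlerType": crawler_type,
    }
    # Optional limits are only included when the user set them.
    if max_crawl_depth is not None:
        actor_input["maxCrawlDepth"] = max_crawl_depth
    if max_crawl_pages is not None:
        actor_input["maxCrawlPages"] = max_crawl_pages
    return actor_input


if __name__ == "__main__":
    print(build_crawler_input("https://docs.flowiseai.com/", max_crawl_pages=10))
```

Leaving the optional limits out of the dict when they are unset lets the Actor fall back to its own defaults instead of receiving explicit nulls.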

Output

Loads website content as a Document.
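
As a rough illustration of what "loads website content as a Document" could look like, the sketch below maps crawled items to simple document objects. The `Document` class here is a hypothetical stand-in rather than Flowise's own type, and the item keys `text` and `url` are assumptions about the shape of the crawler's output.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a loader document: text plus metadata."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def items_to_documents(items: List[Dict[str, Any]]) -> List[Document]:
    """Convert crawled items (assumed keys: 'text', 'url') to documents."""
    return [
        Document(
            page_content=item.get("text") or "",  # empty string if missing
            metadata={"source": item.get("url")},
        )
        for item in items
    ]


if __name__ == "__main__":
    sample = [{"url": "https://docs.flowiseai.com/", "text": "# Welcome"}]
    for doc in items_to_documents(sample):
        print(doc.metadata["source"], len(doc.page_content))
```

Keeping the page URL in the metadata makes it possible to trace a retrieved chunk back to its source page after splitting.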

Resources

Apify-Flowise integration
Website Content Crawler
