RainbowScientist-Playground/sitefetch

sitefetch

Fetch an entire site and save it as a text file (to be used with AI models).

Install

One-off usage (choose one of the following):

bunx sitefetch
npx sitefetch
pnpx sitefetch

Install globally (choose one of the following):

bun i -g sitefetch
npm i -g sitefetch
pnpm i -g sitefetch

Usage

sitefetch https://egoist.dev -o site.txt

# or with higher concurrency
sitefetch https://egoist.dev -o site.txt --concurrency 10

Match specific pages

Use the -m, --match flag to specify the pages you want to fetch:

sitefetch https://vite.dev -m "/blog/**" -m "/guide/**"

Each match pattern is tested against the pathname of the target page using micromatch. See the micromatch documentation for all the supported matching features.
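To make the idea of testing patterns against a pathname concrete, here is a minimal sketch. It is not micromatch and not sitefetch's code: `globToRegExp` and `matchesAny` are hypothetical helpers that support only `*` (within one path segment) and `**` (across segments). Real micromatch supports many more features, and its edge-case behavior may differ.

```typescript
// Simplified stand-in for a glob matcher (assumption: only `*` and `**`).
function globToRegExp(pattern: string): RegExp {
  let source = "";
  let i = 0;
  while (i < pattern.length) {
    const ch = pattern[i];
    if (ch === "*") {
      if (pattern[i + 1] === "*") {
        source += ".*"; // `**` may cross `/` boundaries
        i += 2;
      } else {
        source += "[^/]*"; // `*` stays within a single segment
        i += 1;
      }
    } else {
      // Escape regex metacharacters so they match literally.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// A pathname is accepted if at least one pattern matches it,
// mirroring how several -m flags are combined in this sketch.
function matchesAny(pathname: string, patterns: string[]): boolean {
  return patterns.some((p) => globToRegExp(p).test(pathname));
}

const patterns = ["/blog/**", "/guide/**"];
console.log(matchesAny("/blog/announcing-vite", patterns));
console.log(matchesAny("/config/", patterns));
```

Note that the pattern is applied to the pathname only (for example `/blog/announcing-vite`), not to the full URL.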

Content selector

We use mozilla/readability to extract readable content from each web page. On some pages it might return irrelevant content. In that case, you can specify a CSS selector so we know where to find the readable content:

sitefetch https://vite.dev --content-selector ".content"
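When driving the CLI from a script, it can help to assemble the arguments in one place. The sketch below uses only the flags shown in this README (`-o`, `--concurrency`, `-m`, `--content-selector`). The `CliOptions` type and `buildSitefetchArgs` function are hypothetical and not part of sitefetch, and it is an assumption that all of these flags can be combined in a single invocation.

```typescript
// Hypothetical helper (not part of sitefetch): builds a CLI argument list
// from the flags documented in this README.
interface CliOptions {
  output?: string; // -o
  concurrency?: number; // --concurrency
  match?: string[]; // -m, may be repeated
  contentSelector?: string; // --content-selector
}

function buildSitefetchArgs(url: string, options: CliOptions = {}): string[] {
  const args: string[] = [url];
  if (options.output !== undefined) {
    args.push("-o", options.output);
  }
  if (options.concurrency !== undefined) {
    args.push("--concurrency", String(options.concurrency));
  }
  // Each pattern gets its own -m flag, as in the "Match specific pages" example.
  for (const pattern of options.match ?? []) {
    args.push("-m", pattern);
  }
  if (options.contentSelector !== undefined) {
    args.push("--content-selector", options.contentSelector);
  }
  return args;
}

const args = buildSitefetchArgs("https://vite.dev", {
  output: "site.txt",
  match: ["/blog/**", "/guide/**"],
  contentSelector: ".content",
});
console.log(args.join(" "));
```

The resulting array can be passed to a process runner such as `execFileSync` from `node:child_process`, which avoids shell quoting issues with patterns like `/blog/**`.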

Plug

If you like this, please check out my LLM chat app: https://chatwise.app

API

import { fetchSite } from "sitefetch"

await fetchSite("https://egoist.dev", {
  //...options
})

Check out options in types.ts.

License

MIT.
