Introducing Chat with YouTube, a browser extension that lets you chat with YouTube videos! This is a small project that demonstrates how easy it is to build conversational browser extensions using Hugging Face Inference Endpoints and the Vercel AI SDK.
Since the license is MIT, feel free to fork the project, make improvements, and release it yourself on the Chrome/Firefox Web Store!
We recommend opening up two terminal windows side-by-side, one for the server and one for the extension.
- Clone the repository:
  git clone https://github.com/xenova/chat-with-youtube.git
- Set up the server:
  - Switch to the server directory:
    cd server
  - Install the necessary dependencies:
    npm install
  - Create a file .env.local with your Hugging Face Access Token and Inference Endpoint URL. See .env.local.example for an example. If you haven't got these yet, this guide will help you get started.
    HUGGINGFACE_API_KEY=hf_xxx
    HUGGINGFACE_INFERENCE_ENDPOINT_URL=https://YOUR_ENDPOINT.endpoints.huggingface.cloud
  - Start the server:
    npm run dev
- Set up the extension:
  - Switch to the extension folder:
    cd extension
  - Install the necessary dependencies:
    npm install
  - Build the project:
    npm run build
  - Add the extension to your browser. To do this, go to chrome://extensions/, enable developer mode (top right), and click "Load unpacked". Select the build directory from the dialog which appears and click "Select Folder".
  - That's it! You should now be able to open the extension's popup and use the model in your browser!
How does it work? It's quite simple: the transcript and video metadata are added to the prompt as additional context. For this demo, we use Llama-2-7b-hf, which has a context length of 4096 tokens and can handle most videos. For longer videos, it would be best to implement segmentation and retrieval-augmented generation, but that's beyond the scope of this project.
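To make the idea concrete, here is a minimal sketch of how a prompt could be assembled from the metadata and transcript. It is not taken from the project's source: the `VideoInfo` shape, the `buildPrompt` and `truncateToTokens` helpers, the prompt layout, and the "about 4 characters per token" estimate are all assumptions made for illustration.

```typescript
// Hypothetical shape for the video data; the real extension may differ.
interface VideoInfo {
  title: string;
  author: string;
  transcript: string;
}

// Rough heuristic (an assumption): about 4 characters per token for English text.
const CHARS_PER_TOKEN = 4;

// Cut the text so its estimated token count stays within maxTokens.
function truncateToTokens(text: string, maxTokens: number): string {
  const maxChars = Math.max(0, maxTokens) * CHARS_PER_TOKEN;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Build one prompt string: metadata and transcript first, then the question.
// reservedTokens leaves room for the question and the model's answer.
function buildPrompt(
  video: VideoInfo,
  question: string,
  contextTokens: number = 4096,
  reservedTokens: number = 1024
): string {
  const transcript = truncateToTokens(
    video.transcript,
    contextTokens - reservedTokens
  );
  return [
    `Title: ${video.title}`,
    `Author: ${video.author}`,
    `Transcript: ${transcript}`,
    ``,
    `Question: ${question}`,
    `Answer:`,
  ].join("\n");
}

// Example usage with made-up values.
const demoPrompt = buildPrompt(
  { title: "Example video", author: "Example channel", transcript: "Hello and welcome..." },
  "What is this video about?"
);
```

Truncating simply drops the end of a long transcript, so questions about later parts of the video would go unanswered. That is the gap segmentation and retrieval would fill.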