laramohan/wikillm



LLMs as Collaboratively Edited Knowledge Bases


WikiLLM

Algorithms like MEMIT let us inject facts into an LLM by editing its parameters 💉🧠. Could we use fact editing to crowdsource a continually updated neural knowledge base, with no RAG or external documents?

WikiLLM uses fact editing to transform the static piles of floats that are current LLMs into dynamically evolving knowledge bases that learn from interaction with users.

As users chat with WikiLLM, GPT-4 picks out the facts in each conversation that are worth inserting and formats them (subject, predicate, etc.) for MEMIT. These facts are then inserted into the underlying Llama 7B model using the EasyEdit implementation of MEMIT.
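A minimal sketch of the formatting step described above, not the code this repo actually runs. It assumes the extractor returns a JSON list of objects with `prompt`, `subject`, and `target_new` fields. That schema is hypothetical, and the field names are chosen to mirror the keyword arguments shown in EasyEdit's README, so check them against the EasyEdit version you install. The subject-in-prompt check is there because locate-and-edit methods like MEMIT need to find the subject tokens inside the prompt.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FactEdit:
    """One edit request: a prompt containing the subject, plus the new target."""
    prompt: str      # e.g. "The capital of Atlantis is"
    subject: str     # must appear verbatim in the prompt
    target_new: str  # the completion the edited model should produce


def parse_extracted_facts(raw_json: str) -> list[FactEdit]:
    """Parse a JSON list of facts (hypothetical extractor output) into edits.

    Entries that are not objects, that have empty fields, or whose subject
    is not a substring of the prompt are skipped instead of being passed on.
    """
    edits = []
    for item in json.loads(raw_json):
        if not isinstance(item, dict):
            continue
        prompt = str(item.get("prompt", "")).strip()
        subject = str(item.get("subject", "")).strip()
        target = str(item.get("target_new", "")).strip()
        if prompt and subject and target and subject in prompt:
            edits.append(FactEdit(prompt, subject, target))
    return edits


def to_edit_kwargs(edits: list[FactEdit]) -> dict[str, list[str]]:
    """Batch edits into parallel lists, one list per field.

    The key names are an assumption about what the editor expects.
    """
    return {
        "prompts": [e.prompt for e in edits],
        "subject": [e.subject for e in edits],
        "target_new": [e.target_new for e in edits],
    }
```

Dropping malformed entries before they reach the editor is a deliberate choice here: an edit takes a long time and changes the weights in place, so it seems cheaper to lose a fact than to insert a broken one.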

Setup:

Tested on an 80 GB A100 with Torch 2.1

git clone https://github.com/laramohan/wikillm

curl -sSL https://install.python-poetry.org | python3 -
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

cd wikillm
git clone https://github.com/zjunlp/EasyEdit

poetry shell
pip install -r EasyEdit/requirements.txt
pip install easyeditor

poetry install
poetry remove openai
poetry add openai==1.2.0

curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list && sudo apt update && sudo apt install ngrok
ngrok http 5000

Notes:

  • I had to edit a few files in EasyEdit to get it working, including:
    • commenting out the Blip & multimodal model imports
    • changing device in hparams.yaml to 0
  • to save HuggingFace cache space, symlink the snapshots in your main .cache to the blobs in EasyEdit/hugging_cache

This project remains a work in progress. The main issues include:

  • fact editing takes surprisingly long (try FastEdit)
  • model probably degrades with tons of edits (this is part of the experiment I guess)
  • frontend isn't fully hooked up (would love contributions here too)
  • no idea what happens with multiple concurrent connections trying to edit at the same time
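For the concurrency question in the last bullet, one possible way around it is to funnel every edit through a single worker thread so that only one edit touches the model weights at a time. The sketch below is an assumption about how that could look, not something this repo implements. `EditWorker` and `apply_edit` are hypothetical names, and `apply_edit` stands in for whatever function actually runs the MEMIT edit.

```python
import queue
import threading
from typing import Callable


class EditWorker:
    """Applies edits one at a time on a single background thread."""

    def __init__(self, apply_edit: Callable[[dict], None]):
        # apply_edit is a placeholder for the real (slow) editing function.
        self._apply_edit = apply_edit
        self._queue: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, edit: dict) -> None:
        """Enqueue an edit; safe to call from any request handler thread."""
        self._queue.put(edit)

    def wait_until_idle(self) -> None:
        """Block until every submitted edit has been processed."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            edit = self._queue.get()
            try:
                self._apply_edit(edit)
            except Exception as exc:
                # Keep the worker alive if a single edit fails.
                print(f"edit failed: {exc!r}")
            finally:
                self._queue.task_done()
```

Request handlers would call `submit(...)` and return right away, which also keeps slow edits from blocking the chat response. The tradeoff is that a fact only becomes visible to the model once its edit reaches the front of the queue.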
