Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. This approach aims to address the limitations of traditional language models, which may generate responses based solely on their training data, potentially leading to factual errors or inconsistencies. Read What Is Retrieval-Augmented Generation, aka RAG? for more information.
In a co-design session with an AAC (Augmentative and Alternative Communication) user, RAG can be particularly useful. When the user expressed a desire to invite "Roy nephew" to her birthday party, the ambiguity occurred as to whether "Roy" and "nephew" referred to the same person or different individuals. Traditional language models might interpret this statement inconsistently, sometimes treating "Roy" and "nephew" as the same person, and other times as separate persons.
RAG addresses this issue by leveraging external knowledge sources, such as documents or databases containing relevant information about the user's family members and their relationships. By retrieving and incorporating this contextual information into the language model's input, RAG can disambiguate the user's intent and generate a more accurate response.
The RAG experiment is located in the jobs/RAG
directory. It contains these scripts:
requirements.txt
: contains python dependencies for setting up the environment to run the python script.rag.py
: use RAG to address the "Roy nephew" issue described above.
-
If you are currently in a activated virtual environment, deactivate it.
-
Install and start Ollama to run language models locally
- Follow README to install and run Ollama on a local computer.
-
Download a Sentence Transformer Model
- Select a Model
- Choose a sentence transformer model from Hugging Face.
- Download the Model
- Make sure that your system has the git-lfs command installed. See Git Large File Storage for instructions.
- Download the selected model to a local directory. For example, to download the
all-MiniLM-L6-v2
model, use the following command:git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
- Provide the Model Path
- When running the
rag.py
script, provide the path to the directory of the downloaded model as a parameter. Note: Accessing a local sentence transformer model is much faster than accessing it via thesentence-transformers
Python package.
-
Go to the RAG scripts directory
cd jobs/RAG
-
Create the virtual environment (one time setup):
python -m venv .venv
-
Activate (every command-line session):
- Windows:
.\.venv\Scripts\activate
- Mac/Linux:
source .venv/bin/activate
- Windows:
-
Install Python Dependencies (Only run once for the installation)
pip install -r requirements.txt
- Run
rag.py
with a parameter providing the path to the directory of a sentence transformer modelpython rag.py ./all-MiniLM-L6-v2/
- The last two responses in the execution result shows the language model's output with and without the use of RAG.