This project implements Retrieval-Augmented Generation (RAG) using OpenAI's embedding models and LangChain's Python library. The aim is a user-friendly RAG application that can ingest data from multiple sources (Word, PDF, TXT, YouTube, Wikipedia).
Domain areas include:
- Document splitting
- Embeddings (OpenAI)
- Vector database (Chroma / FAISS)
- Semantic search types
- Retrieval chain
- Conversational retrieval and memory state
- Embedding cost logging
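The document splitting step can be sketched without LangChain. Below is a minimal, dependency-free fixed-size chunker with overlap; the app itself relies on LangChain's splitters, which split on separators rather than raw character offsets, and the `chunk_size`/`chunk_overlap` values here are purely illustrative:

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap, so content
    spanning a chunk boundary appears in both neighboring chunks."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each chunk's start advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Toy example: 26 characters, 10-char chunks, 2-char overlap.
chunks = split_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=2)
```

Overlap matters for retrieval quality: a sentence cut at a chunk boundary would otherwise be lost to both chunks.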
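Semantic search over a vector database reduces to nearest-neighbor lookup on embedding vectors. A dependency-free sketch of cosine-similarity retrieval follows; in the real app this is delegated to Chroma/FAISS and the vectors come from OpenAI embeddings, so the 3-dimensional toy vectors and document ids below are illustrative only:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k vectors most similar to the query."""
    scored = sorted(index.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.05, 0.0], index, k=2)
```

Libraries like FAISS implement the same idea with approximate-nearest-neighbor indexes so it scales past brute-force scanning.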
Changelog:
- Added a logger
- Added a callback to record the cost of queries
- Rewrote the main operations in an object-oriented style
- Added resource caching
- Archived old code
- Refactored file handling to use tempfile so LangChain's document loaders can read uploaded files
- Added support for SRT subtitle files
- Added WebBaseLoader and the YouTube loader
- Added an option to use Wikipedia as the retriever instead
- Added brief documentation
- Added a debug mode (exceptions will be raised)
- Refactored the codebase into modules
- Enabled Wikipedia queries with RAG
- Refactored to use a YAML config file
- Added support for TXT and DOCX files
- Added status spinners
- Updated tooltips
- Incorporated different query chain types: restricted query and creative query
- Incorporated temperature settings
- Restructured functions into getter functions
- Included explanations of the frontend and backend workings
- Included examples
- Pinned Chroma to 0.3.29 for Streamlit; this did not work and was reverted
- Switched from Chroma to FAISS as the vector DB due to compatibility issues with Streamlit (SQLite versioning)
- Removed pywin32 from the requirements, since Streamlit cannot install this dependency
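One hypothetical shape for the YAML config file mentioned above, tying together the temperature settings, query chain types, and vector-store choice; every key and value here is illustrative, not the project's actual file:

```yaml
# config.yml (illustrative sketch, not the project's real config)
llm:
  model: gpt-3.5-turbo
  temperature:
    restricted: 0.0   # deterministic, fact-focused answers
    creative: 0.9     # looser, more exploratory answers
vector_store: faiss   # Chroma was dropped (SQLite versioning on Streamlit)
chunking:
  chunk_size: 1000
  chunk_overlap: 100
sources: [docx, pdf, txt, srt, youtube, wikipedia]
debug: false          # when true, exceptions are raised
```

Keeping these knobs in one file means temperature or chunking changes need no code edits.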
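The embedding-cost logging mentioned above can be approximated offline: cost is token count times a per-1K-token rate. A minimal sketch follows; the rate, the tokens-per-word ratio, and the function name are illustrative assumptions (for real usage figures, LangChain's `get_openai_callback` context manager captures token counts and cost from actual API calls):

```python
def estimate_embedding_cost(texts: list[str],
                            price_per_1k_tokens: float = 0.0001,  # assumed rate, not a quoted price
                            tokens_per_word: float = 1.3) -> tuple[int, float]:
    """Rough embedding cost estimate: word count scaled by an assumed
    tokens-per-word ratio, priced per 1,000 tokens."""
    total_words = sum(len(t.split()) for t in texts)
    total_tokens = int(total_words * tokens_per_word)
    cost = total_tokens / 1000 * price_per_1k_tokens
    return total_tokens, cost

# 500 two-word documents -> 1,000 words -> 1,300 estimated tokens
tokens, cost = estimate_embedding_cost(["hello world"] * 500)
```

Logging this estimate before ingestion lets the user decide whether a large document is worth embedding.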