Search and indexing your own Google Drive Files using GPT3, LangChain, and Python

venuv/langchain_semantic_search


The Jupyter notebook included here (langchain_semantic_search.ipynb) lets you build a FAISS index over a document corpus of your choice and query it with semantic search. The workflow in the flowchart is described in detail at https://medium.com/@venuv62/can-chatgpt-be-your-bff-code-companion-4375fd73ec3a.

[Image: flowchart of the indexing and search workflow]
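
To make the index-then-search idea concrete, here is a minimal sketch in plain Python. It is not the notebook's code: a bag-of-words vector stands in for the GPT-3 embeddings, a linear scan with cosine similarity stands in for the FAISS index, and `ToyIndex`, `embed`, and the sample sentences are all made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyIndex:
    """Hypothetical in-memory index; FAISS plays this role in the notebook."""

    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text):
        self.entries.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k (score, text) pairs most similar to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), text) for text, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Invented sample chunks, not taken from the test papers.
chunks = [
    "Sleep deprivation is increasingly described as a public health epidemic.",
    "Vagus nerve stimulation is one form of neuromodulation therapy.",
    "Deep brain stimulation targets specific regions of the brain.",
]
index = ToyIndex()
for chunk in chunks:
    index.add(chunk)

results = index.search("is sleep a health epidemic", k=1)
print(results[0][1])
```

The shape of the workflow is the same with real embeddings: embed each chunk once at indexing time, embed the question at query time, and return the nearest chunks as context for the answer.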

I've provided a directory of neuromodulation papers that you can use as a sample Drive folder to test against: https://drive.google.com/drive/folders/1eIBnSO7MVOW9-BKPCJhs7JuBDRyXPOFC?usp=sharing. Since the code needs a Google Drive directory path (not an https URL) to work with, you will have to:

  • copy the contents of this directory into a GDrive subdirectory of your own
  • set the gdrive_path variable in the jupyter notebook appropriately
  • set the question within print_answer, for instance to 'is sleep a health epidemic', which should give you a non-null answer
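
Before indexing, it can help to check that the path actually resolves to files. The sketch below makes several assumptions: the path value is a hypothetical example of where a copied folder might appear once Drive is mounted (for instance in Colab), `list_documents` is a helper invented here rather than part of the notebook, and the `.pdf`/`.txt` suffixes are a guess at what the folder contains.

```python
from pathlib import Path

# Hypothetical value: adjust to wherever your copy of the folder lives.
gdrive_path = "/content/drive/MyDrive/neuromodulation_papers"
question = "is sleep a health epidemic"


def list_documents(root, suffixes=(".pdf", ".txt")):
    """Return sorted file paths under root that have one of the given suffixes."""
    base = Path(root)
    if not base.is_dir():
        # Drive not mounted, or the path is wrong.
        return []
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )


docs = list_documents(gdrive_path)
if not docs:
    print(f"No documents found under {gdrive_path}; check the mount and the path.")
else:
    print(f"Found {len(docs)} documents to index")
```

An empty result here usually means the path still points at a share URL or at a folder that was not copied into your own Drive.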

I plan to work on a few enhancements to speed up indexing (perhaps using a Vectorstore) and to reduce query cost (using ideas from https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html).
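
As a rough way to reason about query cost before optimizing it, one could estimate tokens per query. This is only a back-of-the-envelope sketch: the four-characters-per-token ratio is a common rule of thumb for English text rather than a real tokenizer, the price is a placeholder to replace with your model's actual rate, and both function names are invented for illustration.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Approximate token count from character length (heuristic, not a tokenizer)."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def estimate_query_cost(question, context_chunks, price_per_1k_tokens,
                        expected_answer_tokens=256):
    """Return (total_tokens, cost) for one query with the retrieved chunks as context."""
    prompt_tokens = estimate_tokens(question)
    prompt_tokens += sum(estimate_tokens(chunk) for chunk in context_chunks)
    total_tokens = prompt_tokens + expected_answer_tokens
    return total_tokens, total_tokens / 1000 * price_per_1k_tokens


# Placeholder price per 1,000 tokens; substitute the current rate for your model.
PLACEHOLDER_PRICE = 0.02

tokens, cost = estimate_query_cost(
    "is sleep a health epidemic",
    ["first retrieved chunk of text", "second retrieved chunk of text"],
    PLACEHOLDER_PRICE,
)
print(f"about {tokens} tokens, about ${cost:.4f} per query")
```

Since the retrieved chunks dominate the prompt, sending fewer or shorter chunks per question is the most direct lever on cost.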
