The goal of this solution accelerator is to demonstrate how to build a question-answering (QA) bot using open-source models. Many customers cannot use proprietary models because of privacy requirements. The accelerator is designed so that you can easily switch between the newest and best LLMs from Hugging Face.
To replicate the application, a customer can take this solution accelerator and replace the knowledge base with their own articles (e.g. in PDF format).
- Databricks Vector search (previously FAISS)
- Different model types
  - Foundational Model (Notebooks 2.1->3.1->4.1)
  - External Model (Notebooks 2.1->3.1->4.1)
  - Custom PyFunc model (Notebooks 2.3->3.3->4.3)
  - Compound AI models (with Langchain Agents using Wikipedia) (Notebooks 2.2)
- UC Model registry (Notebooks 5)
- Self hosted Gradio app (previously on huggingface.co/spaces) (Notebook 6)
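
To make the role of the vector search component concrete, here is a minimal sketch of the retrieve-then-prompt pattern that a QA bot of this kind typically follows. It is not the accelerator's code and does not call the Databricks Vector Search API: the function names (`cosine_similarity`, `retrieve`, `build_prompt`), the toy index, and the hand-made 3-dimensional vectors are all illustrative assumptions. In a real setup, an embedding model would produce the vectors and a vector index would do the search.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector, index, k=2):
    """Return the k chunk texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, context_chunks):
    """Assemble a prompt that asks the LLM to answer from the retrieved context only."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Toy index (hypothetical data): chunk text paired with a made-up embedding.
TOY_INDEX = [
    ("Article A: how to reset a password.", [0.9, 0.1, 0.0]),
    ("Article B: how to export a report.", [0.1, 0.9, 0.0]),
    ("Article C: how to schedule a report.", [0.2, 0.7, 0.1]),
]

# A made-up query vector that points in the "report" direction.
top_chunks = retrieve([0.0, 1.0, 0.0], TOY_INDEX, k=2)
prompt = build_prompt("How do I export a report?", top_chunks)
print(prompt)
```

The prompt built this way is what would be sent to whichever model type is selected, so the retrieval step stays the same when the model is swapped.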
- [email protected]
- [email protected]
- Testing - [email protected]
- Thanks to Bala Amavasai for his valuable assistance
© 2023 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.
library | description | license | source |
---|---|---|---|
Langchain | Develop LLM applications | MIT | https://pypi.org/project/langchain |
Huggingface | Hub for LLM apps | Apache 2.0 | https://pypi.org/project/huggingface/ |
Gradio | Build Machine Learning Web Apps in Python | Apache 2.0 | https://pypi.org/project/gradio/ |
Although specific solutions can be downloaded from our websites, we recommend cloning these repositories into your Databricks environment. Not only will you get access to the latest code, but you will also be part of a community of experts driving industry best practices and reusable solutions, influencing our respective industries.
To start using a solution accelerator in Databricks, simply follow these steps:
- Clone the solution accelerator repository in Databricks using Databricks Repos.
- Attach the RUNME notebook to any cluster running DBR 11.0 or later, and execute the notebook via Run-All. A multi-step job describing the accelerator pipeline will be created, and a link to it will be provided. Execute the multi-step job to see how the pipeline runs.
- Within /utils, you can edit which Hugging Face models you would like to test. Follow the instructions there to set up your Hugging Face token. You can also change the configs there to point to your own knowledge base and create your own vector database.
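
As a rough illustration of the kind of settings the last step refers to, the sketch below groups a model name, a knowledge base location, and a Hugging Face token lookup in one place. Everything in it is an assumption made for illustration: the class `AcceleratorConfig`, the function `load_config`, the environment variable `HUGGINGFACE_TOKEN`, and the placeholder model name and path are hypothetical and do not reflect the actual contents of /utils. Follow the instructions in /utils for the real setup.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorConfig:
    """Hypothetical settings holder; the real configs in /utils may differ."""

    model_name: str                 # which Hugging Face model to test
    knowledge_base_path: str        # where your source articles live
    hf_token_env_var: str = "HUGGINGFACE_TOKEN"  # assumed variable name

    def hf_token(self):
        """Read the token from the environment instead of hard-coding it."""
        token = os.environ.get(self.hf_token_env_var)
        if not token:
            raise RuntimeError(
                f"Set the {self.hf_token_env_var} environment variable first."
            )
        return token


def load_config(overrides=None):
    """Merge user overrides into placeholder defaults and return a config."""
    defaults = {
        "model_name": "example-org/example-model",        # placeholder
        "knowledge_base_path": "/path/to/knowledge_base",  # placeholder
    }
    merged = {**defaults, **(overrides or {})}
    return AcceleratorConfig(**merged)


# Switching models is then a one-line change to the overrides.
config = load_config({"model_name": "another-org/another-model"})
print(config.model_name, config.knowledge_base_path)
```

Keeping the token out of the notebook source is the main design point here. On Databricks, a secret scope read with `dbutils.secrets.get` is a common alternative to an environment variable.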
The cost associated with running the accelerator is the user's responsibility.
Please note that the code in this project is provided for your exploration only and is not formally supported by Databricks with Service Level Agreements (SLAs). It is provided AS-IS, and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of this project. The source in this project is provided subject to the Databricks License. All included or referenced third-party libraries are subject to the licenses listed above.
Any issues discovered through the use of this project should be filed as GitHub Issues on the repository. They will be reviewed as time permits, but there are no formal SLAs for support.