feat(rag): add dual LLM RAG example #1971

holtskinner merged 14 commits into GoogleCloudPlatform:main
Conversation
…ertex_rag_demo_dual_llms.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Hello @AutoViML, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a new feature: a dual LLM RAG (Retrieval-Augmented Generation) example. This demo allows comparison of responses from two different LLMs using Vertex AI Search over a corpus of documents. It also includes an optional judge model to evaluate the two responses and determine which one is better aligned with the query and context. The demo is built using Streamlit and includes instructions for customization, such as modifying prompts for different use cases.
Highlights
- Dual LLM Comparison: Enables comparing responses from two LLMs using Vertex AI Search.
- Judge Model Evaluation: Option to use a judge model to evaluate and compare the responses from the two LLMs.
- Customizable Prompts: Provides instructions on how to customize the prompts for different use cases by modifying text files in the prompts folder.
- Streamlit Demo: Implements a Streamlit application for easy interaction and demonstration of the dual LLM RAG system.
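The dual-LLM-plus-judge flow highlighted above can be sketched in a few lines. This is an illustrative outline only: the real demo calls Gemini via Vertex AI and a local Ollama model, which are stubbed here with plain functions, and all names and the trivial judge heuristic are assumptions, not the PR's actual code.

```python
# Hypothetical sketch of the dual-LLM comparison flow with an optional judge.
# Retrieval, both models, and the judge are stand-in stubs for illustration.

def retrieve_context(query: str) -> str:
    """Stand-in for a Vertex AI Search retrieval call."""
    return f"[docs relevant to: {query}]"

def model_a(query: str, context: str) -> str:
    """Stand-in for the first LLM (e.g. a Gemini model)."""
    return f"Model A answer grounded in {context}"

def model_b(query: str, context: str) -> str:
    """Stand-in for the second LLM (e.g. an Ollama-served model)."""
    return f"Model B answer grounded in {context}"

def judge(answer_a: str, answer_b: str) -> str:
    """Stand-in judge: picks the longer answer as a trivial proxy for
    'better aligned with the query and context'."""
    return "A" if len(answer_a) >= len(answer_b) else "B"

def compare(query: str) -> dict:
    context = retrieve_context(query)
    a = model_a(query, context)
    b = model_b(query, context)
    return {"answer_a": a, "answer_b": b, "winner": judge(a, b)}

result = compare("How do I make pad thai?")
print(result["winner"])
```

In the actual demo the judge step is optional and enabled from the command line; here it always runs for simplicity.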
Changelog
Click here to see the changelog
- search/retrieval-augmented-generation/rag_with_dual_llms/README.md
- Added a README file with instructions on how to run the demo, including setting up Google Cloud SDK, authenticating, installing requirements, and running the Streamlit app.
- Includes instructions on how to customize the prompts for different use cases.
- search/retrieval-augmented-generation/rag_with_dual_llms/requirements.txt
- Added necessary dependencies such as streamlit, ollama, langchain, google-generativeai, requests, PyPDF2, tiktoken, faiss-cpu, pandas, fsspec, gcsfs, google-cloud-aiplatform, google-cloud-discoveryengine, google-genai, langchain-google-vertexai, langchain-google-community, asynciolimiter, asyncio, tqdm, re, and json.
- search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/judge_model_name.txt
- Added a text file containing the name of the judge model to be used (gemini-2.0-flash-thinking-exp-01-21).
- search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/judge_prompt.txt
- Added a prompt for the judge model, instructing it to analyze two responses and provide a judgment on which response is better and why.
- search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/rephraser.txt
- Added a prompt for rephrasing user queries into concise search engine queries, specifically tailored for a chef advisor use case.
- search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/summarizer.txt
- Added a prompt for summarizing documents retrieved from a recipe database, tailored for an Asian cuisine advisor use case.
- search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/system_instruction.txt
- Added system instructions for the AI chatbot, defining its role as a cooking assistance chatbot and preventing mission changes.
- search/retrieval-augmented-generation/rag_with_dual_llms/src/vertex_rag_demo_dual_llms_with_judge.py
- Created a Streamlit application that compares responses from two LLMs using Vertex AI Search.
- Includes functionality for rephrasing queries, retrieving relevant documents, generating summaries, and evaluating responses with an optional judge model.
- Added model selection, RAG setup, and chat history management.
- Implements logic for generating responses using Gemini and Ollama models.
- Includes argument parsing for enabling the judge model via command line.
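Since the changelog above describes prompt text files loaded from a prompts/ folder, here is one plausible shape for that loading logic. The function name load_text_file mirrors the review comments below, but the implementation, the fallback behavior, and the default strings are illustrative assumptions, not the PR's actual code.

```python
# Illustrative prompt loader: reads a customizable prompt file from a
# prompts/ folder and falls back to a built-in default with a clear,
# actionable message when the file is missing.
from pathlib import Path

PROMPTS_DIR = Path("prompts")

def load_text_file(name: str, default: str = "") -> str:
    """Read a prompt file, falling back to a default if it is absent."""
    path = PROMPTS_DIR / name
    try:
        return path.read_text(encoding="utf-8").strip()
    except OSError:
        # Tell the user exactly which file to create to customize the demo.
        print(f"Prompt file '{path}' not found; using the built-in default. "
              f"Create it under {PROMPTS_DIR}/ to customize this prompt.")
        return default

system_instruction = load_text_file(
    "system_instruction.txt",
    default="You are a cooking assistance chatbot.",
)
print(system_instruction)
```

Swapping the contents of these text files (rephraser, summarizer, judge prompt, system instruction) is how the demo is retargeted to other use cases without code changes.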
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its suggestions are incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Two models converse with might,
RAG weaves knowledge in the light,
A judge then casts a discerning eye,
Which answer soars, reaching for the sky?
In AI's realm, the best we try.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution. ↩
Code Review
This pull request introduces a dual LLM RAG example, which is a valuable addition to the repository. The code provides a Streamlit-based demo for comparing responses from two LLMs using Vertex AI Search, and includes a judge model for evaluating the responses. Overall, the implementation is well-structured and includes helpful features such as customizable prompts and clear instructions. However, there are a few areas that could be improved to enhance the code's robustness and maintainability.
Summary of Findings
- Error Handling in Prompt Loading: The prompt loading functions (load_text_file, _load_prompt_template) include error handling, but the error messages displayed to the user could be more informative. Consider providing more context or specific instructions to help users resolve the issue.
- Model Initialization: The initialize_models function attempts to list both Gemini and Ollama models. The error handling for Ollama model listing could be improved to provide more specific guidance to the user, such as checking if the Ollama server is running or if the correct URL is being used.
- Code Clarity and Maintainability: In several functions, there are opportunities to improve code clarity and maintainability by reducing redundancy and simplifying complex logic. For example, the generate_gemini_response and generate_ollama_response functions could benefit from a shared error handling mechanism.
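The shared error-handling mechanism suggested in the last finding could take the form of a small decorator wrapping both generator functions. This is a sketch under stated assumptions: the decorator, the backend labels, and the failing stub below are illustrative, not the PR's actual code.

```python
# One possible shared error handler for the two response generators:
# a decorator that converts backend failures into a uniform,
# user-actionable message instead of a raw traceback.
import functools

def with_generation_fallback(backend: str):
    """Wrap a response generator so any failure returns a labeled,
    actionable error string for display in the UI."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:  # demo-level catch-all on purpose
                return (f"[{backend} error] {exc}. Check that the "
                        f"{backend} service is reachable and configured "
                        f"(e.g. server URL, credentials).")
        return wrapper
    return decorator

@with_generation_fallback("ollama")
def generate_ollama_response(prompt: str) -> str:
    # Simulate the Ollama server being down.
    raise ConnectionError("connection refused on localhost:11434")

print(generate_ollama_response("hello"))
```

The same decorator would wrap generate_gemini_response with backend="gemini", removing the duplicated try/except blocks the review points out.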
Merge Readiness
The pull request is a valuable contribution and is mostly well-implemented. However, before merging, it's recommended to address the identified issues, particularly those related to error handling and code clarity. Addressing these points will improve the robustness and maintainability of the code. I am unable to approve this pull request, and users should have others review and approve this code before merging.
@AutoViML Can you please resolve the remaining spelling errors and lint errors?
Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- Read the CONTRIBUTING Guide.
- Ensure your account is listed in CODEOWNERS for the file(s).
- Run nox -s format from the repository root to format.

Fixes #<issue_number_goes_here> 🦕