
feat(rag): add dual LLM RAG example#1971

Merged
holtskinner merged 14 commits into GoogleCloudPlatform:main from AutoViML:main
May 30, 2025

Conversation

@AutoViML
Member

Description

Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • [x] Follow the CONTRIBUTING Guide.
  • [x] You are listed as the author in your notebook or README file.
    • [x] Your account is listed in CODEOWNERS for the file(s).
  • [x] Ensure your Pull Request title follows the https://www.conventionalcommits.org/ specification.
  • [x] Ensure the tests and linter pass (run nox -s format from the repository root to format).
  • [x] Appropriate docs were updated (if necessary).

Fixes #<issue_number_goes_here> 🦕

Ram Seshadri and others added 7 commits March 27, 2025 21:04
…ertex_rag_demo_dual_llms.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@AutoViML AutoViML requested a review from a team as a code owner April 15, 2025 14:45
Contributor

gemini-code-assist bot left a comment


Hello @AutoViML, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces a new feature: a dual LLM RAG (Retrieval-Augmented Generation) example. This demo allows comparison of responses from two different LLMs using Vertex AI Search over a corpus of documents. It also includes an optional judge model to evaluate the two responses and determine which one is better aligned with the query and context. The demo is built using Streamlit and includes instructions for customization, such as modifying prompts for different use cases.

Highlights

  • Dual LLM Comparison: Enables comparing responses from two LLMs using Vertex AI Search.
  • Judge Model Evaluation: Option to use a judge model to evaluate and compare the responses from the two LLMs.
  • Customizable Prompts: Provides instructions on how to customize the prompts for different use cases by modifying text files in the prompts folder.
  • Streamlit Demo: Implements a Streamlit application for easy interaction and demonstration of the dual LLM RAG system.
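The flow described in the highlights above can be sketched as a small pipeline. The function names and prompt wording here are illustrative assumptions, not the PR's actual API; the real calls to Vertex AI Search and the Gemini/Ollama models are stubbed out as injected callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class RagResult:
    response_a: str
    response_b: str
    verdict: Optional[str]  # judge output, None when no judge model is enabled

def dual_llm_rag(
    query: str,
    rephrase: Callable[[str], str],        # user query -> search-engine-style query
    retrieve: Callable[[str], List[str]],  # search query -> relevant document snippets
    model_a: Callable[[str], str],         # grounded prompt -> LLM A's answer
    model_b: Callable[[str], str],         # grounded prompt -> LLM B's answer
    judge: Optional[Callable[[str, str, str], str]] = None,
) -> RagResult:
    """Compare two LLMs' RAG answers, optionally evaluated by a third 'judge' model."""
    search_query = rephrase(query)
    context = "\n\n".join(retrieve(search_query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."
    answer_a = model_a(prompt)
    answer_b = model_b(prompt)
    verdict = judge(query, answer_a, answer_b) if judge else None
    return RagResult(answer_a, answer_b, verdict)
```

Keeping the rephraser, retriever, models, and judge as injected callables mirrors the demo's design of swapping prompts and models without changing the pipeline itself.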

Changelog

Click here to see the changelog
  • search/retrieval-augmented-generation/rag_with_dual_llms/README.md
    • Added a README file with instructions on how to run the demo, including setting up Google Cloud SDK, authenticating, installing requirements, and running the Streamlit app.
    • Includes instructions on how to customize the prompts for different use cases.
  • search/retrieval-augmented-generation/rag_with_dual_llms/requirements.txt
    • Added necessary dependencies such as streamlit, ollama, langchain, google-generativeai, requests, PyPDF2, tiktoken, faiss-cpu, pandas, fsspec, gcsfs, google-cloud-aiplatform, google-cloud-discoveryengine, google-genai, langchain-google-vertexai, langchain-google-community, asynciolimiter, asyncio, tqdm, re, and json.
  • search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/judge_model_name.txt
    • Added a text file containing the name of the judge model to be used (gemini-2.0-flash-thinking-exp-01-21).
  • search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/judge_prompt.txt
    • Added a prompt for the judge model, instructing it to analyze two responses and provide a judgment on which response is better and why.
  • search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/rephraser.txt
    • Added a prompt for rephrasing user queries into concise search engine queries, specifically tailored for a chef advisor use case.
  • search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/summarizer.txt
    • Added a prompt for summarizing documents retrieved from a recipe database, tailored for an Asian cuisine advisor use case.
  • search/retrieval-augmented-generation/rag_with_dual_llms/src/prompts/system_instruction.txt
    • Added system instructions for the AI chatbot, defining its role as a cooking assistance chatbot and preventing mission changes.
  • search/retrieval-augmented-generation/rag_with_dual_llms/src/vertex_rag_demo_dual_llms_with_judge.py
    • Created a Streamlit application that compares responses from two LLMs using Vertex AI Search.
    • Includes functionality for rephrasing queries, retrieving relevant documents, generating summaries, and evaluating responses with an optional judge model.
    • Added model selection, RAG setup, and chat history management.
    • Implements logic for generating responses using Gemini and Ollama models.
    • Includes argument parsing for enabling the judge model via command line.
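The command-line toggle mentioned in the last bullet might be wired up roughly like this; the `--use-judge` flag name is an assumption for illustration, not necessarily the flag used in the PR.

```python
import argparse

def parse_args(argv=None):
    """Parse demo options; the judge model is off unless explicitly requested."""
    parser = argparse.ArgumentParser(description="Dual LLM RAG demo")
    parser.add_argument(
        "--use-judge",
        action="store_true",
        help="Evaluate both responses with a third 'judge' model",
    )
    return parser.parse_args(argv)
```

Note that Streamlit consumes its own flags first, so script-level arguments must come after a `--` separator, e.g. `streamlit run app.py -- --use-judge`.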
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in issue comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Two models converse with might,
RAG weaves knowledge in the light,
A judge then casts a discerning eye,
Which answer soars, reaching for the sky?
In AI's realm, the best we try.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

gemini-code-assist bot left a comment


Code Review

This pull request introduces a dual LLM RAG example, which is a valuable addition to the repository. The code provides a Streamlit-based demo for comparing responses from two LLMs using Vertex AI Search, and includes a judge model for evaluating the responses. Overall, the implementation is well-structured and includes helpful features such as customizable prompts and clear instructions. However, there are a few areas that could be improved to enhance the code's robustness and maintainability.

Summary of Findings

  • Error Handling in Prompt Loading: The prompt loading functions (load_text_file, _load_prompt_template) include error handling, but the error messages displayed to the user could be more informative. Consider providing more context or specific instructions to help users resolve the issue.
  • Model Initialization: The initialize_models function attempts to list both Gemini and Ollama models. The error handling for Ollama model listing could be improved to provide more specific guidance to the user, such as checking if the Ollama server is running or if the correct URL is being used.
  • Code Clarity and Maintainability: In several functions, there are opportunities to improve code clarity and maintainability by reducing redundancy and simplifying complex logic. For example, the generate_gemini_response and generate_ollama_response functions could benefit from a shared error handling mechanism.
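One way to implement the shared error-handling mechanism suggested in the last finding is a decorator that both generator functions can share. The decorator below is a sketch of that suggestion, not code from the PR; the function bodies are placeholders standing in for the real Gemini and Ollama API calls.

```python
import functools
import logging

logger = logging.getLogger(__name__)

def safe_generation(model_label):
    """Wrap a response generator so any failure returns a uniform, user-facing fallback."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Log the full traceback for debugging, surface a short message to the UI.
                logger.exception("%s generation failed", model_label)
                return f"[{model_label} error] {exc}. Check the model server and retry."
        return wrapper
    return decorator

@safe_generation("Gemini")
def generate_gemini_response(prompt):
    raise RuntimeError("quota exceeded")  # placeholder for a real Vertex AI call

@safe_generation("Ollama")
def generate_ollama_response(prompt):
    return "stub answer"  # placeholder for a real local Ollama call
```

This keeps the per-model functions focused on their API calls while error formatting and logging live in one place, which addresses the redundancy the review points out.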

Merge Readiness

The pull request is a valuable contribution and is mostly well-implemented. However, before merging, it's recommended to address the identified issues, particularly those related to error handling and code clarity. Addressing these points will improve the robustness and maintainability of the code. I am unable to approve this pull request; please have others review and approve this code before merging.

@holtskinner
Collaborator

@AutoViML Can you please resolve the remaining spelling errors and lint errors?

@holtskinner holtskinner merged commit 08814d9 into GoogleCloudPlatform:main May 30, 2025
4 of 6 checks passed