Update the Judge LLM settings in the examples to avoid retries #204
The ragas nv_metrics require 3-8 tokens; temperature can be left at the default of 0.1. Also adjusted the LLM model based on the leaderboard. Signed-off-by: Anuradha Karuppiah <[email protected]>
Pull Request Overview
This PR updates the judge LLM settings used across the example configurations and documentation to align with the new leaderboard recommendations. Key changes: the model name changes from meta/llama-3.3-70b-instruct to meta/llama-3.1-70b-instruct, the explicit temperature and top_p parameters are removed from the nim_rag_eval_llm configuration, and max_tokens increases from 2–6 tokens to 8 tokens.
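As a rough sketch, the updated judge LLM entry might look like the fragment below. Only `nim_rag_eval_llm`, `model_name`, `max_tokens`, `temperature`, and `top_p` come from this PR; the `llms:` parent key and the `_type: nim` line are assumptions about the surrounding config layout and should be checked against the actual example files.

```yaml
# Hypothetical fragment of an example config; the `llms:` parent key and
# `_type: nim` are assumed, not quoted from the PR.
llms:
  nim_rag_eval_llm:
    _type: nim
    # Judge model updated in this PR (previously meta/llama-3.3-70b-instruct).
    model_name: meta/llama-3.1-70b-instruct
    # The ragas nv_metrics need 3-8 tokens, so 8 leaves room for the full answer.
    max_tokens: 8
    # temperature and top_p are intentionally omitted so the defaults apply
    # (the PR description cites a default temperature of 0.1).
```

Leaving `temperature` and `top_p` unset keeps every example on the same defaults, which is the concern raised in the suppressed review comment on eval_upload_config.yml.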
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| examples/simple/src/aiq_simple/configs/eval_upload_config.yml | Updated nim_rag_eval_llm configuration to use the new model and token count, removing unneeded temperature and top_p settings. |
| examples/simple/src/aiq_simple/configs/eval_config.yml | Adjusted nim_rag_eval_llm parameters to match the new standard. |
| examples/email_phishing_analyzer/configs/config.yml | Consistent update of nim_rag_eval_llm settings for email phishing analyzer. |
| examples/email_phishing_analyzer/configs/config-reasoning.yml | Similar update to nim_rag_eval_llm configuration. |
| examples/email_phishing_analyzer/configs/config-phi-3-mini-4k-instruct.yml | Updated nim_rag_eval_llm settings to reflect the new token count and model. |
| examples/email_phishing_analyzer/configs/config-phi-3-medium-4k-instruct.yml | Modified nim_rag_eval_llm configuration accordingly. |
| examples/email_phishing_analyzer/configs/config-mixtral-8x22b-instruct-v0.1.yml | Updated nim_rag_eval_llm to the new model name and token count. |
| examples/email_phishing_analyzer/configs/config-llama-3.3-70b-instruct.yml | Changed model references and removed explicit temperature and top_p parameters. |
| examples/email_phishing_analyzer/configs/config-llama-3.1-8b-instruct.yml | Updates mirror other nim_rag_eval_llm configurations. |
| examples/documentation_guides/workflows/text_file_ingest/src/text_file_ingest/configs/config.yml | Adjusted nim_rag_eval_llm settings for consistency with overall configuration changes. |
| docs/source/guides/evaluate.md | Documentation updated to reflect the new judge LLM model and token configuration, along with guidance on the recommended settings. |
Comments suppressed due to low confidence (2)
examples/simple/src/aiq_simple/configs/eval_upload_config.yml:42
- Ensure that the removal of explicit 'temperature' and 'top_p' entries in the nim_rag_eval_llm configuration is intentional and that the defaults (e.g., a temperature of 0.1) are correctly applied across all environments.
max_tokens: 8
docs/source/guides/evaluate.md:115
- Confirm that the updated judge LLM model name in the documentation aligns with the configuration changes across the project and reflects the intended leaderboard update.
model_name: meta/llama-3.1-70b-instruct
/merge
…A#204) Closes NVIDIA#202 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/AIQToolkit/blob/develop/docs/source/advanced/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah) Approvers: - Eric Evans II (https://github.com/ericevans-nv) URL: NVIDIA#204 Signed-off-by: Yuchen Zhang <[email protected]>
…A#204) Closes NVIDIA#202 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/AIQToolkit/blob/develop/docs/source/advanced/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah) Approvers: - Eric Evans II (https://github.com/ericevans-nv) URL: NVIDIA#204 Signed-off-by: Eric Evans <[email protected]>
…A#204) Closes NVIDIA#202 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/AIQToolkit/blob/develop/docs/source/advanced/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah) Approvers: - Eric Evans II (https://github.com/ericevans-nv) URL: NVIDIA#204
Description
Closes #202
By Submitting this PR I confirm: