@AnuradhaKaruppiah
Contributor

The ragas nv_metrics require 3-8 tokens; temperature can be left at the default of 0.1.
Also adjusted the LLM model based on the leaderboard.

Description

Closes #202

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Signed-off-by: Anuradha Karuppiah <[email protected]>
@AnuradhaKaruppiah added the improvement (Improvement to existing functionality) and non-breaking (Non-breaking change) labels on May 2, 2025
@AnuradhaKaruppiah self-assigned this on May 2, 2025
@AnuradhaKaruppiah requested a review from Copilot on May 2, 2025 19:58
Copilot AI left a comment

Pull Request Overview

This PR updates the judge LLM settings used across the example configurations and documentation to align with the new leaderboard recommendations. Key changes include updating the model name from meta/llama-3.3-70b-instruct to meta/llama-3.1-70b-instruct, removing the explicit temperature and top_p parameters from the nim_rag_eval_llm configuration, and increasing max_tokens from 2–6 tokens to 8.
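
For orientation, the resulting judge LLM entry might look roughly like the sketch below. The `model_name` and `max_tokens` values are taken from the snippets quoted in this review; the enclosing `llms:` key and the `_type: nim` field are assumptions about the config layout and are not shown anywhere in this conversation.

```yaml
# Sketch only: the `llms:` nesting and `_type` are assumed, not quoted from the diff.
llms:
  nim_rag_eval_llm:
    _type: nim                                # assumed provider type
    model_name: meta/llama-3.1-70b-instruct   # judge model after this PR
    max_tokens: 8                             # ragas nv_metrics need 3-8 tokens
    # temperature and top_p are intentionally omitted, so the defaults apply
    # (the PR description gives the default temperature as 0.1)
```

Leaving `temperature` and `top_p` unset keeps every example config on the same defaults, so a later change to those defaults does not have to be repeated in each file.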

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| examples/simple/src/aiq_simple/configs/eval_upload_config.yml | Updated nim_rag_eval_llm configuration to use the new model and token count, removing unneeded temperature and top_p settings. |
| examples/simple/src/aiq_simple/configs/eval_config.yml | Adjusted nim_rag_eval_llm parameters to match the new standard. |
| examples/email_phishing_analyzer/configs/config.yml | Consistent update of nim_rag_eval_llm settings for email phishing analyzer. |
| examples/email_phishing_analyzer/configs/config-reasoning.yml | Similar update to nim_rag_eval_llm configuration. |
| examples/email_phishing_analyzer/configs/config-phi-3-mini-4k-instruct.yml | Updated nim_rag_eval_llm settings to reflect the new token count and model. |
| examples/email_phishing_analyzer/configs/config-phi-3-medium-4k-instruct.yml | Modified nim_rag_eval_llm configuration accordingly. |
| examples/email_phishing_analyzer/configs/config-mixtral-8x22b-instruct-v0.1.yml | Updated nim_rag_eval_llm to the new model name and token count. |
| examples/email_phishing_analyzer/configs/config-llama-3.3-70b-instruct.yml | Changed model references and removed explicit temperature and top_p parameters. |
| examples/email_phishing_analyzer/configs/config-llama-3.1-8b-instruct.yml | Updates mirror other nim_rag_eval_llm configurations. |
| examples/documentation_guides/workflows/text_file_ingest/src/text_file_ingest/configs/config.yml | Adjusted nim_rag_eval_llm settings for consistency with overall configuration changes. |
| docs/source/guides/evaluate.md | Documentation updated to reflect the new judge LLM model and token configuration, along with guidance on the recommended settings. |
Comments suppressed due to low confidence (2)

examples/simple/src/aiq_simple/configs/eval_upload_config.yml:42

  • Ensure that the removal of explicit 'temperature' and 'top_p' entries in the nim_rag_eval_llm configuration is intentional and that the defaults (e.g., a temperature of 0.1) are correctly applied across all environments.
max_tokens: 8

docs/source/guides/evaluate.md:115

  • Confirm that the updated judge LLM model name in the documentation aligns with the configuration changes across the project and reflects the intended leaderboard update.
model_name: meta/llama-3.1-70b-instruct

Signed-off-by: Anuradha Karuppiah <[email protected]>
@AnuradhaKaruppiah
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 06c8aeb into NVIDIA:develop May 2, 2025
10 checks passed
@AnuradhaKaruppiah AnuradhaKaruppiah deleted the eval-config branch May 6, 2025 00:48
yczhang-nv pushed a commit to yczhang-nv/NeMo-Agent-Toolkit that referenced this pull request May 8, 2025
…A#204)

Closes NVIDIA#202

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/AIQToolkit/blob/develop/docs/source/advanced/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah)

Approvers:
  - Eric Evans II (https://github.com/ericevans-nv)

URL: NVIDIA#204
Signed-off-by: Yuchen Zhang <[email protected]>
yczhang-nv pushed a commit to yczhang-nv/NeMo-Agent-Toolkit that referenced this pull request May 9, 2025
Signed-off-by: Yuchen Zhang <[email protected]>
ericevans-nv pushed a commit to ericevans-nv/agent-iq that referenced this pull request Jun 3, 2025
Signed-off-by: Eric Evans <[email protected]>
ericevans-nv pushed a commit to ericevans-nv/agent-iq that referenced this pull request Jun 3, 2025
Signed-off-by: Eric Evans <[email protected]>
AnuradhaKaruppiah added a commit to AnuradhaKaruppiah/oss-agentiq that referenced this pull request Aug 4, 2025
scheckerNV pushed a commit to scheckerNV/aiq-factory-reset that referenced this pull request Aug 22, 2025
Development

Successfully merging this pull request may close these issues.

[FEA]: Update examples to use optimized config for nim_rag_eval_llm
