@AnuradhaKaruppiah commented Aug 10, 2025

Description

Closes: #352
This PR enables custom dataset formats by adding a new EvalDatasetCustomConfig class that allows users to specify custom Python functions for dataset transformation. The implementation provides a standardized interface for custom parsers while maintaining compatibility with existing functionality.

  • Adds EvalDatasetCustomConfig class for custom dataset handling
  • Implements custom function loading and execution with error handling
  • Adds preprocessing utilities to apply standard filters and transformations
  • Includes example implementation and documentation
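
As a rough illustration of what "custom function loading and execution with error handling" can look like, here is a minimal sketch that resolves a dotted path to a callable using the standard library. The helper names `load_custom_function` and `run_custom_function` are hypothetical and are not taken from this PR; the actual implementation in `EvalDatasetCustomConfig` may differ.

```python
import importlib
from typing import Any, Callable


def load_custom_function(dotted_path: str) -> Callable[..., Any]:
    """Resolve 'package.module.function' to a callable (hypothetical helper)."""
    module_name, sep, func_name = dotted_path.rpartition(".")
    if not sep or not module_name or not func_name:
        raise ValueError(f"Expected 'module.function', got {dotted_path!r}")
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise ValueError(f"Cannot import module {module_name!r}") from exc
    func = getattr(module, func_name, None)
    if not callable(func):
        raise ValueError(f"{func_name!r} is not a callable in {module_name!r}")
    return func


def run_custom_function(dotted_path: str, *args: Any, **kwargs: Any) -> Any:
    """Load the function and call it, wrapping failures with context."""
    func = load_custom_function(dotted_path)
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        # Keep the original exception chained so the root cause stays visible.
        raise RuntimeError(f"Custom function {dotted_path!r} failed") from exc
```

Separating loading errors (`ValueError`) from execution errors (`RuntimeError`) makes it easier to tell a misconfigured path apart from a bug inside the user's parser.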

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Signed-off-by: Anuradha Karuppiah <[email protected]>
@AnuradhaKaruppiah added the improvement (Improvement to existing functionality) and non-breaking (Non-breaking change) labels Aug 10, 2025
AnuradhaKaruppiah and others added 5 commits August 11, 2025 14:18
Co-authored-by: Copilot <[email protected]>

AnuradhaKaruppiah and others added 2 commits August 11, 2025 15:02
…iq_simple_calculator_eval/scripts/custom_dataset_parser.py


@yczhang-nv left a comment

LGTM, except for an empty file in the PR.

This reverts commit 685c96a.

Copilot AI left a comment

Pull Request Overview

This PR enables custom dataset formats by adding a new EvalDatasetCustomConfig class that allows users to specify custom Python functions for dataset transformation. The implementation provides a standardized interface for custom parsers while maintaining compatibility with existing functionality.

Key changes:

  • Added EvalDatasetCustomConfig class for custom dataset handling with function loading and execution
  • Implemented preprocessing utilities to apply standard filters and transformations to custom datasets
  • Updated EvalInputItem model to provide default values for optional fields
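
The last two points can be sketched together. The example below is an assumption-laden illustration, not the code from this PR: `SketchInputItem` is a hypothetical stand-in for `EvalInputItem`, its field names are illustrative, and `preprocess` shows one plausible set of "standard filters" (an ID allowlist, de-duplication by ID, and a row limit).

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Optional


@dataclass
class SketchInputItem:
    """Hypothetical stand-in for an evaluation input item."""
    id: str
    input_obj: Any
    expected_output_obj: Any = None
    # Fields a workflow would populate later. Giving them defaults means a
    # custom parser only has to supply what the dataset actually contains.
    output_obj: Any = None
    trajectory: list = field(default_factory=list)
    full_dataset_entry: dict = field(default_factory=dict)


def preprocess(items: Iterable[SketchInputItem],
               allow_ids: Optional[set] = None,
               limit: Optional[int] = None) -> list:
    """Apply an allowlist, drop duplicate IDs, then truncate to `limit`."""
    seen = set()
    kept = []
    for item in items:
        if allow_ids is not None and item.id not in allow_ids:
            continue
        if item.id in seen:
            continue  # keep only the first item for each ID
        seen.add(item.id)
        kept.append(item)
    return kept if limit is None else kept[:limit]
```

Running the same filters over parser output as over built-in formats is what keeps a custom dataset compatible with the rest of the evaluation flow.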

Reviewed Changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/nat/data_models/dataset_handler.py | Adds EvalDatasetCustomConfig class with custom function loading capabilities |
| src/nat/eval/dataset_handler/dataset_handler.py | Implements custom dataset handling logic and preprocessing utilities |
| src/nat/eval/evaluator/evaluator_model.py | Updates EvalInputItem to provide default values for workflow-populated fields |
| tests/nat/eval/dataset_handler/test_dataset_handler.py | Adds comprehensive tests for custom dataset functionality |
| examples/evaluation_and_profiling/simple_calculator_eval/ | Provides example implementation with nested JSON parser and configuration |
| docs/source/reference/evaluate.md | Documents custom dataset format usage and API |
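
For readers unfamiliar with what a "nested JSON parser" does in this context, the following is a minimal sketch. The JSON layout (a top-level `questions` list with `id`, `question`, `answer`, and `difficulty` keys) and the function names are assumptions made for illustration; they are not taken from the example shipped in this PR.

```python
import json
from typing import Any, Optional


def flatten_nested_dataset(data: dict,
                           difficulty: Optional[str] = None) -> list:
    """Flatten a nested document into one flat row per question."""
    rows = []
    for index, entry in enumerate(data.get("questions", [])):
        if difficulty is not None and entry.get("difficulty") != difficulty:
            continue
        rows.append({
            # Fall back to the position in the list when no ID is given.
            "id": str(entry.get("id", index)),
            "question": entry["question"],
            "answer": entry.get("answer"),
        })
    return rows


def parse_nested_dataset(file_path: str, **filters: Any) -> list:
    """Read a JSON file and flatten it (thin wrapper around the function above)."""
    with open(file_path, encoding="utf-8") as handle:
        return flatten_nested_dataset(json.load(handle), **filters)
```

Keeping the file I/O in a thin wrapper makes the flattening logic easy to unit test with plain dictionaries.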


@AnuradhaKaruppiah

/merge

@rapids-bot rapids-bot bot merged commit 5d26b75 into NVIDIA:develop Aug 14, 2025
12 checks passed
saglave pushed a commit to snps-scm13/SNPS-NeMo-Agent-Toolkit that referenced this pull request Sep 2, 2025
Closes: NVIDIA#352
## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah)

Approvers:
  - Yuchen Zhang (https://github.com/yczhang-nv)

URL: NVIDIA#615
Signed-off-by: Sangharsh Aglave <[email protected]>
@AnuradhaKaruppiah AnuradhaKaruppiah deleted the ak-customize-input-2 branch September 19, 2025 00:00
Labels

improvement (Improvement to existing functionality), non-breaking (Non-breaking change)

Development

Successfully merging this pull request may close these issues.

[FEA]: Eval dataset changes to work with more diverse workflows

2 participants