
lm_eval to 0.4.9.1 and support for new args - rebased #2228

Merged
regisss merged 15 commits into huggingface:main from 12010486:lm_eval_new_args_rebased
Sep 2, 2025


@12010486
Contributor

Cherry-picked from #2193 to have it on main.

Changes are as follows:
Argument parsing and evaluation configuration:

  • Added new command-line arguments in run_lm_eval.py for controlling evaluation, including support for generation kwargs, few-shot and multi-turn settings, metadata, system instructions, chat template application, and sample selection. This allows for more granular and customizable evaluation runs.
  • Added try_parse_json to robustly handle JSON or plain-string input for generation arguments.
  • Updated main evaluation logic to pass new arguments through to the evaluator, and added validation for combinations of options (e.g., requiring chat template when using multi-turn few-shot).
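
As a rough illustration of the parsing and validation described above, here is a minimal sketch. It is not the code from this PR: the body of try_parse_json, the flag names (--gen_kwargs, --fewshot_as_multiturn, --apply_chat_template, modeled on the lm-eval CLI) and the helpers build_parser and validate_args are assumptions made for the example.

```python
import argparse
import json
from typing import Any, Optional


def try_parse_json(value: Optional[str]) -> Any:
    """Parse `value` as JSON when possible, otherwise return it unchanged.

    Assumed behavior: a plain "key=value,..." string is passed through,
    while something that looks like JSON but is malformed is rejected.
    """
    if value is None:
        return None
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        if "{" in value:
            # Looks like an attempt at JSON, so fail loudly instead of guessing.
            raise argparse.ArgumentTypeError(
                f"Invalid JSON: {value!r}. Hint: use double quotes for JSON strings."
            )
        return value


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical subset of the new arguments (names are assumptions)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gen_kwargs", type=try_parse_json, default=None)
    parser.add_argument("--fewshot_as_multiturn", action="store_true")
    parser.add_argument("--apply_chat_template", action="store_true")
    return parser


def validate_args(args: argparse.Namespace) -> None:
    """Reject option combinations that cannot work together."""
    if args.fewshot_as_multiturn and not args.apply_chat_template:
        raise ValueError("--fewshot_as_multiturn requires --apply_chat_template")


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    validate_args(parsed)
    print(parsed)
```

With this sketch, both `--gen_kwargs '{"max_new_tokens": 32}'` and `--gen_kwargs max_new_tokens=32` are accepted; the first arrives as a dict, the second as the original string for later parsing.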

Model adapter enhancements:

  • Added support for softmax_dtype, think_end_token, enable_thinking, and chat_template_args in HabanaModelAdapter initialization, enabling more advanced generation and prompt formatting features.
  • Improved bucket selection logic for static shape generation, and switched to using max_new_tokens for generation instead of max_length, aligning with HuggingFace's recommended API usage.
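
To make the bucket and max_new_tokens point concrete, here is a small sketch under stated assumptions. The helpers pick_bucket and new_token_budget are hypothetical and are not taken from HabanaModelAdapter. The sketch only illustrates why a fixed max_new_tokens is easier to reason about than max_length once prompts are padded to static-shape buckets, given that max_length counts prompt tokens plus generated tokens.

```python
from bisect import bisect_left
from typing import Sequence


def pick_bucket(length: int, buckets: Sequence[int]) -> int:
    """Return the smallest bucket size that can hold `length` tokens.

    Hypothetical helper: raises if the prompt is longer than every bucket.
    """
    ordered = sorted(buckets)
    idx = bisect_left(ordered, length)
    if idx == len(ordered):
        raise ValueError(f"length {length} exceeds largest bucket {ordered[-1]}")
    return ordered[idx]


def new_token_budget(max_length: int, padded_prompt_len: int) -> int:
    """Tokens left for generation when the limit is a total `max_length`."""
    return max(0, max_length - padded_prompt_len)


if __name__ == "__main__":
    buckets = [16, 32, 64, 128, 256]
    prompt_len = 100
    padded = pick_bucket(prompt_len, buckets)
    # With max_length, the generation budget shrinks as padding grows.
    print(padded, new_token_budget(160, padded))
```

In this example a 100-token prompt is padded to 128, so max_length=160 would leave only 32 tokens for generation instead of 60. Passing max_new_tokens directly keeps the generation budget independent of the bucket chosen.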

Dependency update:

  • Upgraded the lm-eval package to version 0.4.9.1 to support new features and bug fixes.

General improvements:

  • Minor refactoring for imports and typing to support new features.

@12010486 12010486 requested a review from regisss as a code owner August 26, 2025 13:54
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@karol-brejna-i karol-brejna-i self-assigned this Aug 27, 2025
Collaborator

@AKloniecki AKloniecki left a comment

We're currently working on running these changes in our CI, to check results after the upgrade. Will report back as soon as we get the results.

Comment thread examples/text-generation/model_adapter.py Outdated
Comment thread examples/text-generation/requirements_lm_eval.txt Outdated
Comment on lines 245 to 247
def get_model_info(self) -> dict:
    """
    Patched method to get Hugging Face model information for experiment reproducibility.
Collaborator

Please update the get_model_info() function to match the current version at https://github.com/EleutherAI/lm-evaluation-harness/blob/v0.4.9.1/lm_eval/models/huggingface.py#L1522
It seems to have changed since our last update.

Contributor Author

@AKloniecki, thank you very much for the review. On this specific point, we may be able to remove get_model_info() entirely, since it was added only to work around a glitch in neural_compressor_pt. I'm checking in the background whether that was fixed in v3.3.1, the latest version on SynapseAI 1.21.

Contributor Author

I can confirm that the error that required the get_model_info() patch is gone, so my last commit removes the function.

Comment thread examples/text-generation/run_lm_eval.py
@12010486 12010486 requested a review from AKloniecki August 27, 2025 15:00
Comment thread examples/text-generation/run_lm_eval.py Outdated
Collaborator

@regisss regisss left a comment

LGTM!

@regisss regisss merged commit 54ddb64 into huggingface:main Sep 2, 2025
2 of 5 checks passed
@12010486 12010486 deleted the lm_eval_new_args_rebased branch September 3, 2025 07:19
gplutop7 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Oct 15, 2025
…) (huggingface#641)

Co-authored-by: Silvia Colabrese <silvia.colabrese@intel.com>


6 participants