lm_eval to 0.4.9.1 and support for new args - rebased #2228
AKloniecki
left a comment
We're currently working on running these changes in our CI, to check results after the upgrade. Will report back as soon as we get the results.
def get_model_info(self) -> dict:
    """
    Patched method to get Hugging Face model information for experiment reproducibility.
Please update the get_model_info() function to match the current version at https://github.com/EleutherAI/lm-evaluation-harness/blob/v0.4.9.1/lm_eval/models/huggingface.py#L1522
It seems to have changed since our last update.
@AKloniecki, thank you very much for the review. On this specific point, we may be able to remove get_model_info() entirely, since it was added only to work around a glitch in neural_compressor_pt. I'm checking in the background whether that was fixed in v3.3.1, the latest version on SynapseAI 1.21.
I can confirm that the error that required the get_model_info() patch is gone, so my last commit removes the function.
…) (huggingface#641) Co-authored-by: Silvia Colabrese <silvia.colabrese@intel.com>
Cherry-picked from #2193 to have it on main.
Changes are as follows:
Argument parsing and evaluation configuration:
- Added arguments to run_lm_eval.py for controlling evaluation, including support for generation kwargs, few-shot and multi-turn settings, metadata, system instructions, chat template application, and sample selection. This allows for more granular and customizable evaluation runs.
- Added try_parse_json to robustly handle JSON or string input for generation arguments.
Model adapter enhancements:
- Added support for softmax_dtype, think_end_token, enable_thinking, and chat_template_args in HabanaModelAdapter initialization, enabling more advanced generation and prompt formatting features.
- Switched to max_new_tokens for generation instead of max_length, aligning with Hugging Face's recommended API usage.
Dependency update:
- Updated the lm-eval package to version 0.4.9.1 to support new features and bug fixes.
General improvements:
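For reviewers who want a feel for the try_parse_json behavior described under argument parsing above, here is a minimal sketch. It is not the PR's implementation, which is not shown in this conversation. The function body, its fallback behavior, and the example argument strings are all assumptions made for illustration: the value is parsed as JSON when possible and returned unchanged otherwise.

```python
import json
from typing import Any


def try_parse_json(value: Any) -> Any:
    """Hypothetical sketch: parse `value` as JSON if possible, else return it unchanged."""
    if not isinstance(value, str):
        # Non-string input (e.g. None or an already-parsed dict) passes through.
        return value
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        # Not valid JSON: keep the raw string so a later step can handle it,
        # e.g. a comma-separated "key=value" list.
        return value


# Illustrative inputs only; these are not taken from the PR.
parsed = try_parse_json('{"temperature": 0.7, "do_sample": true}')
raw = try_parse_json("temperature=0.7,do_sample=True")
```

One caveat of this approach: any string that happens to be valid JSON is converted, so "5" would come back as the integer 5 rather than as a string.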