rolshoven
Contributor

As part of a community task I have been collaborating on, I plan to open several PRs that add functionality to lighteval that our evaluations required. This PR adds German, French, and Italian support to the Extractiveness metric and introduces dedicated metrics for these languages that can be used in evaluations.

Related issue: #955

I also had to update the spaCy dependency; otherwise I ran into dependency conflicts with the current pyproject.toml file.
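
For illustration only, here is a minimal sketch of how a language code might be mapped to a spaCy model and loaded with a one-time download fallback. The names `SPACY_MODELS`, `resolve_spacy_model`, and `load_spacy_pipeline` are hypothetical and are not taken from this PR; the model names are the small pipelines that spaCy publishes for these languages, and the choice of the small variants is an assumption.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical mapping from language code to a small spaCy pipeline.
# The package names are published by spaCy; the mapping itself is an assumption.
SPACY_MODELS = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "it": "it_core_news_sm",
}


def resolve_spacy_model(language: str) -> str:
    """Return the spaCy model name for a language code, or raise ValueError."""
    try:
        return SPACY_MODELS[language]
    except KeyError:
        raise ValueError(f"Unsupported language: {language!r}") from None


def load_spacy_pipeline(language: str):
    """Load the pipeline for `language`, downloading the model on first use."""
    # Imported lazily so the mapping above can be used without spaCy installed.
    import spacy
    from spacy.cli import download

    model = resolve_spacy_model(language)
    try:
        return spacy.load(model)
    except OSError:
        # spacy.load raises OSError when the model package is not installed.
        logger.info("Downloading the spacy %s model", model)
        download(model)
        return spacy.load(model)
```

Under these assumptions, `load_spacy_pipeline("de")` would return a German pipeline and trigger a download only if the model is missing.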

@NathanHB NathanHB linked an issue Sep 12, 2025 that may be closed by this pull request
@HuggingFaceDocBuilderDev
Collaborator

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@NathanHB NathanHB left a comment

Looks good! Only a few nits, then ready to merge :)

              self.nlp = spacy.load(spacy_model)
          except OSError:
-             logger.info("Downloading the spacy en_core_web_sm model\n(don't worry, this will only happen once)")
+             logger.info("Downloading the spacy %s model\n(don't worry, this will only happen once)", spacy_model)
Member

Better to use f-strings here.

Contributor Author

I’ve updated it :-) Out of curiosity though: I used to rely on f-strings until a linter flagged it in another project because of this rule: https://docs.astral.sh/ruff/rules/logging-f-string/

Personally, I also prefer f-strings, but I was wondering if their use here is just a matter of preference in lighteval’s coding guidelines or if there’s another reason behind it.
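
To make the trade-off behind that Ruff rule concrete, here is a small standard-library sketch. The `Expensive` class and the logger name are hypothetical and exist only for this demonstration: with %-style arguments the message is formatted only if the record is actually emitted, while an f-string is formatted before `logger.info` is even called.

```python
import logging


class Expensive:
    """Hypothetical helper that counts how often it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "de_core_news_sm"


logger = logging.getLogger("logging_demo")
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.WARNING)  # INFO records are dropped by this logger

lazy = Expensive()
# %-style: the argument is only formatted if the record is emitted, so never here.
logger.info("Downloading the spacy %s model", lazy)

eager = Expensive()
# f-string: the message is built before the call, even though it is then dropped.
logger.info(f"Downloading the spacy {eager} model")

print(lazy.renders, eager.renders)
```

For a plain string such as a model name the cost difference is negligible, so either style is defensible on performance grounds; consistency within a repository is a reasonable tie-breaker.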

Member

I see. We will add this rule in a later PR; this is just to keep the formatting consistent throughout the repo!

@NathanHB
Member

The end-to-end test fails after a vLLM release; I will patch it ASAP. In the meantime, can you run make quality?

@rolshoven
Contributor Author

> The end-to-end test fails after a vLLM release; I will patch it ASAP. In the meantime, can you run make quality?

I ran it and fixed a missing newline after the imports in src/lighteval/metrics/imports/data_stats_metric.py, now all checks pass.

@NathanHB NathanHB merged commit 2dc1788 into huggingface:main Sep 16, 2025
4 checks passed
NathanHB added a commit that referenced this pull request Sep 19, 2025
* Added German, French, and Italian language support to Extractiveness metric

* Added minimum version for spacy dependency

* Added changes from code review

* Added missing newline

---------

Co-authored-by: Nathan Habib <[email protected]>
Successfully merging this pull request may close these issues.

[FT] Multilingual Extractiveness Metrics
3 participants