Multilingual extractiveness #956

rolshoven · 2025-09-11T17:19:42Z

As part of a community task I have been collaborating on, I plan to open several PRs that add functionality to lighteval that our evaluations required. This PR adds German, French, and Italian support to the Extractiveness metric and introduces dedicated metrics for these languages that can be used in evaluations.
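For readers unfamiliar with the metric: extractiveness is commonly described in terms of extractive fragments, i.e. the token spans a summary copies from its source, summarised as coverage and density. The sketch below is a simplified illustration under stated assumptions, not the code in data_stats_metric.py: it tokenizes on whitespace instead of using spaCy, and the function names are hypothetical.

```python
def extractive_fragments(article_tokens, summary_tokens):
    """Greedily collect the longest token spans shared with the article."""
    fragments = []
    i = 0
    while i < len(summary_tokens):
        best = 0
        # Find the longest article span that matches the summary at position i.
        for j in range(len(article_tokens)):
            k = 0
            while (
                i + k < len(summary_tokens)
                and j + k < len(article_tokens)
                and summary_tokens[i + k] == article_tokens[j + k]
            ):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(summary_tokens[i : i + best])
            i += best
        else:
            i += 1  # token does not occur in the article at all
    return fragments


def coverage_and_density(article_text, summary_text):
    # Assumption: whitespace tokenization stands in for a spaCy tokenizer.
    article = article_text.lower().split()
    summary = summary_text.lower().split()
    if not summary:
        return 0.0, 0.0
    fragments = extractive_fragments(article, summary)
    coverage = sum(len(f) for f in fragments) / len(summary)
    density = sum(len(f) ** 2 for f in fragments) / len(summary)
    return coverage, density
```

Because both values depend on how the text is split into tokens, a tokenizer suited to the language of the data matters, which is presumably why language-specific spaCy models are needed here.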

Related issue: #955

I also had to update the spaCy dependency, because with the current pyproject.toml file I was running into dependency conflicts.
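A minimal sketch of how per-language model loading could look, following the load-then-download pattern visible in the diff further down. The model names are real spaCy packages, but the mapping and both function names are assumptions made for illustration; they are not taken from this PR.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical mapping from language code to a small spaCy model.
# The package names exist in spaCy; the mapping itself is an assumption.
SPACY_MODELS = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "it": "it_core_news_sm",
}


def spacy_model_for(language: str) -> str:
    """Return the spaCy model name for a language code."""
    try:
        return SPACY_MODELS[language]
    except KeyError:
        raise ValueError(f"Unsupported language: {language!r}") from None


def load_pipeline(language: str):
    """Load the model for `language`, downloading it on first use."""
    # Imported lazily so the mapping above is usable without spaCy installed.
    import spacy
    from spacy.cli import download

    spacy_model = spacy_model_for(language)
    try:
        return spacy.load(spacy_model)
    except OSError:
        # spacy.load raises OSError when the model package is not installed.
        logger.info(f"Downloading the spacy {spacy_model} model\n(don't worry, this will only happen once)")
        download(spacy_model)
        return spacy.load(spacy_model)
```

Calling `load_pipeline("de")` would then return a German pipeline, fetching `de_core_news_sm` first if it is missing.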

HuggingFaceDocBuilderDev · 2025-09-12T12:54:08Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

NathanHB

Looks good! Only a few nits and it's ready to merge :)

src/lighteval/metrics/imports/data_stats_metric.py

NathanHB · 2025-09-12T12:53:51Z

+            self.nlp = spacy.load(spacy_model)
        except OSError:
-            logger.info("Downloading the spacy en_core_web_sm model\n(don't worry, this will only happen once)")
+            logger.info("Downloading the spacy %s model\n(don't worry, this will only happen once)", spacy_model)


Better to use f-strings here.

rolshoven · 2025-09-12

I’ve updated it :-) Out of curiosity though: I used to rely on f-strings until a linter flagged them in another project because of this rule: https://docs.astral.sh/ruff/rules/logging-f-string/

Personally, I also prefer f-strings, but I was wondering if their use here is just a matter of preference in lighteval’s coding guidelines or if there’s another reason behind it.

NathanHB · 2025-09-15

I see. We will add this rule in a later PR; this is just to keep the formatting consistent throughout the repo!
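To make the trade-off in this thread concrete, here is a small sketch of the two logging styles. The linked Ruff rule (logging-f-string, G004) prefers the %-style form because the logging module only interpolates the arguments when a record is actually formatted; the f-string is always built before the call. The logger name and function names below are illustrative assumptions.

```python
import logging

logger = logging.getLogger("extractiveness_demo")


def log_download_lazy(spacy_model: str) -> None:
    # %-style: the template and the argument are stored separately on the
    # log record and merged only when the record is formatted.
    logger.info("Downloading the spacy %s model", spacy_model)


def log_download_fstring(spacy_model: str) -> None:
    # f-string: the message is fully built before logger.info is called,
    # even if INFO logging is disabled.
    logger.info(f"Downloading the spacy {spacy_model} model")
```

Both calls emit the same text, so for a one-off message like this the choice is mostly about consistency within the repository.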

NathanHB · 2025-09-15T12:22:53Z

The end-to-end test fails after a vLLM release; I will patch it ASAP. In the meantime, can you run make quality?

rolshoven · 2025-09-15T13:45:56Z

> The end-to-end test fails after a vLLM release; I will patch it ASAP. In the meantime, can you run make quality?

I ran it and fixed a missing newline after the imports in src/lighteval/metrics/imports/data_stats_metric.py; now all checks pass.

* Added German, French, and Italian language support to Extractiveness metric
* Added minimum version for spacy dependency
* Added changes from code review
* Added missing newline

---------

Co-authored-by: Nathan Habib <[email protected]>

rolshoven and others added 3 commits September 11, 2025 19:12

Added German, French, and Italian language support to Extractiveness …

5778b0e

…metric

Added minimum version for spacy dependency

30a4a6c

Merge branch 'main' into multilingual-extractiveness

0b80de6

NathanHB linked an issue Sep 12, 2025 that may be closed by this pull request

[FT] Multilingual Extractiveness Metrics #955

Closed

NathanHB reviewed Sep 12, 2025

Added changes from code review

2334bbd

NathanHB added the feature label Sep 15, 2025

NathanHB and others added 2 commits September 15, 2025 14:41

Merge branch 'main' into multilingual-extractiveness

ce7e32f

Added missing newline

7bcffa9

Merge branch 'main' into multilingual-extractiveness

4c89266

NathanHB approved these changes Sep 16, 2025

NathanHB merged commit 2dc1788 into huggingface:main Sep 16, 2025
4 checks passed
