Trainer multi label #7191
Conversation
        Tuple[Optional[float], Optional[torch.Tensor], Optional[torch.Tensor]]:
        A tuple with the loss, logits and labels (each being optional).
    """
    has_labels = any(inputs.get(k) is not None for k in ["labels", "lm_labels", "masked_lm_labels"])
No need to test for the old deprecated argument names, since they have all been changed in the lib and users can now set their own names if they are still using an old model.
good! this will reduce cruft
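A minimal, self-contained sketch (the function and argument names here are illustrative, not the PR's exact code) of what the check can reduce to once the label keys are configurable instead of hard-coded:

    def has_labels(inputs: dict, label_names=("labels",)) -> bool:
        # True only when every expected label key is present in the batch.
        return all(inputs.get(k) is not None for k in label_names)

    print(has_labels({"labels": [1, 0]}))  # True
    print(has_labels({"start_positions": [0]}, ("start_positions", "end_positions")))  # False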
Codecov Report

@@            Coverage Diff             @@
##           master    #7191      +/-   ##
==========================================
- Coverage   80.86%   79.41%   -1.46%
==========================================
  Files         169      169
  Lines       32293    32322      +29
==========================================
- Hits        26115    25668     -447
- Misses       6178     6654     +476

Continue to review full report at Codecov.
LysandreJik left a comment
LGTM, very nice addition.
def nested_xla_mesh_reduce(tensors, name):
    if is_torch_tpu_available():
        import torch_xla.core.xla_model as xm

        # Recurse into lists/tuples so nested model outputs are reduced element-wise,
        # giving each leaf its own reduce name.
        if isinstance(tensors, (list, tuple)):
            return type(tensors)(nested_xla_mesh_reduce(t, f"{name}_{i}") for i, t in enumerate(tensors))
        # Leaf tensors are gathered across TPU cores and concatenated.
        return xm.mesh_reduce(name, tensors, torch.cat)
    else:
        raise ImportError("Torch xla must be installed to use `nested_xla_mesh_reduce`")
Did you get a chance to test this on TPU?
No, I was planning to ask you about it this morning.
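For readers without TPU access, the same recursion pattern can be exercised with plain tensors. A hedged sketch (not necessarily the PR's exact helpers) of a nested concat that merges possibly nested tuples/lists of per-batch outputs along the batch dimension:

    import torch

    def nested_concat(a, b, dim=0):
        # Recurse into lists/tuples, preserving their type; concatenate tensors at the leaves.
        if isinstance(a, (list, tuple)):
            return type(a)(nested_concat(x, y, dim=dim) for x, y in zip(a, b))
        return torch.cat((a, b), dim=dim)

    batch1 = (torch.ones(2, 3), (torch.zeros(2), torch.zeros(2)))
    batch2 = (torch.ones(4, 3), (torch.ones(4), torch.ones(4)))
    merged = nested_concat(batch1, batch2)
    print(merged[0].shape)  # torch.Size([6, 3])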
* Trainer accepts multiple labels
* Missing import
* Fix docstrings
This is a follow-up to #7126. The same kinds of models that can output multiple predictions expect multiple labels (not named "labels"), so the evaluation code needs to be changed to handle this. To support models built by users, I added a label_names field in the TrainingArguments which contains the label names. If the user does not set it, it defaults to ["labels"] for most models and ["start_positions", "end_positions"] for question answering models, so it works seamlessly for all Transformers models.

I ended up writing a few util functions that concat/numpify tensors or nested lists/tuples of tensors, to avoid testing for this everywhere in Trainer. I think the design is cleaner this way, and it also supports models with crazy outputs (for instance if we set output_attentions=True). I also added a test for the multiple-label predictions.
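A minimal usage sketch of the new field (the output directory and values are illustrative): telling the Trainer which input keys hold the labels so evaluation can gather them alongside the logits.

    from transformers import TrainingArguments

    args = TrainingArguments(
        output_dir="qa_output",
        # Defaults to ["labels"] for most models; question answering models need both keys.
        label_names=["start_positions", "end_positions"],
    )
    print(args.label_names)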