sgugger (Collaborator) commented Sep 16, 2020

This is a follow-up to #7126. The same kinds of models that can output multiple predictions also expect multiple labels (not necessarily named "labels"), so the evaluation code needs to change accordingly. To support models built by users, I added a label_names field to TrainingArguments that contains the names of the label inputs. When the user does not set it, it defaults to ["labels"] for most models and to ["start_positions", "end_positions"] for question answering models, so it works seamlessly for all Transformers models.

I ended up writing a few util functions that concat/numpify tensors or nested lists/tuples of tensors, to avoid testing for every case throughout Trainer. I think the design is cleaner this way, and it also supports models with unusual outputs (for instance, if we set output_attentions=True). I also added a test for predictions with multiple labels.
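The nested helpers described above can be pictured with a short sketch. This is illustrative only: the real helpers operate on torch tensors, while NumPy arrays are used here to keep the example self-contained, and the function name is an assumption rather than the exact library code.

```python
import numpy as np

# Illustrative sketch of the nested concat pattern described above.
# NumPy stands in for torch tensors so the example is self-contained.

def nested_concat(tensors, new_tensors, axis=0):
    """Concatenate leaf arrays, recursing through lists/tuples of arrays."""
    if isinstance(tensors, (list, tuple)):
        return type(tensors)(
            nested_concat(t, n, axis=axis) for t, n in zip(tensors, new_tensors)
        )
    return np.concatenate((tensors, new_tensors), axis=axis)
```

With a helper like this, accumulation code never needs to test whether a model returned a single tensor or a nested tuple of them (e.g. when output_attentions=True adds extra outputs).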

Tuple[Optional[float], Optional[torch.Tensor], Optional[torch.Tensor]]:
A tuple with the loss, logits and labels (each being optional).
"""
has_labels = any(inputs.get(k) is not None for k in ["labels", "lm_labels", "masked_lm_labels"])
sgugger (Collaborator, Author):
No need to test for the old deprecated argument names since they have all been changed in the lib, and users can now set their own names if they are still running an old model.
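Concretely, the check can then be driven by the configured label names instead of a hard-coded list. A minimal sketch, where the function and argument names are hypothetical stand-ins rather than the actual Trainer code:

```python
def has_labels(inputs: dict, label_names: list) -> bool:
    """True when every configured label key is present in the batch.

    `label_names` plays the role of TrainingArguments.label_names described
    in the PR: ["labels"] for most models, ["start_positions", "end_positions"]
    for question answering. Hypothetical stand-in, not the actual Trainer code.
    """
    return all(inputs.get(k) is not None for k in label_names)
```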

Member:
good! this will reduce cruft

codecov bot commented Sep 16, 2020

Codecov Report

Merging #7191 into master will decrease coverage by 1.45%.
The diff coverage is 70.83%.


@@            Coverage Diff             @@
##           master    #7191      +/-   ##
==========================================
- Coverage   80.86%   79.41%   -1.46%     
==========================================
  Files         169      169              
  Lines       32293    32322      +29     
==========================================
- Hits        26115    25668     -447     
- Misses       6178     6654     +476     
Impacted Files Coverage Δ
src/transformers/trainer_utils.py 58.88% <56.52%> (-1.41%) ⬇️
src/transformers/trainer.py 55.72% <80.00%> (+0.25%) ⬆️
src/transformers/training_args.py 91.74% <100.00%> (+0.07%) ⬆️
...c/transformers/modeling_tf_transfo_xl_utilities.py 10.00% <0.00%> (-76.00%) ⬇️
src/transformers/modeling_tf_xlnet.py 20.85% <0.00%> (-71.41%) ⬇️
src/transformers/modeling_tf_transfo_xl.py 19.85% <0.00%> (-68.29%) ⬇️
src/transformers/modeling_tf_gpt2.py 71.59% <0.00%> (-23.38%) ⬇️
src/transformers/modeling_lxmert.py 70.01% <0.00%> (-20.75%) ⬇️
src/transformers/modeling_mobilebert.py 79.21% <0.00%> (-10.25%) ⬇️
src/transformers/modeling_openai.py 72.25% <0.00%> (-10.00%) ⬇️
... and 13 more


LysandreJik (Member) left a comment:

LGTM, very nice addition.

Comment on lines +162 to +171
def nested_xla_mesh_reduce(tensors, name):
    if is_torch_tpu_available():
        import torch_xla.core.xla_model as xm

        if isinstance(tensors, (list, tuple)):
            return type(tensors)(nested_xla_mesh_reduce(t, f"{name}_{i}") for i, t in enumerate(tensors))
        return xm.mesh_reduce(name, tensors, torch.cat)
    else:
        raise ImportError("Torch xla must be installed to use `nested_xla_mesh_reduce`")
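The recursion above derives a distinct name per leaf (name_0, name_1_0, ...) because xm.mesh_reduce keys each cross-replica reduction by its name string. The naming scheme can be checked without TPU hardware via a hypothetical helper (not part of the library) that keeps the recursion but takes an arbitrary leaf function:

```python
def nested_apply(tensors, name, fn):
    """Apply fn(name, leaf) to every leaf, deriving names the same way as
    nested_xla_mesh_reduce: position i of a list/tuple gets f"{name}_{i}"."""
    if isinstance(tensors, (list, tuple)):
        return type(tensors)(
            nested_apply(t, f"{name}_{i}", fn) for i, t in enumerate(tensors)
        )
    return fn(name, tensors)

leaf_names = []
nested_apply((0, [1, 2]), "preds", lambda name, leaf: leaf_names.append(name) or leaf)
# leaf_names is now ["preds_0", "preds_1_0", "preds_1_1"]
```

Each leaf ends up with a unique, deterministic name, so reductions from different output slots never collide across replicas.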

Member:
Did you get a chance to test this on TPU?

sgugger (Collaborator, Author):
No, I was planning to ask you about it this morning.

@sgugger sgugger merged commit 492bb6a into master Sep 17, 2020
@sgugger sgugger deleted the trainer_multi_label branch September 17, 2020 12:15
Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020
* Trainer accepts multiple labels

* Missing import

* Fix docstrings