Skip to content

Conversation

@patil-suraj
Copy link
Contributor

This PR add ElectraForMultipleChoice. One of the missing models in this project.

Since, for multiple choice pooled outputs are needed, added ElectraPooler class.

@sgugger , @LysandreJik

@codecov
Copy link

codecov bot commented Jun 12, 2020

Codecov Report

Merging #4954 into master will increase coverage by 0.08%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4954      +/-   ##
==========================================
+ Coverage   77.24%   77.33%   +0.08%     
==========================================
  Files         133      133              
  Lines       22134    22166      +32     
==========================================
+ Hits        17097    17141      +44     
+ Misses       5037     5025      -12     
Impacted Files Coverage Δ
src/transformers/__init__.py 99.16% <ø> (ø)
src/transformers/modeling_auto.py 70.76% <ø> (ø)
src/transformers/configuration_electra.py 100.00% <100.00%> (ø)
src/transformers/modeling_electra.py 80.12% <100.00%> (+1.95%) ⬆️
src/transformers/modeling_utils.py 91.06% <0.00%> (+0.12%) ⬆️
src/transformers/trainer.py 39.62% <0.00%> (+0.23%) ⬆️
src/transformers/modeling_tf_utils.py 87.57% <0.00%> (+1.55%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update efeb75b...9cb71f9. Read the comment docs.

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR! I think there is a way to make it more consistent with the other models for multiple choice and avoid introducing a new class, would you mind looking into it?

return outputs # (loss), start_logits, end_logits, (hidden_states), (attentions)


class ElectraPooler(nn.Module):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing a new class, I'd rather use the existing SequenceSummary which can do the pool + dropout (see the XLNetForMultipleChoice for an example. It will require to add a few things to the config to make it work (see XLNetConfig for an example).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a way to make it more consistent with the other models for multiple choice and avoid introducing a new class, would you mind looking into it?

Sure, I wasn't aware of SequenceSummary. I'll try to use it instead of Pooler

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But couldn't you just use the ElectraClassificationHead for that pooling 🤔:

class ElectraClassificationHead(nn.Module):
"""Head for sentence-level classification tasks."""
def __init__(self, config):
super().__init__()
self.dense = nn.Linear(config.hidden_size, config.hidden_size)
self.dropout = nn.Dropout(config.hidden_dropout_prob)
self.out_proj = nn.Linear(config.hidden_size, config.num_labels)
def forward(self, features, **kwargs):
x = features[:, 0, :] # take <s> token (equiv. to [CLS])
x = self.dropout(x)
x = self.dense(x)
x = get_activation("gelu")(x) # although BERT uses tanh here, it seems Electra authors used gelu here
x = self.dropout(x)
x = self.out_proj(x)
return x

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly no, because it adds a linear layer to num_labels

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @LysandreJik it's good for your review.

@sgugger sgugger requested a review from LysandreJik June 12, 2020 18:23
Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @patil-suraj

@patil-suraj
Copy link
Contributor Author

hi @LysandreJik could you tell me why these tests are failing now ? Thanks.

@sgugger
Copy link
Collaborator

sgugger commented Jun 18, 2020

Looks like @LysandreJik deleted a function by mistake during the merge (the create_and_check_electra_for_multiple_choice), could you add it back?

@sgugger sgugger merged commit ca2d0f9 into huggingface:master Jun 18, 2020
@LysandreJik
Copy link
Member

My bad, sorry about that. Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants