Conversation

@Mehrad0711 (Contributor) commented Nov 15, 2020

The return_tensors default should be "pt" in BART's prepare_seq2seq_batch.

@thomwolf (Member)

Tokenizers are supposed to be framework (PyTorch/TensorFlow/FLAX) agnostic, so we probably don't want to go in that direction.

@Mehrad0711 (Contributor, Author)

Gotcha. Is this going to be the case for all tokenizers in the future? Currently they default to PyTorch except for BART's.
Also, I think the docstring for the BART tokenizer's return_tensors needs to be updated then, since it says: optional, defaults to "pt".

@LysandreJik (Member)

The fact that the BART-like tokenizers have return_tensors="pt" is a mistake. The tokenizers should be framework-agnostic.
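For context, a minimal sketch of the framework-agnostic pattern the maintainers describe: pass return_tensors explicitly instead of relying on a "pt" default. The checkpoint name and example texts are illustrative only, not from this thread, and prepare_seq2seq_batch is the seq2seq entry point of this era (around v3.x/v4.0).

```python
# Minimal sketch, assuming the transformers BART tokenizer of this era.
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")  # example checkpoint

# Request a framework explicitly rather than relying on a "pt" default;
# omit return_tensors (or pass "tf"/"np") when targeting other frameworks.
batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["Hello world."],
    tgt_texts=["Bonjour le monde."],
    return_tensors="pt",
)
```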

@LysandreJik (Member)

We will have to update this, which will be a breaking change, so we'll try to get it into v4.0.0. Do you want to open a PR to fix the issue?

@Mehrad0711 (Contributor, Author)

Sure. I'll close this then and make a new PR for that.

@Mehrad0711 closed this Nov 16, 2020
@LysandreJik (Member)

Hi @Mehrad0711! We're rushing to v4.0.0, and we don't want that in this release. I've taken the liberty of fixing it in #8599. Sorry about that; I hope you hadn't already started on it.

If you have, push your fixes and open a PR; I'll incorporate those changes into mine and mark you as co-author.

@Mehrad0711 (Contributor, Author)

No problem, @LysandreJik. Thanks for fixing it!
