Conversation

@Mehrad0711 (Contributor) commented Nov 15, 2020

The return_tensors default should be "pt" in BART's prepare_seq2seq_batch.

@thomwolf (Member)

Tokenizers are supposed to be framework (PyTorch/TensorFlow/FLAX) agnostic, so we probably don't want to go in that direction.

@Mehrad0711 (Contributor, Author)

Gotcha. Is this going to be the case for all tokenizers in the future? Currently they default to PyTorch except for BART's.
Also, I think the docstring for the BART tokenizer's return_tensors needs to be updated then, since it says: optional, defaults to "pt".

@LysandreJik (Member)

The fact that the BART-like tokenizers have return_tensors="pt" is a mistake. The tokenizers should be framework-agnostic.
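For context, a minimal sketch of the framework-agnostic pattern the maintainers describe: pass return_tensors explicitly instead of relying on a "pt" default. The checkpoint name and example texts are illustrative only, not from this thread, and prepare_seq2seq_batch is the seq2seq entry point of this era (around v3.x/v4.0).

```python
# Minimal sketch, assuming the transformers BART tokenizer of this era.
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")  # example checkpoint

# Request a framework explicitly rather than relying on a "pt" default;
# omit return_tensors (or pass "tf"/"np") when targeting other frameworks.
batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["Hello world."],
    tgt_texts=["Bonjour le monde."],
    return_tensors="pt",
)
```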

@LysandreJik (Member)

We will have to update this, which will be a breaking change, so we'll try to get it into v4.0.0. Do you want to open a PR to fix the issue?

@Mehrad0711 (Contributor, Author)

Sure. I'll close this then and make a new PR for that.

@Mehrad0711 closed this Nov 16, 2020
@LysandreJik (Member)

Hi @Mehrad0711! We're rushing to v4.0.0, and we don't want that in this release. I've taken the liberty of fixing it in #8599. Sorry about that; I hope you hadn't already started on it.

If you have, push your fixes and open a PR; I'll incorporate those changes into mine and mark you as co-author.

@Mehrad0711 (Contributor, Author)

No problem, @LysandreJik. Thanks for fixing it!
