Avoid accessing .dataset of a DataLoader in Trainer #16451
sgugger merged 16 commits into huggingface:main from trainer-better-length-check
Conversation
The documentation is not available anymore as the PR was closed or merged.
@sgugger this should be ready for review.
This implementation works for my particular case, producing the same output in training+evaluation as before, but without the really painful workarounds. I had a look at the tests and they look complicated, so I will add some after getting confirmation that this is otherwise OK.
sgugger
left a comment
Thanks for your PR! However, it tries to do too much at the same time. There is no reason to change the signature of the function get_train_dataloader, so it should be left as-is IMO. Even if it's a change we would like to implement, it should be done in its own separate PR.
Then there is a lot of code that could be refactored using the has_length function (improving it a tiny bit along the way).
Lastly, this PR breaks the current logging of the number of examples; this should be fixed before we can merge it.
src/transformers/trainer.py
Outdated
```python
len_dataloader = None
try:
    len_dataloader = len(train_dataloader)
except (NameError, TypeError):  # Default dataloader calls len(dataset), which may not exist
```
We have a function has_length that would simplify the code greatly here; we can add the NameError handling inside it.
Refactored as suggested, although has_length is a bit of a confusing name for "__len__ does not raise an exception".
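For reference, a minimal sketch of what such a helper might look like (the real has_length lives in the transformers codebase and may differ in details, e.g. the NameError case mentioned above):

```python
def has_length(obj) -> bool:
    """Return True if len(obj) can be taken without raising an exception."""
    try:
        # Some objects define __len__ but raise when it is called
        # (e.g. certain IterableDataset subclasses).
        return len(obj) is not None
    except TypeError:
        return False


print(has_length([1, 2, 3]))       # True: lists are sized
print(has_length(iter(range(3))))  # False: plain iterators have no len()
```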
```python
)

logger.info("***** Running training *****")
logger.info(f"  Num examples = {num_examples}")
```
The code will error here since you're not defining num_examples anymore.
num_examples was moved up inside the if statements that deal with the len/steps/size cases.
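Illustratively, that branching might look like this (a hypothetical sketch; the names mirror the Trainer code but the actual logic is more involved):

```python
import math


def plan_training(train_dataloader, max_steps, num_train_epochs, batch_size):
    """Compute num_examples in every branch so logging never sees an undefined name."""
    try:
        len_dataloader = len(train_dataloader)
    except TypeError:
        len_dataloader = None

    if len_dataloader is not None:
        # Sized case: derive everything from the dataloader length.
        num_examples = len_dataloader * batch_size
        total_steps = max_steps if max_steps > 0 else len_dataloader * math.ceil(num_train_epochs)
    else:
        # Unsized case: max_steps must drive training; estimate examples from it.
        if max_steps <= 0:
            raise ValueError("max_steps must be set when the dataloader has no length")
        total_steps = max_steps
        num_examples = max_steps * batch_size
    return num_examples, total_steps
```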
```python
    num_train_epochs = math.ceil(args.num_train_epochs)
    num_train_samples = len(self.train_dataset) * args.num_train_epochs
else:
    # see __init__. max_steps is set when the dataset has no __len__
```
Note that this comment was incorrect: max_steps would still be -1, which causes strange outputs. I have changed it to make it explicit that this must be set.
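An explicit guard along these lines would fail fast instead of silently training with the default (a hypothetical sketch, not the exact Trainer code):

```python
def require_max_steps(len_dataloader, max_steps):
    """Fail fast instead of silently training with max_steps == -1."""
    if len_dataloader is None and max_steps <= 0:
        raise ValueError(
            "The train dataloader has no length, so max_steps must be set "
            "to a positive value (it currently defaults to -1)."
        )
```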
sgugger
left a comment
Thanks for adapting, I added a few comments on the tests.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Thanks for implementing all the tweaks!
What does this PR do?
__len__ and __iter__) without additional requirements.

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.