Fix gpt trainer test #6915
Changes from 152 commits
don't change the default behavior
Why do we drop the last batch when evaluating with the validation dataset?
I have one use case where I have to drop the last batch because I am preparing a dataset that computes a contrastive loss. I need to make sure all batches have the same batch size.
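To illustrate the point above, here is a minimal, dependency-free sketch (not NeMo code; the function name `make_batches` is hypothetical) of why a short final batch is a problem for in-batch contrastive pairing, and how dropping it guarantees a uniform batch size:

```python
# Hypothetical sketch: why drop_last matters for a contrastive loss,
# where every batch must have the same number of samples so that
# in-batch negatives pair up consistently.
def make_batches(samples, batch_size, drop_last):
    """Split `samples` into batches; optionally drop a short final batch."""
    batches = [samples[i:i + batch_size]
               for i in range(0, len(samples), batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()  # a short batch would break the contrastive pairing
    return batches

samples = list(range(10))
print([len(b) for b in make_batches(samples, 4, drop_last=False)])  # [4, 4, 2]
print([len(b) for b in make_batches(samples, 4, drop_last=True)])   # [4, 4]
```

With `drop_last=False` the trailing batch of 2 has a different size from the others; with `drop_last=True` every batch that reaches the loss has exactly `batch_size` samples, at the cost of discarding the remainder.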
If we set drop_last=False, we can use pad_samples_to_global_batch_size=True, right? Or does this not fulfill your use case? https://github.com/NVIDIA/NeMo/blob/fix-gpt-trainer-test/nemo/collections/nlp/models/language_modeling/megatron_gpt_sft_model.py#L788
If we set the default to drop_last=True, then we may drop some samples during evaluation, which may produce incorrect results for comparison.
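A minimal sketch of the padding alternative discussed above (hypothetical helper, not the NeMo implementation linked in the comment): instead of dropping the short final batch, pad it up to the global batch size and return a validity mask so padded entries can be excluded from the evaluation metrics.

```python
# Hypothetical sketch: pad a short final batch to the global batch size
# and track which positions are real, so no evaluation samples are lost.
def pad_last_batch(batch, global_batch_size, pad_sample):
    """Pad `batch` with `pad_sample` up to `global_batch_size`.

    Returns the padded batch and a mask marking the real samples.
    """
    n_real = len(batch)
    n_pad = global_batch_size - n_real
    padded = batch + [pad_sample] * n_pad
    mask = [True] * n_real + [False] * n_pad
    return padded, mask

batch, mask = pad_last_batch([7, 8], global_batch_size=4, pad_sample=0)
print(batch)  # [7, 8, 0, 0]
print(mask)   # [True, True, False, False]
```

This keeps every real sample in the evaluation (all batches still have a uniform size), whereas drop_last=True silently discards the remainder, which is why the two defaults can give different validation numbers.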