Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow val_check_interval to be larger than training dataset size #5413

Closed
cccntu opened this issue Jan 8, 2021 · 13 comments · Fixed by #11993
Closed

allow val_check_interval to be larger than training dataset size #5413

cccntu opened this issue Jan 8, 2021 · 13 comments · Fixed by #11993
Assignees
Labels
feature Is an improvement or enhancement help wanted Open to be worked on priority: 1 Medium priority task priority: 2 Low priority task
Milestone

Comments

@cccntu
Copy link
Contributor

cccntu commented Jan 8, 2021

🚀 Feature

allow val_check_interval to be larger than the number of the training batches in one epoch

Motivation

I am using a small datasets, so instead of specifying max_epochs in Trainer, I want to use max_steps and evaluate every val_check_interval , but when val_check_interval is larger than number of batches in training set, there is an error, like this:

`val_check_interval` (100) must be less than or equal to the number of the training batches (24). If you want to disable validation set `limit_val_batches` to 0.0 instead.

Pitch

val_check_interval should't be limited by the number of the training batches in one epoch, it should be a warning, not an error

Alternatives

I am currently using a wrapper to make it an iterable dataset, so it allows me to do that

Additional context

@cccntu cccntu added feature Is an improvement or enhancement help wanted Open to be worked on labels Jan 8, 2021
@rohitgr7
Copy link
Contributor

rohitgr7 commented Jan 8, 2021

#4409 similar

@tchaton tchaton added the priority: 0 High priority task label Jan 8, 2021
@gan3sh500
Copy link
Contributor

@rohitgr7 @tchaton Could I work on this?

@rohitgr7
Copy link
Contributor

@gan3sh500 yeah go ahead.

@gan3sh500
Copy link
Contributor

@cccntu Did you try check_val_every_n_epoch for this purpose.

@rohitgr7 This issue can be fixed by removing exceptions to warning and checking trainer.total_batch_idx when val_check_batch greater than trainer.num_training_batches. Shall I open a PR?

It seems a bit redundant with check_val_every_n_epoch. Or shall we add float support for check_val_every_n_epoch?

@rohitgr7
Copy link
Contributor

rohitgr7 commented Jan 18, 2021

yeah I guess just deprecate check_val_every_n_epoch and let val_check_interval handle it using float values (1.0, 2.0, 3.0...). Also use release/1.2-dev as the base to avoid conflicts.

@gan3sh500
Copy link
Contributor

@rohitgr7 Sounds good. It'll be a bit cleaner this way. I'll make the changes for this and PR.

@edenlightning
Copy link
Contributor

@gan3sh500 any update on this?

@turian
Copy link
Contributor

turian commented Mar 6, 2021

Related to this discussion: #6253

@cccntu
Copy link
Contributor Author

cccntu commented Mar 7, 2021

@cccntu Did you try check_val_every_n_epoch for this purpose.
It seems a bit redundant with check_val_every_n_epoch. Or shall we add float support for check_val_every_n_epoch?

Sorry for the late reply. I use val_check_interval=n to evaluate every n steps, using check_val_every_n_epoch=n would be evaluating every n epochs.

Using together with max_steps=m, I can make sure the number of training steps don't change if the dataset size changes.

@kaushikb11
Copy link
Contributor

Not exactly Priority P0. Moving it to P1

@kaushikb11 kaushikb11 added priority: 1 Medium priority task and removed priority: 0 High priority task labels Apr 1, 2021
@edenlightning edenlightning modified the milestones: v1.3, v1.4 Apr 27, 2021
@Rizhiy
Copy link

Rizhiy commented May 19, 2021

Any updates on this? I really don't like training using epochs, so this feature would be quite useful for me.

@edenlightning
Copy link
Contributor

@kaushikb11 what is left TODO here?

@edenlightning edenlightning removed this from the v1.4 milestone Jun 30, 2021
@kaushikb11 kaushikb11 added the priority: 2 Low priority task label Jul 6, 2021
@shivammehta25
Copy link
Contributor

Is this change merged? if yes then in which version? It is really useful, I am not sure why it is marked P2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement help wanted Open to be worked on priority: 1 Medium priority task priority: 2 Low priority task
Projects
None yet
10 participants