Add a check_val_every_n_steps argument to Trainer.__init__ so that validation metrics are computed every n training steps.
Motivation
For many tasks, large models are trained for a fixed number of steps rather than complete epochs, especially pretrained models in CV and NLP. As a consequence, step-based arguments like max_steps and log_every_n_steps may be more convenient than epoch-based ones. However, the Trainer API only has a check_val_every_n_epoch argument for computing metrics on validation data. It would be very helpful to have an additional argument like check_val_every_n_steps in the Trainer constructor.
Another confusing concept is batch_idx in training_step, validation_step and test_step. A detailed example or illustration would help to explain this concept. In my experience, batch_idx is not widely used when developing models.
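To illustrate the batch_idx question, here is a minimal plain-Python sketch, not Lightning code. It assumes batch_idx is the position of a batch within the current epoch's dataloader, so it resets to 0 at the start of every epoch. The iterate helper and its global_step counter are hypothetical names for this illustration only; the counter simply counts batches and ignores details such as gradient accumulation.

```python
def iterate(num_epochs, batches_per_epoch):
    """Yield (epoch, batch_idx, global_step) as a simple training loop would count them."""
    global_step = 0  # hypothetical counter: total batches seen so far, never reset
    for epoch in range(num_epochs):
        # batch_idx restarts from 0 at the beginning of each epoch
        for batch_idx in range(batches_per_epoch):
            yield epoch, batch_idx, global_step
            global_step += 1


records = list(iterate(num_epochs=2, batches_per_epoch=3))
for epoch, batch_idx, global_step in records:
    print(f"epoch={epoch} batch_idx={batch_idx} global_step={global_step}")
```

Under these assumptions, the first batch of the second epoch has batch_idx 0 again, while the running step counter continues from where the first epoch ended.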
@rohitgr7 val_check_interval can't exceed a single epoch, so it does not support evaluating every n steps when n is larger than the number of batches in the dataloader.
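A rough sketch of the behaviour being requested, in plain Python rather than Lightning: validation is triggered by a running step count, so the interval can span several epochs. The functions should_validate and validation_points are hypothetical helpers for this illustration, and check_val_every_n_steps is the proposed argument, not an existing one.

```python
def should_validate(global_step, check_val_every_n_steps):
    """Return True when a validation run is due after this batch.

    global_step is the 0-based index of the batch across all epochs,
    so (global_step + 1) is the number of batches completed so far.
    """
    return (global_step + 1) % check_val_every_n_steps == 0


def validation_points(max_steps, batches_per_epoch, check_val_every_n_steps):
    """List (completed_steps, epoch, batch_idx) for every validation run."""
    points = []
    for global_step in range(max_steps):
        if should_validate(global_step, check_val_every_n_steps):
            # Recover the epoch and the in-epoch batch index from the global step
            epoch, batch_idx = divmod(global_step, batches_per_epoch)
            points.append((global_step + 1, epoch, batch_idx))
    return points


# 30 batches per epoch, validate every 100 steps: the interval is longer than one epoch
points = validation_points(max_steps=300, batches_per_epoch=30, check_val_every_n_steps=100)
print(points)
```

With 30 batches per epoch and an interval of 100 steps, each validation run falls in a different epoch and at a different batch_idx, which is the case an interval capped at one epoch cannot express.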
Pitch
Trainer.__init__(..., check_val_every_n_epoch=1, check_val_every_n_steps=100, ...)