Fix squared optimization steps bug #280
Conversation
Hello! This bug was discovered in #268 (comment), and a fix was proposed. It involves simply removing this line: Line 378 in f777c2c. When steps_per_epoch is not specified, the number of steps per epoch defaults to the number of batches in the dataloader. See the docs on the fit method for more info.
The remainder of the code would not need any changes.
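A minimal sketch of what the fix amounts to (the names `DummyLoader`, `train_dataloader`, and `num_epochs` are illustrative, not the actual setfit source): once the explicit steps_per_epoch argument is removed, each epoch simply iterates the dataloader once, so steps per epoch equals `len(train_dataloader)`.

```python
# Hedged sketch, not the actual setfit trainer code.
class DummyLoader:
    """Stand-in for a PyTorch DataLoader: len() is the number of batches."""
    def __init__(self, num_batches):
        self.num_batches = num_batches
    def __len__(self):
        return self.num_batches
    def __iter__(self):
        return iter(range(self.num_batches))

train_dataloader = DummyLoader(8)
num_epochs = 3

steps = 0
for _ in range(num_epochs):
    for _batch in train_dataloader:  # one full pass per epoch
        steps += 1

print(steps)  # 24 == len(train_dataloader) * num_epochs
```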
Hey @tomaarsen, I removed the argument as you mentioned. Beyond that, this version also fixes the fact that for the triplet losses
You're very right! Thanks for catching that. I'll merge this once the tests go green 🎉
src/setfit/trainer.py
logger.info(f"  Total optimization steps = {len(train_dataloader) * num_epochs}")
logger.info(f"  Total train batch size = {batch_size}")
warmup_steps = math.ceil(train_steps * self.warmup_proportion)
This still uses the train_steps variable that you removed.
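One way the warmup computation can avoid the removed variable (a hedged sketch with assumed names, not the actual setfit code) is to derive the total step count directly from the dataloader length and epoch count:

```python
import math

def compute_warmup_steps(dataloader_len, num_epochs, warmup_proportion):
    """Warmup steps as a proportion of the total optimization steps.

    dataloader_len: number of batches per epoch (len(train_dataloader)).
    """
    total_steps = dataloader_len * num_epochs
    return math.ceil(total_steps * warmup_proportion)

print(compute_warmup_steps(100, 5, 0.1))  # 50
```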
Wonderful! Thank you for this fix, and for the edits on the PR @twerkmeister!
Based on the bug fix from huggingface#280
Currently, when increasing the number of epochs, the number of steps per epoch is also increased, leading to a number of optimization steps that is roughly `(examples / batch size) * epochs * epochs` instead of `(examples / batch size) * epochs`. The proposed change should fix this.
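The squared scaling can be shown with a small arithmetic sketch (hypothetical function names, not the setfit API): if steps_per_epoch already has the epoch count baked in and is then multiplied by the epoch count again, the total grows with epochs squared.

```python
def total_steps_buggy(num_examples, batch_size, num_epochs):
    # Bug: the epoch count is baked into the per-epoch step count...
    steps_per_epoch = (num_examples // batch_size) * num_epochs
    # ...and then multiplied by the epoch count again.
    return steps_per_epoch * num_epochs

def total_steps_fixed(num_examples, batch_size, num_epochs):
    # Fix: one pass over the dataloader per epoch.
    steps_per_epoch = num_examples // batch_size
    return steps_per_epoch * num_epochs

print(total_steps_buggy(1000, 10, 5))  # 2500 == (1000/10) * 5 * 5
print(total_steps_fixed(1000, 10, 5))  # 500  == (1000/10) * 5
```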