-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to change "=" symbol in ModelCheckpoint filenames #17999
Add option to change "=" symbol in ModelCheckpoint filenames #17999
Conversation
Add attribute CHECKPOINT_EQUALS_CHAR to class ModelCheckpoint, which determines what character to use between the <key, value> pairs in the filename of the checkpoints. Before it was hard-coded to "=". Now it defaults to "=" but can be changed. Also fix a mistake in the documentation of that same class.
I had changed the formatting of the whole file by mistake, now it is the same as it was before my change
…ub.com/giorgioskij/lightning into feature/17825_checkpoint_equals_symbol
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition!
We would need a test too. I suggest to extend the existing test_model_checkpoint_format_checkpoint_name
in test_model_checkpoint.py.
Co-authored-by: Adrian Wälchli <[email protected]>
…ub.com/giorgioskij/lightning into feature/17825_checkpoint_equals_symbol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
What does this PR do?
Part of #17825
This PR adds a feature, as discussed in issue #17825, to change the default "=" symbol between the name of a field and its value in the filenames generated by the ModelCheckpoint class. There was already an option to change the separator symbol in the filenames, so this follows the same exact implementation style.
Example:
now this trainer will save its checkpoints as:
epoch:1-step:1234.ckpt
instead ofepoch=1-step=1234.ckpt
While adding the new attribute to the docstring of the class ModelCheckpoint, we noticed a discrepancy between code and docstring. The docs say:
However, the correct way to change that attributes at the moment is by using the class name, and not the instance name, so:
ModelCheckpoint.CHECKPOINT_NAME_LAST = {epoch}-last
In issue #17825 we decided to keep this as is, and make a new pr to change the code to behave how it is currently documented.
I also updated CHANGELOG.md to mention this new attribute, including the link to this PR.
Thanks a lot for your time reviewing this!
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Reviewer checklist