Role of internal hpc_load & hpc_save #4381

Closed
tarepan opened this issue Oct 26, 2020 · 1 comment
Labels
question Further information is requested

tarepan commented Oct 26, 2020

❓ Questions and Help

What is your question?

What is the role/responsibility of hpc_load & hpc_save?
Is it the same as that of restore?

Motivation

pl has two internal ways of saving/loading: the restore way and the hpc_load/hpc_save way.
They do similar dumping/loading but have different checkpoint selection mechanisms.
If these two methods have the same role/responsibility in the sense of dumping/loading, we can refactor them to share common dump/load code.
There is already some disparity between them (#1947), which could be a potential source of bugs.
The motivation for this question is to understand their roles/responsibilities before refactoring.
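
To make the refactoring idea concrete, here is a minimal sketch of what "common dump/load code with different checkpoint selection" could look like. It is not Lightning's actual implementation: every function name, the JSON serialization, the `hpc_ckpt_<n>.ckpt` naming pattern, and the rule that an HPC checkpoint takes priority over an explicit path are assumptions made for illustration only.

```python
import json
import re
from pathlib import Path
from typing import Optional


def dump_checkpoint(state: dict, path: Path) -> None:
    """Shared dump step (hypothetical): serialize the state to `path`."""
    path.write_text(json.dumps(state))


def load_checkpoint(path: Path) -> dict:
    """Shared load step (hypothetical): read the state back from `path`."""
    return json.loads(path.read_text())


def select_explicit(path: Optional[str]) -> Optional[Path]:
    """Restore-style selection: the caller names the checkpoint file."""
    if path is None:
        return None
    candidate = Path(path)
    return candidate if candidate.is_file() else None


def select_latest_hpc(folder: Path) -> Optional[Path]:
    """HPC-style selection: pick the highest-numbered hpc_ckpt_<n>.ckpt.

    The file naming pattern is an assumption of this sketch.
    """
    pattern = re.compile(r"hpc_ckpt_(\d+)\.ckpt$")
    numbered = []
    for file in folder.glob("hpc_ckpt_*.ckpt"):
        match = pattern.match(file.name)
        if match:
            numbered.append((int(match.group(1)), file))
    if not numbered:
        return None
    # Compare numerically so that 10 beats 9 (string order would not).
    return max(numbered, key=lambda item: item[0])[1]


def restore(folder: Path, explicit_path: Optional[str] = None) -> Optional[dict]:
    """Only the selection differs; both routes share load_checkpoint.

    Trying the HPC checkpoint first is an assumption of this sketch.
    """
    path = select_latest_hpc(folder)
    if path is None:
        path = select_explicit(explicit_path)
    return None if path is None else load_checkpoint(path)
```

With this split, a disparity like the one mentioned above can only arise in the selection functions, because both routes go through the same `load_checkpoint`.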

What have you tried?

Surveyed the public documentation and the internal code.

hpc_save

Not part of the public API (search result in docs); only used in the internal SLURMConnector (search result in repo):
https://github.com/PyTorchLightning/pytorch-lightning/blob/66e58f5afb6ae8702b29ada52f7b022bbf201f9e/pytorch_lightning/trainer/connectors/slurm_connector.py#L88

hpc_load

Not part of the public API (search result in docs); only used in the internal CheckpointConnector.hpc_load (search result in repo):
https://github.com/PyTorchLightning/pytorch-lightning/blob/3abfec896212ea85e45d6ac3ccb323ef242d16de/pytorch_lightning/trainer/connectors/checkpoint_connector.py#L202

@tarepan tarepan added the question Further information is requested label Oct 26, 2020
@tarepan tarepan closed this as completed Nov 12, 2020
tarepan commented Jan 6, 2021

There is now no disparity, and almost all code is common between hpc_load & restore. A common restore method is proposed in #5370.
