What is the role/responsibility of hpc_load & hpc_save?
❓ Questions and Help
What is your question?
What is the role/responsibility of `hpc_load` & `hpc_save`? Is it the same as that of `restore`?
Motivation
pl has two internal ways of saving/loading: the `restore` way and the `hpc_load`/`hpc_save` way. They do similar dumping/loading, but have different checkpoint selection mechanisms.
If these two methods have the same role/responsibility in the sense of dumping/loading, we can refactor them to share common dump/load code.
There is already some disparity between them (#1947), which could be a potential source of bugs.
The motivation for this question is to understand their roles/responsibilities before refactoring.
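To make the proposed refactoring concrete, here is a minimal sketch of what "common dump/loading code with different checkpoint selection" could look like. It is not the actual Lightning implementation: `dump_checkpoint`, `load_checkpoint`, `select_explicit`, `select_latest_hpc`, the JSON serialization, and the `hpc_ckpt_<n>.ckpt` file naming are all assumptions made for illustration, and the `restore`/`hpc_load`/`hpc_save` functions below only mimic the roles described above.

```python
import json
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme for HPC checkpoints (illustrative only).
HPC_PATTERN = re.compile(r"hpc_ckpt_(\d+)\.ckpt")


def dump_checkpoint(state: dict, path: Path) -> None:
    """Shared dump step (hypothetical): serialize `state` to `path`."""
    path.write_text(json.dumps(state))


def load_checkpoint(path: Path) -> dict:
    """Shared load step (hypothetical): read `state` back from `path`."""
    return json.loads(path.read_text())


def select_explicit(path: Optional[Path]) -> Optional[Path]:
    """restore-style selection: use the path the caller gave, if it exists."""
    return path if path is not None and path.exists() else None


def select_latest_hpc(folder: Path) -> Optional[Path]:
    """hpc-style selection: pick the highest-numbered hpc_ckpt_<n>.ckpt."""
    best, best_n = None, -1
    for candidate in folder.glob("hpc_ckpt_*.ckpt"):
        match = HPC_PATTERN.fullmatch(candidate.name)
        if match and int(match.group(1)) > best_n:
            best, best_n = candidate, int(match.group(1))
    return best


def restore(path: Optional[Path]) -> Optional[dict]:
    """Selection differs, loading is shared."""
    selected = select_explicit(path)
    return load_checkpoint(selected) if selected else None


def hpc_load(folder: Path) -> Optional[dict]:
    """Selection differs, loading is shared."""
    selected = select_latest_hpc(folder)
    return load_checkpoint(selected) if selected else None


def hpc_save(state: dict, folder: Path) -> Path:
    """Write the next numbered HPC checkpoint through the shared dump step."""
    latest = select_latest_hpc(folder)
    next_n = 1 if latest is None else int(HPC_PATTERN.fullmatch(latest.name).group(1)) + 1
    path = folder / f"hpc_ckpt_{next_n}.ckpt"
    dump_checkpoint(state, path)
    return path
```

With this split, the only place the two paths differ is the `select_*` function, so a disparity like the one in #1947 would have to show up in one shared dump/load routine rather than in two separate ones.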
What have you tried?
Surveyed the public documentation and the internal code.
`hpc_save`
No public API (search result in docs); only used internally in `SLURMConnector` (search result in repo): https://github.com/PyTorchLightning/pytorch-lightning/blob/66e58f5afb6ae8702b29ada52f7b022bbf201f9e/pytorch_lightning/trainer/connectors/slurm_connector.py#L88
`hpc_load`
No public API (search result in docs); only used internally in `CheckpointConnector.hpc_load` (search result in repo): https://github.com/PyTorchLightning/pytorch-lightning/blob/3abfec896212ea85e45d6ac3ccb323ef242d16de/pytorch_lightning/trainer/connectors/checkpoint_connector.py#L202