WandB logger and LightningCLI save_config_callback clashes #7679

Closed
oplatek opened this issue May 24, 2021 · 2 comments · Fixed by #7741

oplatek commented May 24, 2021

🐛 Bug: WandB logger and LightningCLI save_config_callback clashes

My fork documents a clash between the default filename used by LightningCLI.save_config_callback
and the WandB logger, reproduced on the basic PL autoencoder example.

See run-example-wandb-save-config-callback-clash-workaournd.sh, autoencoder.{py,yml}
I used the main environment.yml to document the dependencies.

The example below reproduces the error using the default LightningCLI and the WandB logger.
However, if you change a single line in basic_examples/autoencoder.yml:

# save_config_callback_filename: ''   # Keeps the LightningCLI intact - do not apply workaround - reproduce the error
save_config_callback_filename: 'another-name-config.yaml'   # sets different filename in save_config_callback

It runs fine.

If you do not change anything you will be able to reproduce the error.

$ ./run-example-wandb-save-config-callback-clash-workaournd.sh   # inside the pytorch-lightning/pl_examples directory
...
Traceback (most recent call last):
  File "basic_examples/autoencoder.py", line 139, in <module>
    cli_main()
  File "basic_examples/autoencoder.py", line 132, in cli_main
    cli = WandBandSafeConfigCallBackFixCLI(LitAutoEncoder, MyDataModule, seed_everything_default=1234)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 173, in __init__
    self.fit()
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 256, in fit
    self.trainer.fit(**self.fit_kwargs)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit
    self._run(model)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 756, in _run
    self.dispatch()
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 797, in dispatch
    self.accelerator.start_training(self)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
    self._results = trainer.run_stage()
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 807, in run_stage
    return self.run_train()
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 855, in run_train
    self.train_loop.on_train_start()
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 101, in on_train_start
    self.trainer.call_hook("on_train_start")
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1223, in call_hook
    trainer_hook(*args, **kwargs)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/trainer/callback_hook.py", line 152, in on_train_start
    callback.on_train_start(self, self.lightning_module)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/pytorch_lightning/utilities/cli.py", line 87, in on_train_start
    self.parser.save(self.config, config_path, skip_none=False)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/jsonargparse/core.py", line 811, in save
    check_overwrite(path_fc)
  File "/lnet/work/people/oplatek/pytorch-lightning/env/lib/python3.8/site-packages/jsonargparse/core.py", line 808, in check_overwrite
    raise ValueError('Refusing to overwrite existing file: '+path())
ValueError: Refusing to overwrite existing file: /lnet/work/people/oplatek/pytorch-lightning/pl_examples/wandb/run-20210524_145129-1rtepmq4/files/config.yaml

wandb: Waiting for W&B process to finish, PID 8111
wandb: Program failed with code 1.  Press ctrl-c to abort syncing.
wandb:                                                                                
wandb: Find user logs for this run at: /lnet/work/people/oplatek/pytorch-lightning/pl_examples/wandb/run-20210524_145129-1rtepmq4/logs/debug.log
wandb: Find internal logs for this run at: /lnet/work/people/oplatek/pytorch-lightning/pl_examples/wandb/run-20210524_145129-1rtepmq4/logs/debug-internal.log
wandb: Synced 7 W&B file(s), 0 media file(s), 0 artifact file(s) and 1 other file(s)
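
The traceback suggests the failure mode: by the time the callback saves its config, the run's files directory appears to already contain a config.yaml, and the save refuses to overwrite it. Below is a minimal stdlib-only sketch of that behaviour. `save_config` and `demo` are hypothetical stand-ins for illustration, not the jsonargparse or Lightning code, and the file contents are made up.

```python
import os
import tempfile


def save_config(text, log_dir, filename="config.yaml"):
    """Hypothetical stand-in for a save that refuses to overwrite a file."""
    path = os.path.join(log_dir, filename)
    if os.path.exists(path):
        raise ValueError("Refusing to overwrite existing file: " + path)
    with open(path, "w") as f:
        f.write(text)
    return path


def demo():
    with tempfile.TemporaryDirectory() as run_dir:
        # Assumption: the logger has already written its own config.yaml here.
        with open(os.path.join(run_dir, "config.yaml"), "w") as f:
            f.write("placeholder: logger-config\n")

        # Saving under the default name clashes with the existing file.
        try:
            save_config("seed_everything: 1234\n", run_dir)
            clashed = False
        except ValueError:
            clashed = True

        # Saving under a different name (the workaround) succeeds.
        renamed = save_config(
            "seed_everything: 1234\n", run_dir, filename="another-name-config.yaml"
        )
        return clashed, os.path.basename(renamed)
```

Calling `demo()` exercises both paths: the default filename hits the overwrite check, while the renamed file is written next to the existing one.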

Expected behavior

It would be useful to at least pass the save_config_callback filename argument to the LightningCLI constructor so a user could actually change the filename when creating a LightningCLI object.
The instantiation of SaveConfigCallback needs to be fixed.
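
A rough sketch of the requested change, assuming the CLI constructor simply forwards a filename to the callback it instantiates. `SketchSaveConfigCallback`, `SketchCLI`, and the `save_config_filename` parameter are hypothetical names used only for illustration; they are not the Lightning classes or their signatures.

```python
class SketchSaveConfigCallback:
    """Hypothetical callback that remembers which filename to save under."""

    def __init__(self, config, config_filename="config.yaml"):
        self.config = config
        self.config_filename = config_filename


class SketchCLI:
    """Hypothetical CLI that exposes the callback's filename in its constructor."""

    def __init__(
        self,
        save_config_callback=SketchSaveConfigCallback,
        save_config_filename="config.yaml",
    ):
        self.config = {"seed_everything": 1234}
        # Forward the user-supplied filename instead of relying on the
        # callback's hard-coded default.
        self.callback = save_config_callback(
            self.config, config_filename=save_config_filename
        )
```

With this shape, `SketchCLI(save_config_filename="another-name-config.yaml")` would avoid the default name without subclassing the CLI.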

Environment

Please see the environment.yml

oplatek added the bug (Something isn't working) and help wanted (Open to be worked on) labels May 24, 2021
carmocca added the argparse (removed) (Related to argument parsing (argparse, Hydra, ...)) label May 24, 2021
carmocca added this to the v1.3.x milestone May 24, 2021
@mauvilsa

> It would be useful to at least pass the save_config_callback filename argument to the LightningCLI constructor so a user could actually change the filename when creating a LightningCLI object.

I agree. This is a simple change. I can create a pull request, unless you @oplatek want to contribute this.

oplatek commented May 27, 2021

I am quite busy these days ... I would be happy if you do it @mauvilsa. Thank you!
