
Wrong log_dir when using SaveConfigCallback + Wandb Logger #11256

Closed

gau-nernst opened this issue Dec 25, 2021 · 3 comments
Labels
lightningcli pl.cli.LightningCLI logger: wandb Weights & Biases priority: 1 Medium priority task

Comments

@gau-nernst
Contributor

gau-nernst commented Dec 25, 2021

🐛 Bug

When I use Lightning CLI with Wandb Logger, SaveConfigCallback saves the config file to the root directory instead of the wandb logging directory wandb/run-xxxx-xxxx/files/.

|-- bug.py
|-- bug.yaml
|-- wandb/
   |-- run-xxxx-xxxx/
      |-- files/
         |-- ... → expect the config file to be saved here
|-- ... → but the config file is saved here

Without the Wandb Logger (defaulting to TensorBoard), the config is correctly saved under lightning_logs/version_xx/, so there seems to be a strange interaction between SaveConfigCallback and WandbLogger.
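One plausible explanation (an assumption on my part, not verified against the PL source): for non-TensorBoard loggers, trainer.log_dir falls back to the logger's save_dir, and WandbLogger leaves save_dir unset unless it is passed explicitly, so the trainer falls back to default_root_dir, i.e. the current working directory. A reduced, hypothetical sketch of that fallback chain (not PL's actual code):

```python
import os


def resolve_log_dir(logger_save_dir, default_root_dir):
    # Hypothetical reduction of the fallback: a non-TensorBoard logger
    # contributes its save_dir; if that is None, the trainer uses
    # default_root_dir, which defaults to the current working directory.
    return logger_save_dir or default_root_dir


# WandbLogger does not set save_dir unless told to, so the config lands in cwd:
print(resolve_log_dir(None, os.getcwd()))
# An explicit save_dir would take precedence:
print(resolve_log_dir("wandb_logs", os.getcwd()))
```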

To Reproduce

# bug.py
import torch
from torch.utils.data import DataLoader, Dataset

from pytorch_lightning import LightningModule
from pytorch_lightning.utilities.cli import LightningCLI


class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def train_dataloader(self):
        return DataLoader(RandomDataset(32, 64), batch_size=2)

    def val_dataloader(self):
        return DataLoader(RandomDataset(32, 64), batch_size=2)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


if __name__ == "__main__":
    cli = LightningCLI(BoringModel)
# bug.yaml
trainer:
  max_epochs: 10

  logger:
    - class_path: pytorch_lightning.loggers.WandbLogger
      init_args:
        project: bug
        log_model: true

Run from command line

python bug.py fit --config bug.yaml

Expected behavior

The config file should be saved under the wandb logging directory wandb/run-xxxx-xxxx/files/

Environment

  • PyTorch Lightning Version (e.g., 1.5.0): 1.5.5
  • PyTorch Version (e.g., 1.10): 1.10.0
  • Python version (e.g., 3.9): 3.8
  • OS (e.g., Linux): Ubuntu 18.04
  • CUDA/cuDNN version: 11.3
  • GPU models and configuration: 4x RTX 3090 (GPUs are not involved, so not relevant)
  • How you installed PyTorch (conda, pip, source): conda
  • If compiling from source, the output of torch.__config__.show():
  • Any other relevant information: wandb 0.12.9

Additional context

I saw issue #7679. It seems that the config used to be saved correctly to the wandb logging directory (hence the name clash reported there), but it is no longer saved there.

cc @tchaton @rohitgr7 @awaelchli @morganmcg1 @AyushExel @borisdayma @scottire @carmocca @mauvilsa

@gau-nernst gau-nernst added the bug Something isn't working label Dec 25, 2021
@akihironitta akihironitta added the lightningcli pl.cli.LightningCLI label Dec 26, 2021
@carmocca carmocca added the logger: wandb Weights & Biases label Dec 26, 2021
@morganmcg1

@borisdayma any idea on this?

@borisdayma
Contributor

I'm not sure it should be saved under that directory; it should maybe be handled separately, like we do with the models. That way there is a standard location where files are saved that is independent of wandb. It is also hard to make this reliable when multiple loggers are used.

We could log it either as an artifact or as a file when the wandb callback is active. I'm also not sure whether the config should be an input or an output artifact (use_artifact vs log_artifact).

Then we just have to think about how to handle it: within WandbCallback, or within SaveConfigCallback, which would then make a call to the logger (a bit like we do with ModelCheckpoint).

@mauvilsa
Contributor

Currently SaveConfigCallback saves the config to the trainer.log_dir directory. There is no logger-specific logic. From the description it seems that trainer.log_dir points to the root folder. I would expect the config to be saved in lightning_logs/version_xx/, not in the root and not in wandb/run-xxxx-xxxx/files/. There was a pull request to standardize log_dir for all loggers (#7543). I am not sure why it was closed, but it could be related to this.

Saving the config in a logger specific way should be a feature request. The bug here would be that the config is saved in the root instead of lightning_logs/version_xx/.

Only SaveConfigCallback is able to save the config because it is instantiated by the CLI with the required data. By the way, LightningCLI is designed so that a custom SaveConfigCallback class can be used. Just subclass it and save the config in any directory and/or save it as an artifact.

@tchaton tchaton added priority: 1 Medium priority task and removed bug Something isn't working labels Jan 4, 2022