Add warning on argparse gpus flag #7236

Closed
edenlightning wants to merge 1 commit into master from edenlightning-patch-1

Conversation

edenlightning
Contributor

fixes #6228.

@edenlightning changed the title from "Add waring on argparse gpus flag" to "Add warning on argparse gpus flag" Apr 27, 2021
@codecov

codecov bot commented Apr 27, 2021

Codecov Report

Merging #7236 (efcd2d0) into master (f920ba2) will decrease coverage by 0%.
The diff coverage is n/a.

@@          Coverage Diff           @@
##           master   #7236   +/-   ##
======================================
- Coverage      91%     91%   -0%     
======================================
  Files         198     198           
  Lines       12723   12723           
======================================
- Hits        11632   11629    -3     
- Misses       1091    1094    +3     

@awaelchli
Contributor

awaelchli commented Apr 27, 2021

This does not address #6228. The argparser here in the example doesn't specify a type, so the string is forwarded to the Trainer as if you had set Trainer(gpus="3"). That selects the GPU with index 3, not three GPUs ([0, 1, 2]), and it behaves differently from our Lightning argparser. See this example:

import os
from argparse import ArgumentParser

import torch
from torch.utils.data import Dataset, DataLoader
from pytorch_lightning import LightningModule, Trainer


class RandomDataset(Dataset):

    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


def run():
    train_data = DataLoader(RandomDataset(32, 64), batch_size=2)
    val_data = DataLoader(RandomDataset(32, 64), batch_size=2)
    test_data = DataLoader(RandomDataset(32, 64), batch_size=2)

    parser = ArgumentParser()
    parser.add_argument("--gpus", default=None)
    args = parser.parse_args()

    model = BoringModel()
    trainer = Trainer(
        default_root_dir=os.getcwd(),
        num_sanity_val_steps=0,
        gpus=args.gpus,
        weights_summary=None,
    )
    trainer.fit(model, train_dataloader=train_data, val_dataloaders=val_data)
    trainer.test(model, test_dataloaders=test_data)


if __name__ == '__main__':
    run()

As I explained in the issue and on the linked PR, the problem is not with any of the argparser logic we have. The origin of the confusion is the argument in the Trainer itself when passing in a string: Trainer(gpus="3").
The confusion comes from the fact that Trainer(gpus="3") does not mean the same as Trainer(gpus=3).
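To illustrate the difference, here is a minimal sketch (the device lists assume the current, pre-1.5 parsing behaviour described above, on a machine with at least four GPUs):

from pytorch_lightning import Trainer

# int: "use this many GPUs"
Trainer(gpus=3)      # three GPUs -> device indices [0, 1, 2]

# str: parsed as a comma-separated list of device indices
Trainer(gpus="3")    # only the GPU with index 3
Trainer(gpus="0,2")  # device indices [0, 2]

# A plain argparse flag without type= always yields a str,
# so `--gpus 3` on the command line ends up in the second case.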

@tchaton
Copy link
Contributor

tchaton commented Apr 28, 2021

Hey @awaelchli,

Should we change this behaviour and convert "3" to map to [0, 1, 2] instead?
It would be a breaking change, but we could add a warning that this behaviour will be enforced in 1.5.

Your thoughts?
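For illustration, a rough sketch of what that warning-then-switch could look like (a hypothetical helper, not actual Lightning code):

import warnings

def parse_gpus_with_deprecation(gpus):
    # Hypothetical sketch of the proposal: keep today's meaning for now
    # (a plain digit string selects that single device index) but warn that
    # from 1.5 it would be treated as a count, i.e. "3" -> [0, 1, 2].
    if isinstance(gpus, str) and gpus.isdigit():
        warnings.warn(
            "gpus='3' currently selects device index 3; "
            "in 1.5 it will mean 'use 3 GPUs' ([0, 1, 2]).",
            DeprecationWarning,
        )
        return [int(gpus)]  # current behaviour
    if isinstance(gpus, str):
        return [int(i) for i in gpus.split(",") if i.strip()]
    return gpus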

@awaelchli
Contributor

awaelchli commented Apr 28, 2021

To fix the confusion, yes, that would be the best option IMO. It's what I was already doing in PR #6388 to resolve the issue, but I got pushback because it's a backward-incompatible change, so unfortunately we are not allowed to change it.

@tchaton
Contributor

tchaton commented May 4, 2021

Hey @awaelchli, closing this PR then :)

@tchaton tchaton closed this May 4, 2021
@Borda Borda deleted the edenlightning-patch-1 branch May 10, 2021 09:38
Development

Successfully merging this pull request may close these issues.

cli: Confused on (str, int, List[int]) variants for argparse for --gpus flag?