I got the same error while training. I am using pytorch_lightning 1.3.8 and torch 1.9.0+cu102.
It seems to happen at random, so I am not sure how to reproduce it.
Here's the PyTorch Lightning error message:
```
Traceback (most recent call last):
  File "train.py", line 203, in <module>
    trainer.fit(model)
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 460, in fit
    self._run(model)
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 758, in _run
    self.dispatch()
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 799, in dispatch
    self.accelerator.start_training(self)
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
    self._results = trainer.run_stage()
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in run_stage
    return self.run_train()
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 909, in run_train
    self.training_type_plugin.reconciliate_processes(traceback.format_exc())
  File "/home/sid/miniconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 383, in reconciliate_processes
    torch.save(True, os.path.join(sync_dir, f"{self.global_rank}.pl"))
  File "/home/sid/miniconda3/lib/python3.8/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
```
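The last frame is the telling one: `sync_dir` is `None` when `reconciliate_processes` builds the checkpoint path, and `os.path.join` refuses a `None` first argument. A minimal sketch reproducing just that `TypeError` (standalone, not Lightning code):

```python
import os

sync_dir = None  # what reconciliate_processes ends up with when the sync dir was never set
os.path.join(sync_dir, "0.pl")
# TypeError: expected str, bytes or os.PathLike object, not NoneType
```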
Can we re-open this issue, or get some guidance on how to make it reproducible? I see it at random (a re-run does not reproduce it) with DDP on SLURM. Too many workers?
🐛 Bug
I am summarizing the source of the issue to speed up the fix.
After this line of code
https://github.com/PyTorchLightning/pytorch-lightning/blob/90929fa4333e5136020e9f9dcb7c1133e4c290f3/pytorch_lightning/accelerators/ddp_backend.py#L119
I have that `env_copy['PL_GLOBAL_SEED']` is `None`, and having an environment variable set to `None` breaks `subprocess.Popen` here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/90929fa4333e5136020e9f9dcb7c1133e4c290f3/pytorch_lightning/accelerators/ddp_backend.py#L127
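To illustrate that second point outside of Lightning (the variable name here just mirrors the description above): `subprocess.Popen` raises a `TypeError` as soon as any value in its `env` mapping is `None`:

```python
import os
import subprocess

env_copy = os.environ.copy()
env_copy["PL_GLOBAL_SEED"] = None  # the state described above

# Popen must encode every env value for the child process;
# a None value makes that encoding step raise a TypeError.
subprocess.Popen(["python", "-c", "pass"], env=env_copy)
```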
My fix at the moment is to add a guard against this (see the sketch below) after
https://github.com/PyTorchLightning/pytorch-lightning/blob/90929fa4333e5136020e9f9dcb7c1133e4c290f3/pytorch_lightning/accelerators/ddp_backend.py#L119
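The exact snippet is not shown above, so the following is only a plausible sketch of such a guard: drop the key when the seed was never exported, so the child environment never contains a `None`:

```python
# Hypothetical guard, reconstructed from the description above, not the
# author's verbatim fix: ensure PL_GLOBAL_SEED is never None in the
# environment handed to subprocess.Popen.
if env_copy.get("PL_GLOBAL_SEED") is None:
    env_copy.pop("PL_GLOBAL_SEED", None)
```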
Environment