This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

IOError: [Errno 32] Broken pipe in Windows version #10562

Closed
Feywell opened this issue Apr 16, 2018 · 11 comments

Comments

Feywell commented Apr 16, 2018

Description

Using mx.gluon.data.DataLoader, I run into this error:

File "E:\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
  execfile(filename, namespace)
File "E:\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
  exec(compile(scripttext, filename, 'exec'), glob, loc)
File "stn_gluon.py", line 147, in <module>
  train(epoch)
File "stn_gluon.py", line 108, in train
  for data, label in train_data:
File "E:\Anaconda2\lib\site-packages\mxnet\gluon\data\dataloader.py", line 284, in __iter__
  self._batchify_fn, self._batch_sampler)
File "E:\Anaconda2\lib\site-packages\mxnet\gluon\data\dataloader.py", line 144, in __init__
  worker.start()
File "E:\Anaconda2\lib\multiprocessing\process.py", line 130, in start
  self._popen = Popen(self)
File "E:\Anaconda2\lib\multiprocessing\forking.py", line 277, in __init__
  dump(process_obj, to_child, HIGHEST_PROTOCOL)
File "E:\Anaconda2\lib\multiprocessing\forking.py", line 199, in dump
  ForkingPickler(file, protocol).dump(obj)
File "E:\Anaconda2\lib\pickle.py", line 224, in dump
  self.save(obj)
File "E:\Anaconda2\lib\pickle.py", line 331, in save
  self.save_reduce(obj=obj, *rv)
File "E:\Anaconda2\lib\pickle.py", line 425, in save_reduce
  save(state)
File "E:\Anaconda2\lib\pickle.py", line 286, in save
  f(self, obj) # Call unbound method with explicit self
File "E:\Anaconda2\lib\pickle.py", line 655, in save_dict
  self._batch_setitems(obj.iteritems())
File "E:\Anaconda2\lib\pickle.py", line 687, in _batch_setitems
  save(v)
File "E:\Anaconda2\lib\pickle.py", line 286, in save
  f(self, obj) # Call unbound method with explicit self
File "E:\Anaconda2\lib\pickle.py", line 568, in save_tuple
  save(element)
File "E:\Anaconda2\lib\pickle.py", line 331, in save
  self.save_reduce(obj=obj, *rv)
File "E:\Anaconda2\lib\pickle.py", line 425, in save_reduce
  save(state)
File "E:\Anaconda2\lib\pickle.py", line 286, in save
  f(self, obj) # Call unbound method with explicit self
File "E:\Anaconda2\lib\pickle.py", line 655, in save_dict
  self._batch_setitems(obj.iteritems())
File "E:\Anaconda2\lib\pickle.py", line 687, in _batch_setitems
  save(v)
File "E:\Anaconda2\lib\pickle.py", line 331, in save
  self.save_reduce(obj=obj, *rv)
File "E:\Anaconda2\lib\pickle.py", line 425, in save_reduce
  save(state)
File "E:\Anaconda2\lib\pickle.py", line 286, in save
  f(self, obj) # Call unbound method with explicit self
File "E:\Anaconda2\lib\pickle.py", line 568, in save_tuple
  save(element)
File "E:\Anaconda2\lib\pickle.py", line 286, in save
  f(self, obj) # Call unbound method with explicit self
File "E:\Anaconda2\lib\pickle.py", line 492, in save_string
  self.write(BINSTRING + pack("<i", n) + obj)

IOError: [Errno 32] Broken pipe

Code is here:

```python
train_data = DataLoader(
    vision.datasets.MNIST(train=True,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.1307,), (0.3081,))
                          ])),
    batch_size=2, shuffle=True, num_workers=1)

test_data = DataLoader(
    vision.datasets.MNIST(train=False,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.1307,), (0.3081,))
                          ])),
    batch_size=2, shuffle=False, num_workers=1)
```

The error is thrown at:

for data, label in train_data:

Environment info (Required)

----------Python Info----------
('Version :', '2.7.13')
('Compiler :', 'MSC v.1500 64 bit (AMD64)')
('Build :', ('default', 'May 11 2017 13:17:26'))
('Arch :', ('64bit', 'WindowsPE'))
------------Pip Info-----------
('Version :', '9.0.1')
('Directory :', 'E:\Anaconda2\lib\site-packages\pip')

----------MXNet Info-----------
E:\Anaconda2\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
E:\Anaconda2\lib\site-packages\mxnet\optimizer.py:136: UserWarning: WARNING: New optimizer mxnet.optimizer.NAG is overriding existing optimizer mxnet.optimizer.NAG
Optimizer.opt_registry[name].name))
('Version :', '1.1.0')
('Directory :', 'E:\Anaconda2\lib\site-packages\mxnet')

Package used (Python/R/Scala/Julia):
Python 2.7

rajanksin (Contributor)

@cjolivier01 : Please label : Windows, Operator


wgting96 commented Jul 3, 2018

Same issue.

It seems that MXNet doesn't currently support the num_workers parameter on Windows; see the documentation here.

I set num_workers=0 and everything works well.

aaronmarkham (Contributor)

@Feywell Did this resolve the issue? Or should the docs be updated to talk about this limitation?

Ishitori (Contributor)

Yes, MXNet doesn't support num_workers on Windows, because fork() is not supported there.
I would say that we need to update the docs at least, if no one is willing to fix the issue.
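For background on why the pipe breaks: without fork(), Python multiprocessing on Windows uses the spawn start method, where each worker re-imports the main module and everything handed to a worker must be picklable. A minimal stdlib sketch of those constraints (no MXNet involved):

```python
import multiprocessing as mp

def square(x):
    # Worker targets must be top-level functions so the spawned child
    # process can locate them by name after re-importing the module.
    return x * x

if __name__ == "__main__":
    # "spawn" is the only start method available on Windows. Pool creation
    # must sit behind the __main__ guard, or each spawned child would
    # re-execute this code and recurse.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(square, range(5)))  # [0, 1, 4, 9, 16]
```

If the object being sent to a worker fails to pickle, the child exits early and the parent sees exactly this kind of `[Errno 32] Broken pipe` when it keeps writing to the dead pipe.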

Ishitori (Contributor)

The fix for the multiple-workers issue on Windows is here: #13686

aaronmarkham (Contributor)

Closing this since the fix is in.

antonmilev

I am on mxnet 1.5.0 and have this problem.

harplife

Windows 10 Enterprise v20H2, Python 3.6, CUDA 10, mxnet 1.7.0: I have this problem too. It does work with num_workers=0.

>>> train_dataloader = mx.gluon.data.DataLoader(data_train, batch_size=batch_size, num_workers=5)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\site-packages\mxnet\gluon\data\dataloader.py", line 620, in __init__
    initargs=[self._dataset, is_np_shape(), is_np_array()])
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\pool.py", line 174, in __init__
    self._repopulate_pool()
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\pool.py", line 239, in _repopulate_pool
    w.start()
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\zion\Anaconda3\envs\glu_env\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe


tcfkaj commented Aug 27, 2021

@aaronmarkham @Ishitori Also having this problem on 1.5, and others clearly still are on later versions. Can anyone confirm that this is actually fixed on >1.4? Should I create another issue, or will someone reopen this one, since the fix is clearly not in?

szha (Member) commented Aug 27, 2021

I'd recommend opening a new issue if it can be triggered on the latest 1.9 version.


tcfkaj commented Aug 27, 2021

@szha So, I found a workaround. It appears there have been similar issues with multiprocessing on Windows in PyTorch and PyMC3.

The DataLoader must be initialized inside an `if __name__ == "__main__":` guard, otherwise it fails. It also fails in .ipynb notebooks. Not sure whether this is fixed in later versions, but the workaround is sufficient for my needs and saves me from building a later version on Windows.
