
Support spawn multiprocessing context #18

Closed
vwxyzjn opened this issue Jan 6, 2021 · 6 comments
vwxyzjn commented Jan 6, 2021

It appears the Queue only supports the "fork" start method for multiprocessing. When I use the "spawn" context, I get the following error:

Traceback (most recent call last):
  File "/home/costa/Documents/work/go/src/github.com/vwxyzjn/cleanrl/cleanrl/experiments/multiprocessing_cuda.py", line 23, in <module>
    for p in procs: p.start()
  File "/home/costa/anaconda3/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/costa/anaconda3/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/home/costa/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/costa/anaconda3/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/costa/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/home/costa/anaconda3/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
PicklingError: Can't pickle <class '__main__.c_ubyte_Array_2'>: attribute lookup c_ubyte_Array_2 on __main__ failed

Example script below:

import torch
from torch import multiprocessing as mp
from faster_fifo import Queue as FastQueue

def producer(data_q):
    # Push messages into the shared queue forever.
    while True:
        data = [1, 2, 3]
        data_q.put(data)

def learner(data_q):
    # Pop and print messages from the shared queue.
    while True:
        data = data_q.get()
        print(data)


if __name__ == '__main__':
    ctx = mp.get_context("spawn")
    data_q = FastQueue(1)
    procs = [
        ctx.Process(target=producer, args=(data_q,)) for _ in range(2)
    ]
    procs.append(ctx.Process(target=learner, args=(data_q,)))
    for p in procs: p.start()
    for p in procs: p.join()

I was wondering if there is a quick fix for this? Thanks.

@alex-petrenko (Owner)

Thank you for reporting. I haven't tried the 'spawn' context before.
Might be related to pytorch/pytorch#31571

@alex-petrenko (Owner)

I think I might have fixed it. It was just a weird issue with the pickler: one of the fields of the Queue object could not be pickled. I changed the code slightly, and it works now.

Please clone the latest version and install it manually following https://github.com/alex-petrenko/faster-fifo#manual-build-instructions
We'll update the pip package shortly.

@alex-petrenko (Owner)

@vwxyzjn I believe this should be fixed now in the latest version on pip.

@Xudong-Huang

I still get the following error when using multiprocessing:

Can't pickle <class '__main__.c_ubyte_Array_10'>: attribute lookup c_ubyte_Array_10 on __main__ failed

@alex-petrenko (Owner)

@Xudong-Huang sorry to hear that! Are you building yours from source or installing with pip?
@tushartk can you please take a look?

@alex-petrenko (Owner)

@Xudong-Huang sorry it took so long. I got to it and found what the problem was, and why I wasn't able to reproduce it. It's pretty stupid in hindsight, because the error is very telling. For some reason multiprocessing.ForkingPickler is not able to pickle the ubyte array that we use to receive messages. I wasn't able to reproduce this because that array is allocated only when we receive the first message. So an empty queue pickles just fine, but a queue that has received at least one message throws this error (see the sketch below).
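Here's a minimal sketch of that failure mode (pre-fix behavior; illustrative only, not a test from the repo, using the Queue's buffer-size constructor argument and put/get):

import pickle
from faster_fifo import Queue

q = Queue(max_size_bytes=1000)
pickle.dumps(q)  # fine: the receive buffer has not been allocated yet

q.put([1, 2, 3])
q.get()          # the first get() allocates the internal ctypes ubyte array
pickle.dumps(q)  # pre-fix: PicklingError on <class '__main__.c_ubyte_Array_N'>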

The solution was to set up custom pickle handlers to work around this issue. To make it work with the 'spawn' method, you need to import faster_fifo_reduction to install the custom pickler. This is available in pip version 1.2.0.
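For example, the repro script from this issue should work once the reduction module is imported (a sketch based on the description above; the queue size here is arbitrary, the original used FastQueue(1)):

from torch import multiprocessing as mp
from faster_fifo import Queue as FastQueue
import faster_fifo_reduction  # noqa: F401 -- side effect: registers the custom pickle handlers

def producer(data_q):
    while True:
        data_q.put([1, 2, 3])

def learner(data_q):
    while True:
        print(data_q.get())

if __name__ == '__main__':
    ctx = mp.get_context("spawn")
    data_q = FastQueue(1000)
    procs = [ctx.Process(target=producer, args=(data_q,)) for _ in range(2)]
    procs.append(ctx.Process(target=learner, args=(data_q,)))
    for p in procs: p.start()
    for p in procs: p.join()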

A later version could be further streamlined by wrapping the .pyx class in a Python module that installs these handlers automatically. It should be okay for now.
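For reference, the general pattern such handlers follow looks roughly like this. This is a toy illustration of the technique, not faster-fifo's actual code: the Mailbox class and helper names are made up.

import ctypes
from multiprocessing.reduction import ForkingPickler

class Mailbox:
    # Toy stand-in for an object holding an unpicklable ctypes buffer.
    def __init__(self, size):
        self.size = size
        # Dynamically created ctypes array types (e.g. c_ubyte_Array_10)
        # can't be pickled by reference -- exactly the error above.
        self.buffer = (ctypes.c_ubyte * size)()

def _rebuild_mailbox(size):
    return Mailbox(size)

def _reduce_mailbox(m):
    # Leave the ctypes buffer out of the pickled state;
    # the child process recreates it on unpickle.
    return _rebuild_mailbox, (m.size,)

# 'spawn' pickles Process objects with ForkingPickler, so register the handler there.
ForkingPickler.register(Mailbox, _reduce_mailbox)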

Thank you so much for helping debug this!

alex-petrenko pinned this issue May 18, 2021
kinman0224 pushed a commit to kinman0224/faster-fifo that referenced this issue Nov 8, 2021: "Refactored tests to make them work"