NumPy arrays serialize more slowly with cloudpickle than pickle #58

@mrocklin

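I would expect pickle and cloudpickle to behave pretty much identically here. Sadly, cloudpickle is about twice as slow: serializing a 100 MB uint8 array takes roughly 405 ms of wall time with cloudpickle versus 185 ms with pickle, and both produce output of the same size.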

In [1]: import numpy as np

In [2]: data = np.random.randint(0, 255, dtype='u1', size=100000000)

In [3]: import cloudpickle, pickle

In [4]: %time len(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))
CPU times: user 50.9 ms, sys: 135 ms, total: 186 ms
Wall time: 185 ms
Out[4]: 100000161

In [5]: %time len(cloudpickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))
CPU times: user 125 ms, sys: 280 ms, total: 404 ms
Wall time: 405 ms
Out[5]: 100000161
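
For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch of the same measurement as a standalone script. The helper names `time_dumps` and `compare` are hypothetical and made up for this example. The array size defaults to 10 MB rather than the 100 MB used above, and the script takes the best of a few runs, so the absolute numbers will differ from the session above and will vary by machine and library version.

```python
import pickle
import time

import numpy as np

try:
    import cloudpickle  # third-party; skipped below if not installed
except ImportError:
    cloudpickle = None


def time_dumps(dumps, obj, repeats=3):
    """Return (best wall time in seconds, payload size in bytes).

    Hypothetical helper: `dumps` is any callable with the
    pickle.dumps(obj, protocol=...) signature.
    """
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        payload = dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        best = min(best, time.perf_counter() - start)
        size = len(payload)
    return best, size


def compare(n_bytes=10_000_000):
    """Time pickle (and cloudpickle, if available) on a uint8 array."""
    # Same construction as in the report, with a smaller default size.
    data = np.random.randint(0, 255, dtype="u1", size=n_bytes)
    results = {"pickle": time_dumps(pickle.dumps, data)}
    if cloudpickle is not None:
        results["cloudpickle"] = time_dumps(cloudpickle.dumps, data)
    return results


if __name__ == "__main__":
    for name, (seconds, size) in compare().items():
        print(f"{name:12s} {seconds * 1e3:8.1f} ms  {size} bytes")
```

Calling `compare(n_bytes=100_000_000)` should match the array size used in the session above, assuming enough memory is available.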
