Replies: 1 comment
-
This code seems to work:

    def dask_worker2():
        ddf = dd.read_csv(manifest_path)
        ddf = ddf.repartition(npartitions=4)

        def mff_wrapper(dfd):
            # Compute the delayed partition into a pandas DataFrame.
            df = dfd.compute()
            return df.smiles.apply(make_fingerprint_feature)

        # Submit one task per partition, then collect one result per partition.
        futures = client.map(mff_wrapper, ddf.to_delayed())
        results = client.gather(futures)
        return results

Is this a typical way to assign a partitioned dataframe to a distributed client?
-
Here are my attempts:
First I tried to do the above using `dask_worker1`, but realized this is an anti-pattern for dataframes with many rows. So I wrote another one, `dask_worker2`, but it complains. Is there any good way to use a NumPy array as the return type?