generate_statistics_from_csv is very slow for a large dataset on a single server #98
Comments
Another option is to try using generate_statistics_from_dataframe if you can load your dataset as a pandas dataframe.
|
Hi Yajunwang,
When executing your pipeline locally, the default values for the properties
in PipelineOptions are generally sufficient, and the direct runner executes
on a single machine.
https://cloud.google.com/dataflow/docs/guides/specifying-exec-params
…On Sat, Jan 4, 2020, 04:18 Paul Suganthan ***@***.***> wrote:
Another option is to try using generate_statistics_from_dataframe if you
can load your dataset as a pandas dataframe.
import tensorflow_data_validation as tfdv
import pandas as pd
CSV_FILE_PATH = ''
df = pd.read_csv(CSV_FILE_PATH)
stats = tfdv.generate_statistics_from_dataframe(df)
|
This option does not seem to work. Please refer to this gist: https://gist.github.com/yajunwong/f317c565f375125fd3ec2963967ba164 |
I tried this API, but it reports an error. Please refer to this comment: #98 (comment) |
Hi. Following the TFX examples, I pass pipeline_options to generate_statistics_from_csv, setting --direct_num_workers=16.
This option does not seem to speed up the API: when I set direct_num_workers=1, the elapsed time is the same as with 16 workers.
Could someone help me?
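For readers trying to reproduce the comparison, here is a minimal sketch of how the worker count might be passed and timed. It is not the reporter's original snippet. The helper names direct_runner_flags and timed_csv_stats are hypothetical, and the --direct_running_mode flag is an assumption: newer Apache Beam releases accept it (in_memory, multi_threading, multi_processing), while older ones may only know --direct_num_workers, in which case extra workers may not run in parallel at all.

```python
import time


def direct_runner_flags(num_workers, running_mode="multi_processing"):
    """Build Beam DirectRunner flags as a list of strings (hypothetical helper).

    Assumption: the installed Apache Beam release supports
    --direct_running_mode; drop that flag on releases that reject it.
    """
    return [
        "--runner=DirectRunner",
        f"--direct_num_workers={num_workers}",
        f"--direct_running_mode={running_mode}",
    ]


def timed_csv_stats(csv_pattern, num_workers):
    """Run generate_statistics_from_csv once and return (stats, seconds)."""
    # Imported lazily so the flag helper above works without TFDV installed.
    import tensorflow_data_validation as tfdv
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(direct_runner_flags(num_workers))
    start = time.perf_counter()
    stats = tfdv.generate_statistics_from_csv(
        data_location=csv_pattern,
        pipeline_options=options,
    )
    return stats, time.perf_counter() - start
```

Calling timed_csv_stats on the same file with 1 and then 16 workers gives two wall-clock numbers to compare. If they match, that suggests the flags are not producing parallel execution in the installed versions, though other bottlenecks are possible.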