Weird bug in TabularDataset.column_names #589
product-auto-label bot added the "api: aiplatform" label (Issues related to the AI Platform API.) on Aug 3, 2021
Ark-kun added a commit to Ark-kun/python-aiplatform that referenced this issue on Aug 3, 2021:
Fixes googleapis#589 The `end` parameter of the `blob.download_as_bytes` function is inclusive, not exclusive.
> There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
Ark-kun added a commit to Ark-kun/pipeline_components that referenced this issue on Aug 3, 2021:
…component There is a hack to work around the issue googleapis/python-aiplatform#589 that I fixed in googleapis/python-aiplatform#590
sasha-gitg pushed a commit that referenced this issue on Aug 4, 2021:
Fixes #589 The `end` parameter of the `blob.download_as_bytes` function is inclusive, not exclusive.
> There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
Co-authored-by: gcf-merge-on-green[bot] <60162190+gcf-merge-on-green[bot]@users.noreply.github.com>
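The commit message above attributes the bug to `end` being inclusive rather than exclusive. As a minimal sketch of how that kind of off-by-one could produce a doubled "W", assume the CSV header was fetched in consecutive byte ranges (an assumption; the library's actual reading code is not shown in this thread). `download_range` and `read_in_chunks` are hypothetical helpers that only mimic an inclusive-end ranged read on an in-memory byte string; they are not part of any Google library.

```python
def download_range(data: bytes, start: int, end: int) -> bytes:
    """Hypothetical stand-in for a ranged download whose `end` is INCLUSIVE."""
    return data[start : end + 1]


def read_in_chunks(data: bytes, chunk_size: int, treat_end_as_exclusive: bool) -> bytes:
    """Reassemble `data` from consecutive ranged reads of `chunk_size` bytes."""
    out = b""
    for start in range(0, len(data), chunk_size):
        if treat_end_as_exclusive:
            end = start + chunk_size      # bug: caller assumes `end` is exclusive
        else:
            end = start + chunk_size - 1  # correct for an inclusive `end`
        out += download_range(data, start, end)
    return out


# Two column names from the issue; the first name plus its comma is 20 bytes.
header = b"Wk_2016_42_Quantity,Wk_2016_43_Quantity"

buggy = read_in_chunks(header, chunk_size=20, treat_end_as_exclusive=True)
fixed = read_in_chunks(header, chunk_size=20, treat_end_as_exclusive=False)

print(buggy.decode().split(","))  # ['Wk_2016_42_Quantity', 'WWk_2016_43_Quantity']
print(fixed.decode().split(","))  # ['Wk_2016_42_Quantity', 'Wk_2016_43_Quantity']
```

In this sketch the first chunk returns one byte too many (the "W" at the chunk boundary), and the next chunk returns that same byte again, so only the column that straddles the boundary is corrupted.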
Ark-kun added a commit to Ark-kun/pipeline_components that referenced this issue on Oct 25, 2021:
…component There is a hack to work around the issue googleapis/python-aiplatform#589 that I fixed in googleapis/python-aiplatform#590
I ran into a very weird issue.
I imported the well-known Retail Stockout prediction dataset, in CSV format, into Vertex AI Datasets using the google.cloud.aiplatform.TabularDataset class of the Python library.
Most columns have the "Wk_" prefix. The screenshot shows that only one column contains "2016_43_Quantity": the "Wk_2016_43_Quantity" column, just like in the source CSV.
So far, everything is fine.
But here is the problem:
When I call the API to get the dataset metadata, including the column names, all column names are fine except one, which is reported as "WWk_2016_43_Quantity" (notice the double "W" in the "WWk_" prefix).
In context:
...
'Wk_2016_42_Quantity',
'WWk_2016_43_Quantity',
'Wk_2016_44_Quantity',
...
This discrepancy causes the subsequent model training to fail because the dataset does not have a WWk_2016_43_Quantity column (it has Wk_2016_43_Quantity instead).
I do not understand how this could have happened, but you can easily examine the imported dataset and see that what the UX shows and what the google-cloud-aiplatform library returns differ.
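One way to pin down a discrepancy like this is to compare the header row of the source CSV with the column names the library reports. The sketch below uses only the standard library; `header_columns` and `diff_columns` are hypothetical helper names, and the `reported` list is hard-coded from the excerpt above rather than fetched from the API (in practice it would presumably come from `TabularDataset.column_names`, the property named in the issue title).

```python
import csv
import io
from typing import List, Tuple


def header_columns(csv_text: str) -> List[str]:
    """Return the column names from the first row of a CSV document."""
    return next(csv.reader(io.StringIO(csv_text)))


def diff_columns(expected: List[str], reported: List[str]) -> List[Tuple[int, str, str]]:
    """List (index, expected, reported) for every position where the names differ."""
    return [
        (i, e, r)
        for i, (e, r) in enumerate(zip(expected, reported))
        if e != r
    ]


# Illustrative data only: three columns from the issue plus one made-up data row.
csv_text = "Wk_2016_42_Quantity,Wk_2016_43_Quantity,Wk_2016_44_Quantity\n1,2,3\n"
reported = ["Wk_2016_42_Quantity", "WWk_2016_43_Quantity", "Wk_2016_44_Quantity"]

mismatches = diff_columns(header_columns(csv_text), reported)
print(mismatches)  # [(1, 'Wk_2016_43_Quantity', 'WWk_2016_43_Quantity')]
```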
Environment details
google-cloud-aiplatform version: 1.1.1
Steps to reproduce
Code example