
Wildcard path(s) for the GCS tabular dataset are not working #957

Closed
glebrh opened this issue Jan 17, 2022 · 3 comments · Fixed by #1220
Labels: aiplatform (Issues related to the AI Platform (Unified) API), api: vertex-ai (Issues related to the googleapis/python-aiplatform API), priority: p2 (Moderately-important priority; fix may not be included in next release), type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns)

Comments


glebrh commented Jan 17, 2022

Dear team,

It seems that, despite being mentioned in the docstring and documentation of TabularDataset, wildcard paths for tabular GCS datasets are not supported. This affects both metadata parsing (e.g. viewing column names) and model training with such a dataset.

See the examples below.

Environment details

  • OS: Windows 10
  • Python version: 3.8.12
  • google-cloud-aiplatform version: 1.8.1

Steps to reproduce

  1. Create a dataset with a wildcard path pointing to a GCS bucket containing CSV files
  2. Try to view the column names of that dataset
  3. Try to run a custom training job using the dataset

Code example

# 1. Create a dataset with a wildcard GCS path
ds = aiplatform.TabularDataset.create(
    display_name="tabular-dataset-test",
    gcs_source='gs://<bucketname>/<path-to-files>/*',
    sync=True,
)

# 2. Viewing column names fails (metadata parsing)
print(ds.column_names)

# 3. Training with the dataset also fails
job = aiplatform.CustomTrainingJob(...)
model = job.run(ds, sync=True)

Stack trace

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\training_jobs.py in run(self, dataset, annotation_schema_uri, model_display_name, model_labels, base_output_dir, service_account, network, bigquery_destination, args, environment_variables, replica_count, machine_type, accelerator_type, accelerator_count, boot_disk_type, boot_disk_size_gb, reduction_server_replica_count, reduction_server_machine_type, reduction_server_container_uri, training_fraction_split, validation_fraction_split, test_fraction_split, training_filter_split, validation_filter_split, test_filter_split, predefined_split_column_name, timestamp_split_column_name, enable_web_access, tensorboard, sync)
   2066         )
   2067 
-> 2068         return self._run(
   2069             python_packager=python_packager,
   2070             dataset=dataset,

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\base.py in wrapper(*args, **kwargs)
    673                 if self:
    674                     VertexAiResourceNounWithFutureManager.wait(self)
--> 675                 return method(*args, **kwargs)
    676 
    677             # callbacks to call within the Future (in same Thread)

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\training_jobs.py in _run(self, python_packager, dataset, annotation_schema_uri, worker_pool_specs, managed_model, args, environment_variables, base_output_dir, service_account, network, bigquery_destination, training_fraction_split, validation_fraction_split, test_fraction_split, training_filter_split, validation_filter_split, test_filter_split, predefined_split_column_name, timestamp_split_column_name, enable_web_access, tensorboard, reduction_server_container_uri, sync)
   2319         )
   2320 
-> 2321         model = self._run_job(
   2322             training_task_definition=schema.training_job.definition.custom_task,
   2323             training_task_inputs=training_task_inputs,

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\training_jobs.py in _run_job(self, training_task_definition, training_task_inputs, dataset, training_fraction_split, validation_fraction_split, test_fraction_split, training_filter_split, validation_filter_split, test_filter_split, predefined_split_column_name, timestamp_split_column_name, annotation_schema_uri, model, gcs_destination_uri_prefix, bigquery_destination)
    749         _LOGGER.info("View Training:\n%s" % self._dashboard_uri())
    750 
--> 751         model = self._get_model()
    752 
    753         if model is None:

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\training_jobs.py in _get_model(self)
    836             RuntimeError: If Training failed.
    837         """
--> 838         self._block_until_complete()
    839 
    840         if self.has_failed:

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\training_jobs.py in _block_until_complete(self)
    886             time.sleep(wait)
    887 
--> 888         self._raise_failure()
    889 
    890         _LOGGER.log_action_completed_against_resource("run", "completed", self)

~\.conda\envs\vertex-sample\lib\site-packages\google\cloud\aiplatform\training_jobs.py in _raise_failure(self)
    903 
    904         if self._gca_resource.error.code != code_pb2.OK:
--> 905             raise RuntimeError("Training failed with:\n%s" % self._gca_resource.error)
    906 
    907     @property

RuntimeError: Training failed with:
code: 5
message: "Google Cloud Storage file(s) not found: [gs://<bucketname>/<path-to-files>/*]"
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jan 18, 2022
@busunkim96 busunkim96 added api: aiplatform Issues related to the AI Platform API. aiplatform Issues related to the AI Platform (Unified) API. and removed triage me I really want to be triaged. labels Jan 18, 2022
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jan 18, 2022
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jan 22, 2022
@morgandu morgandu assigned ivanmkc and unassigned sasha-gitg Jan 24, 2022
@kweinmeister kweinmeister added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Jan 26, 2022
@ivanmkc ivanmkc added the type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. label Jan 27, 2022

ivanmkc commented Jan 27, 2022

Taking a look at this.


ivanmkc commented Jan 28, 2022

After some testing, I believe that wildcards are not supported. Please enumerate the files you need beforehand and pass them in without wildcards.

Thanks for bringing this to our attention. We'll correct the documentation to reflect this.
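
A sketch of the suggested workaround: enumerate the matching objects first, then pass explicit URIs (no wildcards) to `TabularDataset.create`. The helper below is not part of the library; the bucket name, prefix, and object names are hypothetical. In practice, the object names would come from `google.cloud.storage.Client().list_blobs(bucket, prefix=...)`.

```python
from fnmatch import fnmatch


def expand_gcs_wildcard(blob_names, bucket, pattern):
    """Filter object names against a wildcard pattern and return explicit gs:// URIs.

    blob_names: iterable of object names within the bucket, e.g. obtained via
        [b.name for b in google.cloud.storage.Client().list_blobs(bucket)]
    bucket: bucket name without the gs:// scheme
    pattern: a glob-style pattern such as "data/*.csv"
    """
    return [
        f"gs://{bucket}/{name}"
        for name in blob_names
        if fnmatch(name, pattern)
    ]


# Hypothetical example: resolve "data/*.csv" to explicit URIs, then pass
# the resulting list as gcs_source instead of a wildcard path.
names = ["data/part-1.csv", "data/part-2.csv", "data/readme.txt"]
uris = expand_gcs_wildcard(names, "my-bucket", "data/*.csv")
# uris == ["gs://my-bucket/data/part-1.csv", "gs://my-bucket/data/part-2.csv"]
```

The explicit list can then be supplied directly, e.g. `aiplatform.TabularDataset.create(display_name=..., gcs_source=uris)`, since `gcs_source` accepts a sequence of URIs.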

@ivanmkc ivanmkc closed this as completed Jan 28, 2022
kawofong commented
When can we expect the documentation to be updated? I recently ran into this issue and wasted hours of debugging before realizing that this is a known problem with the documentation.

@ivanmkc ivanmkc reopened this May 11, 2022
@product-auto-label product-auto-label bot added api: vertex-ai Issues related to the googleapis/python-aiplatform API. and removed api: aiplatform Issues related to the AI Platform API. labels May 11, 2022

7 participants