Issue with Files Needed for CropHarvestMultiClassValidation Class #38

Open
mahrokh3409 opened this issue Jun 27, 2024 · 7 comments

@mahrokh3409

Hi @gabrieltseng

I am currently working on implementing the CropHarvestMultiClassValidation class within presto/eval/cropharvest_eval.py. To facilitate this, I require access to the data accessible via the download_cropharvest_data() function.

However, I am encountering difficulties accessing the "features/dynamic_world_arrays" and "test_dynamic_world_features" files necessary for this task. Could you please provide me with direct links or alternative methods to download these folders?

Your assistance in resolving this access issue would be greatly appreciated.

Kind Regards,
Mahrokh

@mahrokh3409
Author

Hi @gabrieltseng , @kvantricht , @rubencart , and @sabman
I am currently working on implementing the CropHarvestMultiClassValidation class within presto/eval/cropharvest_eval.py. To facilitate this, I require access to the data accessible via the download_cropharvest_data() function.

However, I am encountering difficulties accessing the "features/dynamic_world_arrays" and "test_dynamic_world_features" files necessary for this task. Could you please provide me with direct links or alternative methods to download these folders?

Your assistance in resolving this access issue would be greatly appreciated.

Kind Regards,
Mahrokh

@gabrieltseng
Collaborator

Hi @mahrokh3409 ,

The dynamic world data needs to be re-exported from Google Earth Engine. This can be done by calling the export_dynamic_world function in the CropHarvestEval task.

You then need to transform the tif files you receive from EarthEngine into npy arrays - this can be achieved via the dynamic_world_tifs_to_npy function in the CropHarvestEval task.

For the test data, you will need to use the create_dynamic_world_test_h5_instances function.

So the flow is:

  1. Export tifs from Earth Engine
  2. Download them from your Google Cloud bucket
  3. Transform them into npy and h5 files
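To see how far through this flow a local setup is, one option is to check which of the two folders named in the original question already exist under the data root. This is only a minimal sketch: the helper name `missing_dynamic_world_dirs` and the example root path are hypothetical and not part of Presto.

```python
from pathlib import Path
from typing import List

# Folder names taken from the original question in this thread.
EXPECTED_DIRS = ["features/dynamic_world_arrays", "test_dynamic_world_features"]


def missing_dynamic_world_dirs(root: Path) -> List[str]:
    """Return the expected Dynamic World folders that are absent under root."""
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    root = Path("data/cropharvest")  # hypothetical location of the CropHarvest data
    for name in missing_dynamic_world_dirs(root):
        print(f"missing: {root / name}")
```

An empty result means both folders are present, so step 3 has already produced its outputs.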

I hope this helps!

@mahrokh3409
Author

Hi @gabrieltseng,

Thanks for your response. My main issue is with the download function and access to the bucket on Google Cloud: the download fails with a permission error.

def download_cropharvest_data(root_name: str = ""):
    root = Path(root_name) if root_name != "" else cropharvest_data_dir()
    if not root.exists():
        root.mkdir()
    CropHarvest(root, download=True)
    for gcloud_path in ["features/dynamic_world_arrays", "test_dynamic_world_features"]:
        if not (root / gcloud_path).exists():
            blob = (
                storage.Client().bucket(TAR_BUCKET).blob(f"eval/cropharvest/{gcloud_path}.tar.gz")
            )
            blob.download_to_filename(root / f"{gcloud_path}.tar.gz")
            extract_archive(root / f"{gcloud_path}.tar.gz", remove_tar=True)

The export_dynamic_world function also calls the function above to download files. The CropHarvest files download successfully, and I have access to the "features" and "test_features" data. However, the second part of the code (the loop over the two Dynamic World paths) raises the following error:

Forbidden: 403 GET https://storage.googleapis.com/download/storage/v1/b/lem-assets2/o/eval%2Fcropharvest%2Ffeatures%2Fdynamic_world_arrays.tar.gz?alt=media: [email protected] does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

Can you please let me know how I can access those files? Is there any other way of accessing those files?

I appreciate your help.

Kind Regards,
Mah
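The URL in the 403 message above already names the bucket and object that the request targeted. As a minimal sketch using only the standard library, it can be decoded as follows; the helper name `parse_gcs_download_url` is hypothetical, and the URL layout (`/b/<bucket>/o/<object>`) is assumed from the error message itself.

```python
from typing import Tuple
from urllib.parse import unquote, urlparse


def parse_gcs_download_url(url: str) -> Tuple[str, str]:
    """Split a storage.googleapis.com download URL into (bucket, object name)."""
    parts = urlparse(url).path.split("/")
    try:
        b_index = parts.index("b")
    except ValueError:
        raise ValueError(f"no bucket segment in URL: {url}") from None
    if len(parts) < b_index + 4 or parts[b_index + 2] != "o":
        raise ValueError(f"unexpected URL layout: {url}")
    bucket = parts[b_index + 1]
    # The object name is percent-encoded (%2F stands for "/").
    object_name = unquote("/".join(parts[b_index + 3:]))
    return bucket, object_name


if __name__ == "__main__":
    url = (
        "https://storage.googleapis.com/download/storage/v1/b/lem-assets2/o/"
        "eval%2Fcropharvest%2Ffeatures%2Fdynamic_world_arrays.tar.gz?alt=media"
    )
    print(parse_gcs_download_url(url))
    # ('lem-assets2', 'eval/cropharvest/features/dynamic_world_arrays.tar.gz')
```

The bucket that comes out is the one configured as TAR_BUCKET in the code, so the request is going to that bucket rather than to one owned by the account making the call.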

@mahrokh3409
Author

image

@gabrieltseng
Collaborator

gabrieltseng commented Jul 3, 2024

Hi @mahrokh3409 , this is expected. Did you go through the steps to export the data from Earth Engine into a Google Cloud project (as described above)? If not, you will not have any data to download.

Google Cloud bucket names are globally unique, so you will need to change the bucket and folder names being exported to.

These are defined in the following places:

@mahrokh3409
Author

Dear @gabrieltseng

Thank you so much for your response and detailed guidance. I will follow the instructions and let you know if there are any problems.

Kind Regards,
Mahrokh

@mahrokh3409
Author

mahrokh3409 commented Jul 4, 2024

Hi @gabrieltseng

I tried the following steps.

I created three buckets, as shown below:
image

and I updated the bucket names in the related files:

• Presto/presto/dataops/pipelines/ee_pipeline.py

image

• Presto/presto/dataops/dataset.py

image

Then I called export_dynamic_world with the code below:
import ee
ee.Authenticate()
ee.Initialize()
CropHarvestEval.export_dynamic_world(test=False)

Below is the screenshot:
image

There was no error; however, as you can see, the Dynamic World folder is empty:
image

I am really stuck at this stage and not sure what steps I should take to obtain those files. I appreciate your help

Kind Regards,
Mahrokh
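One thing that may be worth checking here: Earth Engine exports run as asynchronous batch tasks, so a call can return without error while the files have not yet been written, or after a task has failed on the server. The sketch below summarises task states from `ee.batch.Task.list()`. Whether export_dynamic_world actually starts such tasks in this setup is an assumption, and the helper names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize_task_states(statuses: List[Dict]) -> Counter:
    """Count Earth Engine task status dicts by their 'state' field."""
    return Counter(status.get("state", "UNKNOWN") for status in statuses)


def failed_task_messages(statuses: List[Dict]) -> List[Tuple[str, str]]:
    """Return (description, error_message) for every task in state FAILED."""
    return [
        (status.get("description", ""), status.get("error_message", ""))
        for status in statuses
        if status.get("state") == "FAILED"
    ]


def fetch_task_statuses() -> List[Dict]:
    """Fetch status dicts for the account's Earth Engine tasks.

    Requires the earthengine-api package and prior ee.Authenticate().
    """
    import ee  # imported lazily so the helpers above work without it

    ee.Initialize()
    return [task.status() for task in ee.batch.Task.list()]


if __name__ == "__main__":
    statuses = fetch_task_statuses()
    print(summarize_task_states(statuses))
    for description, message in failed_task_messages(statuses):
        print(f"FAILED {description}: {message}")
```

If tasks appear as READY or RUNNING, the bucket would be expected to stay empty until they complete. FAILED tasks usually carry an error message that points at the cause.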
