The data store is used to access datasets from the Zenodo API

xcube-dev/xcube-zenodo

xcube-zenodo

xcube-zenodo is a Python package and xcube plugin that adds a data store named zenodo to xcube. The data store is used to access datasets which are published on Zenodo.

How to use the xcube-zenodo plugin

Lazy access to datasets published as TIFF or NetCDF files

To access a dataset published on Zenodo, find it on the Zenodo webpage and build the data ID with the structure "<record_id>/<file_name>". The record ID is the number at the end of the URL of the record's Zenodo page. For example, in the record "Canopy height and biomass map for Europe", the file "planet_canopy_cover_30m_v0.1.tif" has the data ID "8154445/planet_canopy_cover_30m_v0.1.tif". The following lines of code lazily load that dataset.

from xcube.core.store import new_data_store

store = new_data_store("zenodo")
ds = store.open_data(
    "8154445/planet_canopy_cover_30m_v0.1.tif",
    tile_size=(1024, 1024)
)

To learn more, check out the example notebook zenodo_data_store.ipynb.
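
The data ID rule described above can also be written as a small helper. This is a minimal sketch and not part of xcube-zenodo: the function name build_data_id is hypothetical, and it assumes that the record page URL ends with the numeric record ID, as in https://zenodo.org/records/8154445.

```python
from urllib.parse import urlparse


def build_data_id(record_url: str, file_name: str) -> str:
    """Build "<record_id>/<file_name>" from a Zenodo record page URL.

    Hypothetical helper; assumes the last path segment of the URL
    is the numeric record ID.
    """
    # Split the URL path into its non-empty segments.
    path_parts = [part for part in urlparse(record_url).path.split("/") if part]
    if not path_parts or not path_parts[-1].isdigit():
        raise ValueError(f"no numeric record ID found in {record_url!r}")
    return f"{path_parts[-1]}/{file_name}"


data_id = build_data_id(
    "https://zenodo.org/records/8154445",
    "planet_canopy_cover_30m_v0.1.tif",
)
print(data_id)
```

The resulting string can be passed to store.open_data() in the same way as the hand-written data ID.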

Access compressed datasets via xcube's preload API

If datasets are published as zip, tar, or tar.gz archives, you can use the preload API to preload the data into the local or S3 file system. If the compressed file contains multiple datasets, the data IDs are extended by one additional level, so that each dataset in the archive gets its own data ID. A short example is shown below.

from xcube.core.store import new_data_store

store = new_data_store("zenodo")
handler = store.preload_data("13333034/andorra.zip")
preloaded_data_ids = store.cache_store.list_data_ids()
ds = store.open_data(preloaded_data_ids[0])

To learn more, check out the example notebooks zenodo_data_store_preload*.ipynb in examples.
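
When one archive yields several datasets, it can help to group the preloaded data IDs by record. The sketch below is only an illustration: the helper group_by_record and the example IDs are hypothetical, and the actual IDs returned by list_data_ids() depend on the contents of the archive.

```python
from collections import defaultdict


def group_by_record(data_ids):
    """Group data IDs by their first path component (the record ID)."""
    grouped = defaultdict(list)
    for data_id in data_ids:
        # Everything before the first "/" is treated as the record ID.
        record_id, _, remainder = data_id.partition("/")
        grouped[record_id].append(remainder)
    return dict(grouped)


# Hypothetical IDs, standing in for store.cache_store.list_data_ids().
example_ids = [
    "13333034/andorra/dataset_a.zarr",
    "13333034/andorra/dataset_b.zarr",
]
print(group_by_record(example_ids))
```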

Installing the xcube-zenodo plugin

This section describes three alternative methods you can use to install the xcube-zenodo plugin.

For installation of conda packages, we recommend mamba. It is also possible to use conda, but note that installation may be significantly slower with conda than with mamba. If using conda rather than mamba, replace the mamba command with conda in the installation commands given below.

Installation into a new environment with mamba

This method creates a new environment and installs the latest conda-forge release of xcube-zenodo, along with all its required dependencies, into the newly created environment.

To do so, execute the following commands:

mamba create --name xcube-zenodo --channel conda-forge xcube-zenodo
mamba activate xcube-zenodo

The name of the environment may be freely chosen.

Installation into an existing environment with mamba

This method assumes that you have an existing environment, and you want to install xcube-zenodo into it.

With the existing environment activated, execute this command:

mamba install --channel conda-forge xcube-zenodo

Once again, xcube and any other necessary dependencies will be installed automatically if they are not already installed.

Installation into a new environment from the repository

If you want to install xcube-zenodo directly from the git repository (for example in order to use an unreleased version or to modify the code), you can do so as follows:

mamba create --name xcube-zenodo --channel conda-forge --only-deps xcube-zenodo
mamba activate xcube-zenodo
git clone https://github.com/xcube-dev/xcube-zenodo.git
python -m pip install --no-deps --editable xcube-zenodo/

This installs all the dependencies of xcube-zenodo into a fresh conda environment, then installs xcube-zenodo into this environment from the repository.

Testing

To run the unit test suite:

pytest

To analyze test coverage:

pytest --cov=xcube_zenodo

To produce an HTML coverage report:

pytest --cov-report html --cov=xcube_zenodo

Some notes on the strategy of unit-testing

The unit test suite uses pytest-recording to mock HTTPS requests made via the Python library requests. During development, actual HTTP requests are performed and the responses are saved in cassettes/**.yaml files. During testing, only the cassettes/**.yaml files are used, and no actual HTTP requests are made. To save the responses to cassettes/**.yaml during development, run

pytest -v -s --record-mode new_episodes

Note that --record-mode new_episodes replays interactions that are already recorded and appends any new ones to the cassettes. To record only when no cassette file exists yet, use --record-mode once. pytest-recording supports all record modes provided by VCR.py. After recording the cassettes, testing can then be performed as usual.
