@j-atkins
Collaborator

This PR expands the range of Copernicus products available to virtualship fetch. Reanalysis (and Reanalysis Interim) products are now available for both physical and biogeochemical data, in addition to the existing Analysis & Forecast products.

fetch now selects the dataset automatically based on the times in the expedition schedule, which can range from 1993 through to two weeks into the future. Following on from #210, all data is now at daily resolution*.

Following on from the Copernicus Marine documentation, I have incorporated the Reanalysis Interim products to maintain consistency between the Reanalysis and Analysis & Forecast products. In summary the three products cover the following periods:

  1. Reanalysis (or "hindcast" for biogeochemistry): ~30 years ago to ~5 years ago
  2. Reanalysis interim (or "hindcast interim" for biogeochemistry): ~5 years ago to ~2 months ago
  3. Analysis & Forecast: ~2 months ago to ~2 weeks in future

*Note also, certain BGC variables (pH, phytoplankton) are only available as monthly products in hindcast and hindcast interim periods.
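
As a rough illustration of the three periods listed above, the mapping from a schedule end time to a product can be sketched as follows. This is a minimal sketch, not the code in this PR: `classify_period`, the labels it returns and the fixed day counts are assumptions for illustration, and the real boundaries are set by the Copernicus Marine catalogue and shift over time.

```python
from datetime import datetime, timedelta

# Approximate boundaries taken from the list above (assumed, not exact).
INTERIM_START = timedelta(days=5 * 365)  # ~5 years ago
ANALYSIS_START = timedelta(days=60)      # ~2 months ago
FORECAST_HORIZON = timedelta(days=14)    # ~2 weeks into the future
EARLIEST = datetime(1993, 1, 1)          # data are available from 1993


def classify_period(end_time: datetime, now: datetime) -> str:
    """Return a hypothetical product label for a schedule end time."""
    if end_time < EARLIEST:
        raise ValueError("no products are available before 1993")
    if end_time > now + FORECAST_HORIZON:
        raise ValueError("end time is beyond the forecast horizon")
    if end_time >= now - ANALYSIS_START:
        return "analysis_forecast"
    if end_time >= now - INTERIM_START:
        return "reanalysis_interim"
    return "reanalysis"


now = datetime(2025, 10, 13)
recent = classify_period(datetime(2025, 10, 1), now)
interim = classify_period(datetime(2023, 6, 1), now)
historic = classify_period(datetime(2000, 1, 1), now)
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the sketch deterministic and easy to test.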

Given there are now quite a few combinations of variables, time periods and temporal resolutions, I have documented which products are used when (Additional information.md), so that this is transparent in case it is ever important to users. The quickstart guide now references this as well.

To avoid complicating things further, the CTD_BGC zooplankton variable is now dropped given its limited temporal range (~2024 onwards). If it's useful to keep this though I can add it back in and add warnings, documentation etc. to be clear that it's limited to this range.


Tests for the new functionality have also been added, and the existing tests for the CLI and fetch have been updated.

Closes #210


```{warning}
In the rare situation where the start and end times of an expedition schedule span different products, which is possible in the case of the end time being in the **Reanalysis_interim** period and the start time in the **Reanalysis** period, the **Analysis & Forecast** product will be automatically selected, as this spans back enough in time for this niche case.
```
Collaborator

I'd think this can also occur between Reanalysis_interim and Analysis & Forecast. Maybe just drop the example: "which is possible..."?

Collaborator Author

This doesn't occur between Reanalysis_interim and Analysis & Forecast, because the product selection is determined by the end time in the schedule and the Analysis & Forecast product spans back far enough in time. To my understanding, the discontinuity only exists between the Reanalysis and Reanalysis_interim products.

That being said, I agree the sentence leaves a bit of ambiguity. I'll update it - thank you! 👍

# Additional information

This file contains additional technical information and guidance not covered in the [Quickstart](https://virtualship.readthedocs.io/en/latest/user-guide/quickstart.html) guide or in the [Tutorials](https://virtualship.readthedocs.io/en/latest/user-guide/tutorials/index.html).

Collaborator

I'd consider calling this Documentation. The quick start guide and tutorials should not be the main source of information in my opinion.

Collaborator Author

Good point, thanks!

quickstart
tutorials/index
assignments/index
additional_information
Collaborator

documentation

### Waypoint datetimes

```{note}
VirtualShip supports simulating experiments in the years 1993 through to the present day (and up to two weeks in the future) by leveraging the suite of products available from the Copernicus Marine Data Store (see [Fetch the data](#fetch-the-data)). The data download is automated based on the time period selected in the schedule. Different periods will rely on different products from the Copernicus Marine catalogue (see [Additional information](additional_information.md)).
Collaborator

documentation

  ctd_bgc_download_dict = {
      "o2data": {
-         "dataset_id": "cmems_mod_glo_bgc-bio_anfc_0.25deg_P1D-m",
+         "dataset_id": select_product_id(**{**bgc_args, "variable": "o2"}),
Collaborator

Maybe use bgc here instead of select

Collaborator Author

The select_product_id function is also used elsewhere to determine the physical product ID, so I'll leave it as it is. Unless I misunderstand what you're referring to here?
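
For readers unfamiliar with the `**{**bgc_args, "variable": "o2"}` idiom in the diff, here is a small self-contained sketch of what it does. The `select_product_id` below is a hypothetical stand-in that only echoes its arguments, and the `physical` key in `bgc_args` is an assumed parameter name; neither reflects the real function's signature.

```python
def select_product_id(physical: bool, variable: str = "") -> str:
    """Hypothetical stand-in: echoes its arguments instead of querying a catalogue."""
    kind = "phys" if physical else "bgc"
    return f"{kind}:{variable}"


# Shared keyword arguments for all biogeochemical variables (assumed shape).
bgc_args = {"physical": False}

# The inner ** copies bgc_args into a new dict and adds one key;
# the outer ** passes that dict's items as keyword arguments.
merged = {**bgc_args, "variable": "o2"}
product_id = select_product_id(**merged)

# Equivalent call here, since bgc_args has no "variable" key of its own.
same_id = select_product_id(**bgc_args, variable="o2")
```

The two forms differ only when `bgc_args` already contains a `"variable"` key: the dict merge silently overrides it, whereas the plain keyword form raises a `TypeError` for the duplicate argument.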

)

# handle the rare situation where start time and end time span different products, which is possible for reanalysis and reanalysis_interim
# in this case, return the analysis product which spans far back enough
Collaborator

Also an option for interim and analysis? Does it make sense to do this check first?

Collaborator Author

From my view, running this check at the end minimises the number of copernicusmarine.open_dataset() calls required because it requires just one call to a specific predetermined product ID. Running the check before any product filtering would require multiple calls to different products to check their time limits, no?
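
The ordering argument above can be sketched as follows. This is a minimal sketch under stated assumptions, not the PR's implementation: `resolve_product`, `ANALYSIS_ID` and the `time_bounds` callable are hypothetical names, with `time_bounds` standing in for reading the time axis of a dataset opened with `copernicusmarine.open_dataset()`. The coverage dates in the fake lookup are made up for the example.

```python
from datetime import datetime
from typing import Callable, Tuple

# Hypothetical label for the fallback product.
ANALYSIS_ID = "analysis_forecast"


def resolve_product(
    candidate_id: str,
    start: datetime,
    end: datetime,
    time_bounds: Callable[[str], Tuple[datetime, datetime]],
) -> str:
    """Keep the candidate chosen from the end time unless it misses the schedule."""
    first, last = time_bounds(candidate_id)  # the only lookup performed
    if first <= start and end <= last:
        return candidate_id
    # Schedule spans two products: fall back to the analysis product.
    return ANALYSIS_ID


calls = []


def fake_time_bounds(product_id: str) -> Tuple[datetime, datetime]:
    """Record each lookup and return made-up coverage for one product."""
    calls.append(product_id)
    coverage = {"reanalysis_interim": (datetime(2021, 7, 1), datetime(2025, 8, 1))}
    return coverage[product_id]


# The start time falls just before the candidate's (made-up) coverage begins.
chosen = resolve_product(
    "reanalysis_interim",
    datetime(2021, 6, 20),
    datetime(2021, 7, 10),
    fake_time_bounds,
)
```

Because the candidate is already fixed by the end time, the check needs one lookup. Checking coverage before any filtering would mean querying the bounds of each product in turn.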

@j-atkins j-atkins requested review from VeckoTheGecko and ammedd and removed request for VeckoTheGecko October 13, 2025 12:01
Collaborator

@ammedd ammedd left a comment

Looks fine to me. Thanks for incorporating the changes.

@j-atkins j-atkins merged commit 75f13f1 into main Oct 21, 2025
10 of 11 checks passed
@j-atkins j-atkins deleted the to-reanalysis branch October 21, 2025 15:05
j-atkins added a commit that referenced this pull request Oct 23, 2025
Expands the range of Copernicus Marine Data products available to the fetch command. Reanalysis (and Reanalysis Interim) data are now available for both physical and biogeochemical data, in addition to the existing Analysis & Forecast products.

Development

Successfully merging this pull request may close these issues.

Using Copernicus reanalysis data

3 participants