@katjaweigel
Contributor

@katjaweigel katjaweigel commented Feb 24, 2025

Description

Time series, trends, and evaluation plots for regional (CORDEX) Historical changes in climate variables for REF
Based on existing monitoring and benchmarking recipes:

  • ref/recipe_portrait_regions.yml based on recipe_portrait.yml
  • ref/recipe_monitor_anncyc_regions.yml based on a part of monitor/recipe_monitor.yml
    and
  • ref/recipe_model_benchmarking_timeseries_region.yml based on model_evaluation/recipe_model_benchmarking_timeseries.yml
  • ref/recipe_model_benchmarking_boxplots_region_trend.yml based on model_evaluation/recipe_model_benchmarking_boxplots.yml
    from Benchmarking recipes (Lauer et al.)

ref/recipe_model_benchmarking_boxplots_region_trend.yml needs some code changes in monitor/multi_datasets.py; all others only add settings to the recipes.

AR6 regions (from shapefiles) are used.


@bouweandela
Member

Could you please check if observational data is already available in obs4MIPs? #3932 (comment) For example, for ERA-5 obs4MIPs provides the following variables: 'psl', 'ta', 'tas', 'ua', 'uas', 'va', 'vas', 'zg' and GPCP-SG version 2.3 is available as "GPCP-V2.3".

@katjaweigel
Contributor Author

Could you please check if observational data is already available in obs4MIPs? #3932 (comment) For example, for ERA-5 obs4MIPs provides the following variables: 'psl', 'ta', 'tas', 'ua', 'uas', 'va', 'vas', 'zg' and GPCP-SG version 2.3 is available as "GPCP-V2.3".

Do we have them on Levante @hb326 ? The only one I find in Tier1 is:
/work/bd0854/DATA/ESMValTool2/OBS/Tier1/GPCP-SG/pr_GPCP-SG_L3_v2.3_197901-201710.nc
(no other variables, no ERA5)
Are there obs4MIPs data in a different folder, or should I download them, or is somebody already doing that?

@katjaweigel
Contributor Author

Could you please check if observational data is already available in obs4MIPs? #3932 (comment) For example, for ERA-5 obs4MIPs provides the following variables: 'psl', 'ta', 'tas', 'ua', 'uas', 'va', 'vas', 'zg' and GPCP-SG version 2.3 is available as "GPCP-V2.3".

Do we have them on Levante @hb326 ? The only one I find in Tier1 is: /work/bd0854/DATA/ESMValTool2/OBS/Tier1/GPCP-SG/pr_GPCP-SG_L3_v2.3_197901-201710.nc (no other variables, no ERA5) Are there obs4MIPs data in a different folder, or should I download them, or is somebody already doing that?

Dear @bouweandela,

I have changed GPCP-SG to obs4MIPs now, but unfortunately I cannot download the ERA5 obs4MIPs data. On Friday I first got a timeout; then the error changed to the following, and it is still the same now:

download failed
tas_mon_ERA-5_PCMDI_gn_198301-198312.nc ...Downloading
--2025-05-26 09:44:45--  https://esgf-data2.llnl.gov/thredds/fileServer/user_pub_work/obs4MIPs/ECMWF/ERA-5/mon/tas/gn/v20250220/tas_mon_ERA-5_PCMDI_gn_198301-198312.nc
Resolving esgf-data2.llnl.gov (esgf-data2.llnl.gov)... 198.128.245.139
Connecting to esgf-data2.llnl.gov (esgf-data2.llnl.gov)|198.128.245.139|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2025-05-26 09:44:45 ERROR 404: Not Found.

@bouweandela
Member

Unfortunately, the only server providing that data is offline.

@katjaweigel
Contributor Author

katjaweigel commented May 28, 2025

Unfortunately, the only server providing that data is offline.

It is running again. I am trying to download the data, but I won't manage today anymore. Of the variables I use, only tas, psl and uas are available, and not all files are there; pr and hus seem to be missing completely (I don't get a wget script for them).

@katjaweigel
Contributor Author

Unfortunately, the only server providing that data is offline.

It is running again. I am trying to download the data, but I won't manage today anymore. Of the variables I use, only tas, psl and uas are available, and not all files are there; pr and hus seem to be missing completely (I don't get a wget script for them).

And now it is again not possible to reach the server at all :(

@katjaweigel
Contributor Author

Dear @bouweandela, I have now updated ERA(-)5 psl, tas, and ua to obs4MIPs. The download was rather slow. I did not find pr and hus at all, so I did not change the recipes for them (if the data were available, the change would be the same as for the other variables:

          - {dataset: ERA-5, project: obs4MIPs, type: atmos, version: PCMDI, tier: 1,
             reference_for_metric: true}

instead of

          - {dataset: ERA5, project: native6, type: reanaly, version: v1,
             tier: 3, reference_for_metric: true}

)
The data I could download are currently in one of my own directories on Levante. I'm not sure whether I should ask to have them moved to the shared directory?

<<: *var_default_abs
reference_dataset: GPCP-SG
additional_datasets:
- {dataset: GPCP-SG, project: obs4MIPs, type: atmos, version: 2.3, tier: 1,
Member

Suggested change
- {dataset: GPCP-SG, project: obs4MIPs, type: atmos, version: 2.3, tier: 1,
- {dataset: GPCP-V2.3, project: obs4MIPs, tier: 1,

The name of the obs4MIPs dataset on ESGF is GPCP-V2.3. type is not a required facet for obs4MIPs data. The only available version on ESGF is v20180519, but the version does not need to be specified because ESMValCore will automatically use the latest available version.

Contributor Author

The name is GPCP-SG if I download the data from ESGF (only the [THREDDS Catalog] worked for me), and also on Levante?

Member

You are right that GPCP-SG is used in the filename, but what matters is the value for the source_id, as that is what we call dataset in ESMValCore (see here for the mapping from 'our' facet names to ESGF names). Using source_id=GPCP-V2.3 in an ESGF search returns a file:

{
  "responseHeader":{
    "status":0,
    "QTime":5,
    "params":{
      "df":"text",
      "q.alt":"*:*",
      "indent":"true",
      "echoParams":"all",
      "fl":"*,score",
      "start":"0",
      "fq":["type:File",
        "project:\"obs4MIPs\"",
        "source_id:\"GPCP-V2.3\""],
      "sort":"id asc",
      "rows":"500",
      "q":"*:*",
      "shards":"localhost/solr/files,localhost:8985/solr/files,localhost:8986/solr/files,localhost:8987/solr/files,localhost:8988/solr/files,localhost:8989/solr/files,localhost:8990/solr/files",
      "tie":"0.01",
      "facet.limit":"-1",
      "qf":"text",
      "facet.method":"enum",
      "facet.mincount":"1",
      "wt":"json",
      "facet":"true",
      "facet.sort":"lex"}},
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"obs4MIPs.NASA-GSFC.GPCP.atmos.mon.v20180519.pr_GPCP-SG_L3_v2.3_197901-201710.nc|dpesgf03.nccs.nasa.gov",
        "version":"1",
        "cf_standard_name":["precipitation_flux"],
        "checksum":["4dd4678b79ef139446c8406da5aae4fed210abb2f2160ef95f6988bf83e4525b"],
        "checksum_type":["SHA256"],
        "data_node":"dpesgf03.nccs.nasa.gov",
        "dataset_id":"obs4MIPs.NASA-GSFC.GPCP-V2.3.atmos.mon.v20180519|dpesgf03.nccs.nasa.gov",
        "dataset_id_template_":["%(project)s.%(institute)s.%(source_id)s.%(realm)s.%(time_frequency)s"],
        "index_node":"esgf-node.llnl.gov",
        "instance_id":"obs4MIPs.NASA-GSFC.GPCP.atmos.mon.v20180519.pr_GPCP-SG_L3_v2.3_197901-201710.nc",
        "institute":["NASA-GSFC"],
        "latest":true,
        "master_id":"obs4MIPs.NASA-GSFC.GPCP.atmos.mon.pr_GPCP-SG_L3_v2.3_197901-201710.nc",
        "model":["Obs-GPCP"],
        "product":["observations"],
        "project":["obs4MIPs"],
        "realm":["atmos"],
        "replica":false,
        "size":19348352,
        "source_id":["GPCP-V2.3"],
        "time_frequency":["mon"],
        "timestamp":"2018-02-17T17:17:54Z",
        "title":"pr_GPCP-SG_L3_v2.3_197901-201710.nc",
        "tracking_id":["4070c751-6c2d-440f-a4d7-5b325fb98990"],
        "type":"File",
        "url":["https://dpesgf03.nccs.nasa.gov/thredds/fileServer/obs4MIPs/observations/NASA-GSFC/Obs-GPCP/GPCP/V2.3/atmos/pr/pr_GPCP-SG_L3_v2.3_197901-201710.nc|application/netcdf|HTTPServer"],
        "variable":["pr"],
        "variable_long_name":["Precipitation"],
        "variable_units":["kg m-2 s-1"],
        "_version_":1662343940370595840,
        "retracted":false,
        "_timestamp":"2020-03-27T18:45:20.965Z",
        "score":1.0}]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{},
    "facet_ranges":{},
    "facet_intervals":{},
    "facet_heatmaps":{}}}

while an ESGF search using source_id=GPCP-SG returns no files:

{
  "responseHeader":{
    "status":0,
    "QTime":8,
    "params":{
      "df":"text",
      "q.alt":"*:*",
      "indent":"true",
      "echoParams":"all",
      "fl":"*,score",
      "start":"0",
      "fq":["type:File",
        "project:\"obs4MIPs\"",
        "source_id:\"GPCP-SG\""],
      "sort":"id asc",
      "rows":"500",
      "q":"*:*",
      "shards":"localhost/solr/files,localhost:8985/solr/files,localhost:8986/solr/files,localhost:8987/solr/files,localhost:8988/solr/files,localhost:8989/solr/files,localhost:8990/solr/files",
      "tie":"0.01",
      "facet.limit":"-1",
      "qf":"text",
      "facet.method":"enum",
      "facet.mincount":"1",
      "wt":"json",
      "facet":"true",
      "facet.sort":"lex"}},
  "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{},
    "facet_ranges":{},
    "facet_intervals":{},
    "facet_heatmaps":{}}}

Contributor Author

But when I write "dataset: GPCP-V2.3" in the recipe, it does not run (maybe it would if I allowed downloads?), but it does run with dataset: GPCP-SG (because these data are available on Levante)?

Member

Could you try running it with --search-esgf=always?

Contributor Author

Thanks for the answer @bouweandela! I tried again with search_esgf: when_missing in the config, and then it works with "GPCP-V2.3". But it does not work any more, when I set it to search_esgf: never to run the next recipe, which needs the same data (or even to rerun the same recipe again). Then it still does not find the data. The reason seems to be that it does not create a folder. It puts the data under:
(esmvaltool) [b380216@levante1 ESMValTool]$ ls -lhrt /work/bd1083/b380216/extraobsv2/Tier1/
total 19M
drwxr-sr-x 2 b380216 bd0854 20K Jun 15 12:28 ERA-5
-rwx------ 1 b380216 bd0854 19M Jun 25 15:39 pr_GPCP-SG_L3_v2.3_197901-201710.nc

When I move the data by hand into a sub-folder called "GPCP-V2.3", it works, so I am changing the dataset name in these recipes to GPCP-V2.3 now.
But I wonder if we need to change something for the downloader in cases where the file name and the source_id disagree,
and/or whether we should then rename
/work/bd0854/DATA/ESMValTool2/OBS/Tier1/GPCP-SG
to
/work/bd0854/DATA/ESMValTool2/OBS/Tier1/GPCP-V2.3
?
But this would cause other recipes that currently use "GPCP-SG" to stop working on Levante?
These recipes are:

  • esmvaltool/recipes/recipe_model_benchmarking_timeseries_pr_e.yml
  • esmvaltool/recipes/examples/recipe_check_obs.yml
  • esmvaltool/recipes/ipccwg1ar5ch9/recipe_flato13ipcc_figure_96.yml
  • esmvaltool/recipes/ipccwg1ar6ch3/recipe_ipccwg1ar6ch3_atmosphere.yml
  • esmvaltool/recipes/model_evaluation/recipe_model_benchmarking_boxplots.yml
  • esmvaltool/recipes/model_evaluation/recipe_model_benchmarking_maps.yml
  • esmvaltool/recipes/model_evaluation/recipe_model_evaluation_basics.yml
  • esmvaltool/recipes/model_evaluation/recipe_model_evaluation_clouds_clim.yml
  • esmvaltool/recipes/model_evaluation/recipe_model_evaluation_precip_zonal.yml
  • esmvaltool/recipes/ref/recipe_ref_scatterplot.yml

There is also still the issue that I cannot find pr and hus as CMORized ERA5 data.

Member

But it does not work any more, when I set it to search_esgf: never to run the next recipe, which needs the same data (or even to rerun the same recipe again).

You need to set the rootpath and DRS in your config-user.yml file so that they match the directory tree. For example, mine looks like this for the obs4MIPs project on Levante:

rootpath:
  obs4MIPs:
    /work/bd0854/DATA/ESMValTool2/OBS: default
    /work/bd0854/DATA/ESMValTool2/download: ESGF

I wonder if we need to change something for the downloader

The downloader uses the ESGF DRS to store files; for obs4MIPs that is https://github.com/ESMValGroup/ESMValCore/blob/4d8fcbfb331e0745db58321cc41fe7de88c82393/esmvalcore/config-developer.yml#L110-L119
As you can see, the only requirement on the filename is that it starts with the short_name for that DRS setting, so what comes after it shouldn't matter.
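
To illustrate the point about filenames, here is a minimal sketch. It assumes that the obs4MIPs `input_file` pattern in the linked config-developer.yml section is `{short_name}_*.nc` (the pattern is not quoted in this thread, so treat it as an assumption), and `matches_input_file` is a hypothetical helper, not an ESMValCore function.

```python
from fnmatch import fnmatchcase

# Assumed pattern; check the linked config-developer.yml for the real value.
INPUT_FILE_PATTERN = "{short_name}_*.nc"


def matches_input_file(filename, short_name, pattern=INPUT_FILE_PATTERN):
    """Check a filename against the pattern filled in for one variable."""
    return fnmatchcase(filename, pattern.format(short_name=short_name))


# The file name contains "GPCP-SG", but only the "pr_" prefix matters here.
print(matches_input_file("pr_GPCP-SG_L3_v2.3_197901-201710.nc", "pr"))   # True
print(matches_input_file("pr_GPCP-SG_L3_v2.3_197901-201710.nc", "tas"))  # False
```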

Contributor Author

The issue is not with the file name but with the name of the directory, which is usually the dataset name.
It downloads the data to:
Tier{tier}/
but expects 'Tier{tier}/{dataset}' when looking for the data, if they were not downloaded in the same run.
I'd expect that most users don't change config_developer (because it is part of ESMValCore), so if I use dataset: GPCP-V2.3, it expects this as the directory name and won't find the data if they are directly under Tier{tier}/ or in /work/bd0854/DATA/ESMValTool2/OBS/Tier1/GPCP-SG?

@bouweandela bouweandela merged commit 3865eda into main Jul 11, 2025
7 checks passed
@bouweandela bouweandela deleted the regional_historical_changes_for_REF branch July 11, 2025 08:12
@bouweandela
Member

Thanks everyone! I noticed that the data is not written out for ref/recipe_ref_trend_regions.yml, only the plots, so it would be nice to add that in the future so people viewing the REF output can download the numbers too.

@katjaweigel
Contributor Author

katjaweigel commented Jul 14, 2025

@bouweandela thanks for merging! In general, no data seem to be written along with the plots by seaborn_diag.py (I tried the standard recipe recipe_seaborn.yml). Is writing data generally expected for all recipes in ESMValTool or is this only/mainly an issue for the REF? I should probably open a new issue for this then?

@bouweandela
Member

Is writing data generally expected for all recipes in ESMValTool

Yes, see our contribution guidelines. For the REF it is even more important because most users of REF output will not be capable of running the recipe themselves.

Labels

  • approved by scientific reviewer
  • approved by technical reviewer
  • new recipe: Use this label if you are adding a new recipe
  • REF: Important for the CMIP Rapid Evaluation Framework (REF)

Development

Successfully merging this pull request may close these issues.

Regional (CORDEX) Historical changes in climate variables (time series, trends) for REF

9 participants