Benchmarking recipes (Lauer et al.) #3598
Merged
Changes from 108 commits
116 commits
94c826e
quick and dirty implementation of diurnal cycle
axel-lauer 7c7a2e6
added diurnal cycle plot to monitor.py
axel-lauer 8a73469
added diurnal cycle example to model_evaluation/recipe_model_evaluati…
axel-lauer 2daf3a9
added docu examples diurnal cycle
axel-lauer 244d8d8
fixed style issues in monitor/multi_datasets.py
axel-lauer f45b25d
fixed typo in docu example
axel-lauer 530106e
draft version of first benchmarking recipe (maps)
axel-lauer c8aa55c
snapshot 2024-02-01
axel-lauer 080b8f5
Merge branch 'main' into diurnal_cycle
diegokam 40d9167
snapshot 2024-02-02
axel-lauer e707342
first working version
axel-lauer 904b291
fixed some flake8 issues
axel-lauer a7ab4e4
adding benchmarking boxplot
LisaBock b9b0a40
Merge branch 'benchmarking_boxplot' into benchmarking_maps4monitoring
LisaBock b25b9c6
extract plotting function
LisaBock ec4b1c1
added draft of recipe_model_benchmarking_timeseries.yml
axel-lauer 2438b26
fix filename
LisaBock 83d972e
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
LisaBock 30b8453
boxplots for more variables
LisaBock b864979
mv recipe
LisaBock dddc3a5
added zonal mean benchmarking plot
axel-lauer a8c5e1e
merged with lastest branch
axel-lauer d154eed
fixed some flake8 issues
axel-lauer 128a77e
updated zonal mean benchmarking recipe
axel-lauer 4ccc12c
addressing review comments
axel-lauer a99b522
Merge branch 'main' into diurnal_cycle
schlunma 413cb61
clean recipe
LisaBock ed1e991
add var order and different distance metrics
LisaBock 1241f20
first version of plot benchmarking_timeseries
axel-lauer dff982e
added benchmarking annual cycle plot
axel-lauer 66a4bc5
added benchmarking diurnal cycle plot
axel-lauer 50e498b
addressed some style issues
axel-lauer 446b4ee
updated benchmarking recipes
axel-lauer b37e9b3
snapshot 2024-03-07
axel-lauer 9455bf9
updated masking of bias data for benchmarking
axel-lauer 2d92633
bugfix diag_scripts/clouds/clouds.ncl
axel-lauer ec23f76
remove unit if 1 from boxplots
LisaBock b3df631
change plotname for boxplots
LisaBock 5001ac9
adjusting the recipes to use an EMAC simulation for benchmarking
hb326 d7653fc
adjusting so that EMAC can be used as model to be benchmarked
hb326 da794dc
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
axel-lauer b616302
adding a preprocessor that filters EMAC's negative temperatures
hb326 7be6551
update recipe_model_benchmarking_diurnal_cycle.yml
axel-lauer 013926a
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
axel-lauer 99b5d19
updates for EMAC comparison
axel-lauer d22c0e7
more updates for EMAC comparison
axel-lauer c71ef02
updates boxplots for EMAC comparison
axel-lauer e95e9e0
update recipe for boxplots
axel-lauer 118d0a1
added default colorbar for sst
axel-lauer ff56640
preparing benchmarking recipes for PR
axel-lauer 70012f9
added docu draft (no images)
axel-lauer ec9a1d9
merged with branch diurnal_cycle
axel-lauer 69337fe
fixed merging conflicts
axel-lauer 7f4dbdf
added example plots for benchmarking recipes
axel-lauer fc99b85
updated docu
axel-lauer d4a75e1
updated recipes
axel-lauer f807a0e
fixed some flake8 and pylint issues
axel-lauer be0f566
added zorder in _plot_benchmarking_boxplot
axel-lauer 6592cc1
fixed style issue in cloud.ncl
axel-lauer 13b5312
updated docu figures
axel-lauer 6ac3bae
Merge branch 'main' into benchmarking_maps4monitoring
axel-lauer ffb8b8e
Update multi_datasets.py
axel-lauer d5c5375
Update recipe_benchmarking.rst
axel-lauer 4723d44
Merge branch 'main' into benchmarking_maps4monitoring
axel-lauer eae63ca
Update recipe_benchmarking.rst
axel-lauer 714b349
Merge branch 'main' into benchmarking_maps4monitoring
alistairsellar 7a3c844
Update recipe_benchmarking.rst
axel-lauer 1fbdf10
Update docu (recipe_benchmarking.rst)
axel-lauer 5ec42a2
Merge branch 'main' into benchmarking_maps4monitoring
axel-lauer 8d6a8d0
removed blank line
axel-lauer e03db0d
added Lukas Ruhe to config-references.yml
axel-lauer 9db0c2c
add seaborn boxplot link
LisaBock 8d6bb2d
Update esmvaltool/recipes/model_evaluation/recipe_model_benchmarking_…
axel-lauer 6483f8d
Update esmvaltool/recipes/model_evaluation/recipe_model_benchmarking_…
axel-lauer 4f247e8
Update esmvaltool/config-references.yml
axel-lauer f1c846b
changed author ruhe_lukas to lindenlaub_lukas
axel-lauer 53c1451
renamed reference lauer24gmd to lauer25gmd
axel-lauer 42c1989
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
axel-lauer e39e12a
updated docu recipe_benchmarking.rst
axel-lauer 9ee9cdd
updated with main
axel-lauer 2cb194c
added more docu to multi_datasets.py
axel-lauer bce0f86
fixed some docu issues
axel-lauer b003117
Update doc/sphinx/source/recipes/recipe_benchmarking.rst
axel-lauer 150011b
Update esmvaltool/diag_scripts/monitor/multi_datasets.py
axel-lauer b1c8985
removed unused default settings
axel-lauer fbd98d8
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
axel-lauer 146b0d0
removed commented out code from multi_datasets.py
axel-lauer 295cb9a
fixed some flake8 issues in multi_datasets.py
axel-lauer 0d5fcef
adjusted path to diag script for portrait diagram to match PR #3551
axel-lauer 4a817e8
Remove unused options from benchmarking maps and zonal plots
schlunma da473da
Allow datasets w/o timerange for benchmarking diags (see #3528)
schlunma 5f6c3e9
Fix contourf plots (see #3797 and #3789)
schlunma e647152
More flexible font sizes (see #3844)
schlunma 2e9ef36
Make sure that boxplots are actually created
schlunma 378c313
Properly format figure captions for model evaluation recipe doc
schlunma 6c166a9
Delete superfluous ':'
schlunma 5ff0d45
Use YAML syntax for YAML code
schlunma baa2096
Minor doc changes
schlunma 2c44d27
Re-add default show_stats for zonal mean plot
schlunma 51f9d1b
Do not use ERA5 in monitor recipe so it can be run with bot
schlunma 2550c05
Fix doc build
schlunma 4b8c65f
Merge branch 'main' into benchmarking_maps4monitoring
schlunma 6414a7f
changed reference lauer25gmd to preprint version until article is pub…
axel-lauer d58b02a
update docs
axel-lauer 0240bbb
Make portrait plot work
schlunma 664417e
added info on benchmark_dataset: true to multi_datasets.py
axel-lauer 92f09a9
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
axel-lauer d230fd0
remove recipe_lauer25gmd_fig*.yml, now available at 10.5281/zenodo.11…
axel-lauer bfa5873
Fix flake8 issues
schlunma 0d862a2
removed EMAC from recipes
axel-lauer 78b6021
Merge branch 'benchmarking_maps4monitoring' of github.com:ESMValGroup…
axel-lauer 8417240
removed commented out lines in recipe_model_evaluation_portraits.yml
axel-lauer a221dcb
removed recipe_model_evaluation_portraits.yml from this PR to avoid d…
axel-lauer 1c9e071
Merge branch 'main' into benchmarking_maps4monitoring
axel-lauer 78550d0
Merge branch 'main' into benchmarking_maps4monitoring
schlunma 33b722a
Merge branch 'main' into benchmarking_maps4monitoring
schlunma
Binary file added
BIN
+150 KB
...inx/source/recipes/figures/model_evaluation/diurnal_cycle_clt_sepacific_3hr.png
Binary file added
BIN
+151 KB
doc/sphinx/source/recipes/figures/monitor/diurnal_cycle_clt_tropics_3hr.png
Binary file added
BIN
+83.7 KB
...s/figures/monitor/diurnalcycle_pr_tropics_EC-Earth3_3hr_historical_r1i1p1f1.png
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,175 @@ | ||
| .. _recipe_benchmarking: | ||
|
|
||
| Model Benchmarking | ||
| ================== | ||
|
|
||
| Overview | ||
| -------- | ||
|
|
||
| These recipes and diagnostics are based on :ref:`recipe_monitor <recipe_monitor>`, which allows plotting arbitrary preprocessor output, i.e., arbitrary variables from arbitrary datasets. An extension of these diagnostics is used to benchmark a model simulation against other datasets (e.g. CMIP6). The benchmarking features are described in `Lauer et al.`_. | ||
|
|
||
| .. _`Lauer et al.`: https://doi.org/10.5194/egusphere-2024-1518 | ||
|
|
||
| Available recipes and diagnostics | ||
| --------------------------------- | ||
|
|
||
| Recipes are stored in `recipes/model_evaluation` | ||
|
|
||
| * recipe_model_benchmarking_annual_cycle.yml | ||
| * recipe_model_benchmarking_boxplots.yml | ||
| * recipe_model_benchmarking_diurnal_cycle.yml | ||
| * recipe_model_benchmarking_maps.yml | ||
| * recipe_model_benchmarking_timeseries.yml | ||
| * recipe_model_benchmarking_zonal.yml | ||
|
|
||
| Diagnostics are stored in `diag_scripts/monitor/` | ||
|
|
||
| * :ref:`multi_datasets.py | ||
| <api.esmvaltool.diag_scripts.monitor.multi_datasets>`: | ||
| Monitoring diagnostic to show multiple datasets in one plot (incl. biases). | ||
|
|
||
|
|
||
| Recipe settings | ||
| ~~~~~~~~~~~~~~~ | ||
|
|
||
| See :ref:`multi_datasets.py<api.esmvaltool.diag_scripts.monitor.multi_datasets>` for a list of all possible configuration options that can be specified in the recipe. | ||
|
|
||
| .. note:: | ||
|
||
| Exactly one dataset (the dataset to be benchmarked) must specify the facet ``benchmark_dataset: true`` in its dataset entry in the recipe. For line plots (i.e. annual cycle, diurnal cycle, time series), it is recommended to specify a particular line color and line style for this dataset in the ``scripts`` section of the recipe so that it is easy to identify in the plot. In the example below, MIROC6 is the dataset to be benchmarked and ERA5 is used as a reference dataset. | ||
|
|
||
| .. code-block:: yaml | ||
|
|
||
|    scripts: | ||
|      allplots: | ||
|        script: monitor/multi_datasets.py | ||
|        plot_folder: '{plot_dir}' | ||
|        plot_filename: '{plot_type}_{real_name}_{mip}' | ||
|        group_variables_by: variable_group | ||
|        facet_used_for_labels: alias | ||
|        plots: | ||
|          diurnal_cycle: | ||
|            annual_mean_kwargs: false | ||
|            legend_kwargs: | ||
|              loc: upper right | ||
|            plot_kwargs: | ||
|              'MIROC6': | ||
|                color: red | ||
|                label: '{alias}' | ||
|                linestyle: '-' | ||
|                linewidth: 2 | ||
|                zorder: 4 | ||
|              ERA5: | ||
|                color: black | ||
|                label: '{dataset}' | ||
|                linestyle: '-' | ||
|                linewidth: 2 | ||
|                zorder: 3 | ||
|              MultiModelPercentile10: | ||
|                color: gray | ||
|                label: '{dataset}' | ||
|                linestyle: '--' | ||
|                linewidth: 1 | ||
|                zorder: 2 | ||
|              MultiModelPercentile90: | ||
|                color: gray | ||
|                label: '{dataset}' | ||
|                linestyle: '--' | ||
|                linewidth: 1 | ||
|                zorder: 2 | ||
|              default: | ||
|                color: lightgray | ||
|                label: null | ||
|                linestyle: '-' | ||
|                linewidth: 1 | ||
|                zorder: 1 | ||
|
|
||
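
The requirement that exactly one dataset carries ``benchmark_dataset: true`` can be checked with a few lines of Python. This is only a minimal sketch and not part of ``multi_datasets.py``: the function name ``find_benchmark_dataset`` and the list of dataset entries are hypothetical and stand in for the dataset section of a parsed recipe.

```python
def find_benchmark_dataset(datasets):
    """Return the single entry flagged with benchmark_dataset: true.

    `datasets` is a list of dicts, one per dataset entry of the recipe.
    Raises ValueError if zero or more than one entry is flagged.
    """
    flagged = [d for d in datasets if d.get("benchmark_dataset") is True]
    if len(flagged) != 1:
        raise ValueError(
            f"expected exactly one benchmark dataset, found {len(flagged)}"
        )
    return flagged[0]


# Hypothetical dataset entries (illustration only).
datasets = [
    {"dataset": "MIROC6", "benchmark_dataset": True},
    {"dataset": "ERA5"},
    {"dataset": "MPI-ESM1-2-LR"},
]

benchmark = find_benchmark_dataset(datasets)
print(benchmark["dataset"])
```

Checking the flag before any plotting starts turns a silently ambiguous recipe into an immediate, readable error.
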
| Variables | ||
| --------- | ||
|
|
||
| Any variable can be used, but its number of dimensions must match the one expected by the chosen plot type. | ||
|
|
||
| References | ||
| ---------- | ||
|
|
||
| * Lauer, A., L. Bock, B. Hassler, P. Jöckel, L. Ruhe, and M. Schlund: Monitoring and benchmarking Earth | ||
| System Model simulations with ESMValTool v2.12.0, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2024-1518, 2024. | ||
|
|
||
| Example plots | ||
| ------------- | ||
|
|
||
| .. _fig_benchmarking_annual_cycle: | ||
| .. figure:: /recipes/figures/benchmarking/annual_cycle.png | ||
| :align: center | ||
| :width: 16cm | ||
|
|
||
| (Left) Multi-year global mean (2000-2004) of the seasonal cycle of near-surface temperature | ||
| in K from a simulation of MIROC6 and the reference dataset HadCRUT5 (black). The thin gray | ||
| lines show individual CMIP6 models used for comparison, the dashed gray lines show the 10% | ||
| and 90% percentiles of these CMIP6 models. (Right) same as (left) but for area-weighted RMSE | ||
| of near-surface temperature. The light blue shading shows the range of the 10% to 90% | ||
| percentiles of RMSE values from the ensemble of CMIP6 models used for comparison. Created | ||
| with recipe_model_benchmarking_annual_cycle.yml. | ||
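
The dashed 10% and 90% percentile lines are statistics taken across the comparison ensemble at each time step. The sketch below illustrates the idea in plain Python. The model names and values are made up, and the linear interpolation between order statistics is an assumption; the method actually used by the preprocessor may differ.

```python
def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


# Hypothetical ensemble: one list of monthly global means (K) per model.
ensemble = {
    "model_a": [286.1, 286.4, 287.0],
    "model_b": [285.8, 286.0, 286.9],
    "model_c": [286.5, 286.9, 287.6],
    "model_d": [286.0, 286.3, 287.2],
}

# Transpose so that each tuple holds the values of all models at one time step.
per_step = list(zip(*ensemble.values()))
p10 = [percentile(step, 10) for step in per_step]
p90 = [percentile(step, 90) for step in per_step]

for lo, hi in zip(p10, p90):
    print(f"p10={lo:.2f} K  p90={hi:.2f} K")
```
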
|
|
||
| .. _fig_benchmarking_boxplots: | ||
| .. figure:: /recipes/figures/benchmarking/boxplots.png | ||
| :align: center | ||
| :width: 16cm | ||
|
|
||
| (Left) Global area-weighted RMSE (smaller=better), (middle) weighted Pearson’s correlation | ||
| coefficient (higher=better) and (right) weighted Earth mover’s distance (smaller=better) of | ||
| the geographical pattern of 5-year means of different variables from a simulation of MIROC6 | ||
| (red cross) in comparison to the CMIP6 ensemble (boxplot). Reference datasets for calculating | ||
| the three metrics are: near-surface temperature (tas): HadCRUT5, surface temperature (ts): | ||
| HadISST, precipitation (pr): GPCP-SG, air pressure at sea level (psl): ERA5, shortwave (rsut) and | ||
| longwave (rlut) radiative fluxes at TOA as well as shortwave (swcre) and longwave (lwcre) cloud | ||
| radiative effects: CERES-EBAF. Each box indicates the range from the first quartile to the | ||
| third quartile, the vertical lines show the median, and the whiskers the minimum and maximum | ||
| values, excluding the outliers. Outliers are defined as being outside 1.5 times the | ||
| interquartile range. Created with recipe_model_benchmarking_boxplots.yml. | ||
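
Two quantities from this caption can be sketched in plain Python: an area-weighted RMSE and the 1.5 x IQR rule that separates whiskers from outliers. Both functions are illustrative assumptions, not the diagnostic's implementation: the weights are taken proportional to cos(latitude), which only approximates cell areas on a regular latitude-longitude grid, and the quartiles use the "inclusive" method of ``statistics.quantiles``.

```python
import math
import statistics


def area_weighted_rmse(model, reference, lats_deg):
    """RMSE of a lat-lon field, weighted by cos(latitude) of each row."""
    weighted_sq_sum = 0.0
    weight_sum = 0.0
    for row_m, row_r, lat in zip(model, reference, lats_deg):
        weight = math.cos(math.radians(lat))
        for m, r in zip(row_m, row_r):
            weighted_sq_sum += weight * (m - r) ** 2
            weight_sum += weight
    return math.sqrt(weighted_sq_sum / weight_sum)


def outlier_fences(values):
    """Lower and upper limits beyond which a value counts as an outlier."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical 3x2 fields (rows = latitudes) and ensemble RMSE values.
lats = [-45.0, 0.0, 45.0]
reference = [[280.0, 281.0], [300.0, 299.0], [279.0, 278.0]]
model = [[v + 2.0 for v in row] for row in reference]
print(area_weighted_rmse(model, reference, lats))

ensemble_rmse = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(outlier_fences(ensemble_rmse))
```

The whiskers then extend to the smallest and largest ensemble values that still lie inside these limits.
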
|
|
||
| .. _fig_benchmarking_diurn_cycle: | ||
| .. figure:: /recipes/figures/benchmarking/diurnal_cycle.png | ||
| :align: center | ||
| :width: 10cm | ||
|
|
||
| Area-weighted RMSE of the annual mean diurnal cycle (year 2000) of precipitation averaged over | ||
| the tropical ocean (ocean grid cells in the latitude belt 30°S to 30°N) from a simulation of | ||
| MIROC6 compared with ERA5 data (black). The light blue shading shows the range of the | ||
| 10% to 90% percentiles of RMSE values from the ensemble of CMIP6 models used for comparison. | ||
| Created with recipe_model_benchmarking_diurnal_cycle.yml. | ||
|
|
||
| .. _fig_benchmarking_map: | ||
| .. figure:: /recipes/figures/benchmarking/map.png | ||
| :align: center | ||
| :width: 10cm | ||
|
|
||
| 5-year annual mean (2000-2004) area-weighted RMSE of the precipitation rate in mm day-1 from a | ||
| simulation of MIROC6 compared with GPCP-SG data. The stippled areas mask grid cells where the | ||
| RMSE is smaller than the 90% percentile of RMSE values from an ensemble of CMIP6 models. | ||
| Created with recipe_model_benchmarking_maps.yml. | ||
|
|
||
| .. _fig_benchmarking_timeseries: | ||
| .. figure:: /recipes/figures/benchmarking/timeseries.png | ||
| :align: center | ||
| :width: 16cm | ||
|
|
||
| (Left) Time series from 2000 through 2014 of global average monthly mean temperature anomalies | ||
| (reference period 2000-2009) of the near-surface temperature in K from a simulation of MIROC6 | ||
| (red) and the reference dataset HadCRUT5 (black). The thin gray lines show individual CMIP6 | ||
| models used for comparison, the dashed gray lines show the 10% and 90% percentiles of these | ||
| CMIP6 models. (Right) same as (left) but for area-weighted RMSE of the near-surface air | ||
| temperature. The light blue shading shows the range of the 10% to 90% percentiles of RMSE | ||
| values from the ensemble of CMIP6 models used for comparison. Created with | ||
| recipe_model_benchmarking_timeseries.yml. | ||
|
|
||
| .. _fig_benchmarking_zonal: | ||
| .. figure:: /recipes/figures/benchmarking/zonal.png | ||
| :align: center | ||
| :width: 10cm | ||
|
|
||
| 5-year annual mean bias (2000-2004) of the zonally averaged temperature in K from a historical | ||
| simulation of MIROC6 compared with ERA5 reanalysis data. The stippled areas mask grid cells | ||
| where the absolute BIAS (:math:`|BIAS|`) is smaller than the maximum of the absolute 10% | ||
| (:math:`|p10|`) and the absolute 90% (:math:`|p90|`) percentiles from an ensemble of CMIP6 | ||
| models, so that the remaining (unstippled) grid cells are those with :math:`|BIAS| \geq \max( |p10|, |p90|)`. Created with | ||
| recipe_model_benchmarking_zonal.yml. | ||
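
The stippling criterion for the bias can be written as a small mask function. This is a sketch under stated assumptions rather than the code in ``multi_datasets.py``: the function name ``stipple_mask`` and the nested-list fields are hypothetical, and ``p10`` and ``p90`` stand for the ensemble percentiles of the bias at each grid cell.

```python
def stipple_mask(bias, p10, p90):
    """Return True where |bias| < max(|p10|, |p90|), i.e. where a cell is stippled."""
    return [
        [abs(b) < max(abs(lo), abs(hi)) for b, lo, hi in zip(row_b, row_lo, row_hi)]
        for row_b, row_lo, row_hi in zip(bias, p10, p90)
    ]


# Hypothetical 1x3 fields of bias and ensemble percentiles (K).
bias = [[0.5, -3.0, 2.0]]
p10 = [[-1.0, -1.0, -2.0]]
p90 = [[2.0, 2.0, 1.5]]

print(stipple_mask(bias, p10, p90))
```

Cells where the mask is False are the ones left unstippled, so they highlight where the benchmarked simulation deviates more than the bulk of the comparison ensemble.
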