Update environment for rainforests calibration #2136

rachaelesler · 2025-06-24T00:40:17Z

This patch provides changes to the apply_rainforests_calibration plugin in order to support the IMPROVER repo environment update (see Github issue #150).

Specifically, this patch sets lower bound versions for the treelite and lightgbm packages to support an up-to-date Python environment. Corresponding code changes for these version upgrades are included. Furthermore, I have added a few minor quality changes to the rainforests code - see below list.

This changeset includes:

Set lower bound for treelite and lightgbm package versions
Add tl2cgen package (needed for treelite v4.0.0 and up)
Changes to rainforests calibration code to support updated package versions
Update docs to reflect new package versions
Quality of life changes to rainforests calibration code:
- Ensure correct type annotations are used
- Use custom exceptions where feasible
- Use StrEnum for model name to constrain possible argument values
- Improved error handling
- Additional clarity in docstrings
Changes to rainforests calibration unit tests:
- Update unit tests to support the above changes
- Add additional rainforests calibration unit tests
- Move some code that supports unit tests, to avoid duplication

Relates to Github issue #150

Testing:

Ran tests and they passed OK (rainforests calibration unit tests only - acceptance tests TODO)
Added new tests for the new feature(s)

CLA

If a new developer, signed up to CLA

improver/calibration/rainforest_calibration.py

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibrationLightGBM.py

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibrationTreelite.py

gavinevans · 2025-06-24T10:14:49Z

CONTRIBUTING.md

 - Mark Worsfold (Met Office, UK)
 - Bruce Wright (Met Office, UK)
 - Ying Zhao (Bureau of Meteorology, Australia)
+ - Rachael Esler (Bureau of Meteorology, Australia)


Just a quick comment, this list should be alphabetical by surname.

Thanks for letting me know Gavin - have updated this in 6cf4c2e.

btrotta-bom

The updates to use the new API look fine, and the PR also makes some good code quality updates. I had a few minor suggestions about the wording in the docstrings, and also I'm not sure about the refactor of the parametrised test.

btrotta-bom · 2025-06-25T23:48:02Z

doc/source/Dependencies.rst


 Treelite
 ~~~~~~~~~~~~~~~~~~
+Lightweight decision tree forest model specification format, used for


It would be good to specify thatt it's a binary format (i.e. not a text file that you can read as a human like lightgbm). Also I suggest removing "forest" - this implies that lightgbm models are random forests, but they are actually gradient-boosted decision trees, which is a different thing (lightgbm can also do random forests, but this feature is seldom used because GBDTs are generally better).

Suggested change: "Lightweight binary format for specifying decision tree models"

Great clarification, I have updated to use your suggestion in the latest changes.

btrotta-bom · 2025-06-25T23:54:20Z

doc/source/Dependencies.rst

+TL2cgen (TreeLite 2 C GENerator)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Model compiler for decision tree ensembles, used for more efficient computation
 of GBDT ensembles required for RainForests calibration.


Suggest changing both instances of "ensembles" to "models"; in ML, "ensemble" would usually mean a collection of multiple models (each of which could be a GBDT consisting of many trees).

Good idea, have updated this now

btrotta-bom · 2025-06-26T00:40:26Z

improver/calibration/rainforest_calibration.py


        Args:
-            key_name: 'treelite_model' or 'lightgbm_model' are the expected names.
+            key_name: One of "treelite_model" or "lightgbm_model".


Here the key_name argument is a string, but in the function type hint it's a ModelKeyName object. Is this intentional?

I've updated the key_name argument to use the typing.Literal Model so hopefully it makes a bit more sense now.

btrotta-bom · 2025-06-26T00:45:17Z

improver/calibration/rainforest_calibration.py

+    pass
+
+
+class ModelKeyName(StrEnum):


This is great, I did not know about the StrEnum class

A lightweight alternative is to use typing.Literal:

from typing import Literal Model = Literal["lightgbm_model", "treelite_model"]

I've updated the code to use typing.Literal now.

btrotta-bom · 2025-06-26T01:22:06Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+        monkeypatch.setattr(lightgbm, "Booster", MockBooster)
+        monkeypatch.setitem(sys.modules, "tl2cgen", None)
+        result = ApplyRainForestsCalibration(model_config)
+        assert type(result).__name__ == "ApplyRainForestsCalibrationLightGBM"


Is there a reason not to check the type directly (rather than the name)? E.g.

assert type(result) == ApplyRainForestsCalibrationLightGBM

(and similarly for tests below)

Good call, I've changed it so the __name__ property is not checked now. I had initially kept it as it was to keep the tests as unchanged from the original as possible, but checking the type rather than the name of the type is the way to go.

btrotta-bom · 2025-06-26T02:58:41Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+        assert type(result).__name__ == "ApplyRainForestsCalibrationLightGBM"

+    def test_correct_class_when_treelite_available(self, monkeypatch, model_config):
+        """ "Test that the ApplyRainForestsCalibration constructor creates an


Good pickup, have removed

btrotta-bom · 2025-06-26T03:10:32Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

-        # missing across any thresholds
-        model_config["24"]["0.0000"].pop("treelite_model", None)
-    if not lightgbm_keys:
+    def test_treelite_model_and_module_available(self, monkeypatch, model_config):


I can see why you've refactored this, since the current setup with the multiple parametrizations is a bit hard to follow. However, I think the new tests miss some cases that were covered by the old approach. The parametrization gives 2 x 2 x 2 = 8 cases, whereas the new tests test only 4 cases. For example, the new tests do not test the case where the treelite module is available but the model config does not specify treelite files. I think the easiest solution here may be to restore the old test with the parametrizations, but maybe rewrite it slightly (e.g. change variable names and/or update docstrings) to make it clearer what each parametrization does.

Also, do these tests work if treelite is not installed in the environment? If lightgbm is not available, I think the importerskip statement will just cause it to skip all tests. But if only treelite is missing, this is currently handled separately by the TREELITE_ENABLED variable. I'm not sure that case is handled by your proposed changes. I actually don't think we need to allow for the case where the env has lightgbm but not treelite, so I suggest we just change it to also just skip everything if treelite is not installed (with importerskip), and do away with the TREELITE_ENABLED variable (changing it to True in the parametrization).

Yes you're absolutely right. I have done as you suggested and restored the old test with parametrizations, updating the variable names and docstring. I've added a table with a summary of the expected output for each parameter but let me know if you think that's overkill. With the restored tests, they are all passing.

I'll test what happens when I remove the TREELITE_ENABLED parameter and see how it goes.

Have removed the TREELITE_ENABLED option now and the tests will now just skipped if treelite+tl2cgen are not installed as suggested. Great suggestion 😄

btrotta-bom · 2025-06-26T03:34:47Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+
+        Tests that treelite models are loaded correctly when:
+
+        - all thresholds contain treelite model


I think this line is incorrect - it doesn't match the function name or the code below where the treelite model config is deleted.

Good point, have updated the code to reflect the docstring

codecov · 2025-07-03T00:08:38Z

Codecov Report

Attention: Patch coverage is 90.00000% with 5 lines in your changes missing coverage. Please review.

Please upload report for BASE (mobt_800_env_upgrade_feature_branch@2c9f313). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
improver/calibration/rainforest_calibration.py	90.00%	5 Missing ⚠️

Additional details and impacted files

@@                          Coverage Diff                           @@
##             mobt_800_env_upgrade_feature_branch    #2136   +/-   ##
======================================================================
  Coverage                                       ?   98.42%           
======================================================================
  Files                                          ?      139           
  Lines                                          ?    13651           
  Branches                                       ?        0           
======================================================================
  Hits                                           ?    13436           
  Misses                                         ?      215           
  Partials                                       ?        0

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

This patch provides changes to the apply_rainforests_calibration plugin in order to support the IMPROVER repo environment update (see Github issue metoppv#150). Specifically, this patch sets lower bound versions for the treelite and lightgbm packages to support an up-to-date Python environment. Corresponding code changes for these version upgrades are included. Furthermore, I have added a few minor quality changes to the rainforests code - see below list. This changeset includes: * Set lower bound for treelite and lightgbm package versions * Add tl2cgen package (needed for treelite v4.0.0 and up) * Changes to rainforests calibration code to support updated package versions * Update docs to reflect new package versions * Quality of life changes to rainforests calibration code: * Ensure correct type annotations are used * Use custom exceptions where feasible * Use StrEnum for model name to constrain possible argument values * Improved error handling * Additional clarity in docstrings * Changes to rainforests calibration unit tests: * Update unit tests to support the above changes * Add additional rainforests calibration unit tests * Move some code that supports unit tests, to avoid duplication Relates to Github issue metoppv#150

btrotta-bom

I'm happy with the approach to work around the segfault issue. A couple of remaining issues as noted.

btrotta-bom · 2025-07-02T07:59:35Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+@pytest.mark.parametrize(
+    "lightgbm_keys",
+    (True, False),
+    ids=["lightgbm_keys_present", "lightgbm_keys_absent"],


Great idea to add the ids here, this will help a lot when interpreting test output.

btrotta-bom · 2025-07-02T08:02:44Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+    | Available        | Yes           | No            | ApplyRainForestsCalibrationTreelite |
+    | Unavailable      | Yes           | No            | ModelFileNotFoundError              |
+    | Available        | No            | No            | ModelFileNotFoundError              |
+    | Unavailable      | No            | No            | ModelFileNotFoundError              |


I like the table, makes things really clear.

btrotta-bom · 2025-07-02T08:13:06Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+
+    If all thresholds contain treelite model AND the treelite module is
+    available, tl2cgen Predictor is returned, otherwise return lightgbm
+    Boosters. Checks outputs are ordered when inputs can be unordered.


Checks outputs are ordered when inputs can be unordered.

This doesn't seem to relate to anything in the code

Good point, have removed now.

btrotta-bom · 2025-07-02T08:18:42Z

improver_tests/calibration/rainforests_calibration/utils.py

+
+# Default number of threads set by lightgbm
+# See https://lightgbm.readthedocs.io/en/stable/Parameters-Tuning.html#add-more-computational-resources
+DEFAULT_NUM_THREADS = 1


The docs linked say 1 thread per CPU core, so in general the total number of threads would be greater than 1. You can get the number of available cores in python as described here https://stackoverflow.com/a/1006337

Have removed this now per your other suggestion.

btrotta-bom · 2025-07-02T08:30:41Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibrationLightGBM.py


 @pytest.mark.parametrize("ordered_inputs", (True, False))
 @pytest.mark.parametrize("default_threads", (True, False))
 def test__init__(


CI tests (environment b) are failing here

Sorry this was from a previous review, please disregard.

btrotta-bom · 2025-07-03T04:00:44Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibration.py

+    selects which class to use based on availability of modules.
+    """
+
+    def test_correct_class_when_lightgbm_available(self, monkeypatch, model_config):


I think this test and the following one are both covered by test__new__, so they can be removed.

Good pickup, have removed these duplicate tests now.

btrotta-bom · 2025-07-03T05:10:01Z

improver_tests/calibration/rainforests_calibration/test_ApplyRainForestsCalibrationLightGBM.py

            model = result.tree_models[lead_time, threshold]
            assert model.model_class == "lightgbm-Booster"
-            assert model.threads == expected_threads
+            assert model.threads == DEFAULT_NUM_THREADS


I don't think this line is actually testing anything - it works because the MockBooster is set up to have DEFAULT_NUM_THREADS, but (as noted in comment below), for a real instance of Booster the number of threads would be equal to the number of cpu cores. I don't think there's an easy way to test this - I tried to get it working but it seems that the threads property doesn't exist by default (maybe it doesn't get set if it's not specified when creating the object). So I suggest just removing this line (and then you also don't need to define DEFAULT_NUM_THREADS in utils.py)

Great idea, have removed this and the DEFAULT_NUM_THREADS variable now. In future we could use a "magic mock" or something similar but there is very little value in something like that right now.

btrotta-bom

This all looks good now. Thanks for adding the treelite acceptance tests also.

This patch updates the data for the apply-rainforests-calibration CLI acceptance tests. This is part of the IMPROVER environment upgrade. This changeset includes: Update rainforests tests KGO Add Treelite model files to facilitate testing with Treelite models Update rainforests model config to include Treelite models Rename LightGBM model files See related PR in the improver repo: metoppv/improver#2136 Relates to mobt-800

* First cut changes in preparation for environment upgrade (#2124) * Avoid cubelists of cubelists. * Replace np.product with np.prod. * Replace np.int. * Replace np.NAN with np.nan. * Replace np.NAN with np.nan. * Replace np.NaN with np.nan. * Replace np.product with np.prod. * Replace assertRaisesRegexp with assertRaisesRegex and use collections.abc.Callable instead of collections.Callable. * Simplify test in test_flatten.py to avoid cubelist within a cubelist. * Replace Cube(None) with an alternative. * Unpin environments for testing. * Remove pygam version. * Update improver_a.yml * Remove non-essential dependencies from yml files and add pins. * Minor edits following review comments. * Modify docstring to better reflect the inputs provided. * Updates to cube_combiner for new environment (#2128) * Updates to cube_combiner for new environment * Update checksums * Mobt913: Environment upgrade - Metadata (#2137) * Fix failing tests following environment change. * Handle bug where num2date is returning cftime datetime instead of python datetime where it shouldn't * Mobt915: environment upgrade - Orographic enhancement (#2138) * Cast variable to float32 to resolve environment upgrade error * Fix pre-commit reformatting * Environment upgrade: expected value (#2139) * Modify expected scalar values in unit tests to have specific float32 type to pass more rigorous numpy assert_allclose test. * Update checksums * Mobt 914 env update nowcasting (#2142) * Ensures precision of output data are float32 * Reduces precision requirement of unit tests from 7 decimal places to 6. * Fixes unit tests that were passing the wrong arguments to the methods they were testing * Developer_tools: Updates metadata interpreter to print dict-like strings from CubeAttrsDict objects (#2134) * Updates metadata interpreter to print dict-like strings from CubeAttrsDict objects * Improves imports * Simplifies conversion of cube_attrs to dict * Environment upgrade: spot data (#2141) * Change expected results in neighbour selection tied cases due to slight variation in return from coordinate transform. * Change from using numpy float type to native float type when converting type of user provided percentiles. * Fix up acceptance test which collects warnings. Update checksums. * Mobt 916 environment upgrade regrid (#2144) * Resolve failings tests following environment upgrade * Fix pre_commit requirements * Remove print statment and remove lots of unnecessarily added trailing .0s. --------- Co-authored-by: benjamin.ayliffe <benjamin.ayliffe@metoffice.gov.uk> * Upgrade utilities unit tests (#2140) * Remove test for non-Cube inputs to a CubeList, as this is no longer possible. * Unpack list for use when slicing in cube_extraction. * Edits to test_temporal.py. * Correction to datatypes within temporal_interpolation.py. * Corrections to gradient_between_vertical_levels * GAM corrections copied from #2126 for completeness. * Edits, so that load unit tests pass, although we're no longer testing the lazy loading successfully, so this may need reconsideration. * Alter ordering of bounds in mathematical_operations.py due to underlying change in iris. * Retain intended indexing behaviour within neighbourhood_tools.py by converting list to tuple. * Changes to allow more tolerance for solar interpolation tests, where data type differences can impact precision. * Update load unit tests to override iris setting that prevents lazy loading for small files. * Revert changes to solar.py, which are not required, following allowing a greater tolerance in the test_solar_interpolation unit test. * Minor edit to test_load.py * Minor edit to return a python datetime from the iris_time_to_datetime function, rather than a cftime datetime object. * Simplification to use to_real_datetime method available in iris. * Mobt906 Calibration unit tests upgrades for new environment (#2127) * Fix tests in dz_rescaling. Added re.escape() call to regex pattern match. * Update IMPROVER choose() function as numpy.lib.index_tricks.ndindex() moved to numpy.ndindex(). * Fix faulty plugin call in improver_tests/calibration/test_init.py. * Fix tests in reliability_calibration. Added re.escape() call to regex pattern match. * Fix tests in calibration/utilities. Added re.escape() call to regex pattern match. * Modify rainforests_calibration/conftest.py so that both treelite and treelite_runtime are checked for before attempting to run relevant tests, rather than only treelite. * Change expected results for failing ensemble_calibration/test_EstimateCoefficientsForEnsembleCalibration.py tests. * Add treelite_runtime module dependency to environments containing treelite as a dependency. * Revert changes related to rainforest calibration unit tests and treelite_runtime. More work is required to properly handle upgrading from treelite 3.x to 4.y. * Change expected results for 2 tests in test_CalibratedForecastDistributionParameters.py. * Revert changes to tests which were failing due to Regex pattern matching issues with numpy types being displayed. Instead, map these objects to base Python types, e.g str instead of np.str. * Environment upgrade: generate ancillaries (#2143) * Make constant float64 type where it is used to preserve original results. * Changes to orographic smoothing coefficient generation to avoid type escalation to float64. * Formatting update. * enforce data type for cloud condensation level plugins (#2146) * Acceptance test batch 2 (#2148) * Enforce float32. * Enforce input to cube.collapsed is the same type as the output. * Get dtype without using .data. * Modification to enforce the datatype within the mode_aggregator method and add an assertion to the unit tests to show that the output dtype matches the input dtype. * Acceptance test batch 5 (#2149) * Minor edit to textural.py * Move setting of data type on the threshold values into the threshold plugin. * Enforce data type (#2147) * Update environment for rainforests calibration (#2136) This patch provides changes to the apply_rainforests_calibration plugin in order to support the IMPROVER repo environment update (see Github issue #150). Specifically, this patch sets lower bound versions for the treelite and lightgbm packages to support an up-to-date Python environment. Corresponding code changes for these version upgrades are included. Furthermore, I have added a few minor quality changes to the rainforests code - see below list. This changeset includes: * Set lower bound for treelite and lightgbm package versions * Add tl2cgen package (needed for treelite v4.0.0 and up) * Changes to rainforests calibration code to support updated package versions * Update docs to reflect new package versions * Quality of life changes to rainforests calibration code: * Ensure correct type annotations are used * Use custom exceptions where feasible * Use StrEnum for model name to constrain possible argument values * Improved error handling * Additional clarity in docstrings * Changes to rainforests calibration unit tests: * Update unit tests to support the above changes * Add additional rainforests calibration unit tests * Move some code that supports unit tests, to avoid duplication * Update rainforests calibration acceptance tests and checksums Relates to Github issue #150 * Resolve calendar issues in kgos, checksums, and a depracated warning (#2151) * Acceptance test batch 11 (#2150) * Resolve calendar and float dtype issues following environment upgrade * Repeat for weighted_blending * Mobt 943 acceptance tests batch 9 (#2152) * fixes mobt_943_acceptance_tests_batch_9 following environment changes - Update calendars on KGOs - Fix returning of float32 instead of in8 for categorical inputs - Remove surprising use of iris.cube.CubeList() in acceptance test, causing test failures * Pre-commit formatting fixes * Simplify typecasting from previous commit * Remove change of checksums so left until end of all acceptance testing * Update retrieval of package version. (#2161) Co-authored-by: Gavin Evans <gavin.evans@metoffice.gov.uk> * MOBT-290 Environment upgrade: modal categories (#2157) * Accommodate type overflow in modal aggregator whilst resetting the data type at the end of the process. * Style fix * Move dtype setting out of numpy.clip call as this causes a casting error. (#2158) * Enforce float32 type for accumulated precipitation in nowcast accumulate code. (#2159) * Remove timezone delocalisation from estimate-emos-from-table CLI to allow correct filtering of pyarrow parquet files. (#2156) * Update checksums for acceptance test data following all of the environment upgrade work. (#2160) * Cloud condensation level updates for the new environment (#2168) * update checksums for cloud condensation level * Limits cloud base pressure and temperature to surface values if super-saturated --------- Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> --------- Co-authored-by: gavinevans <gavin.evans@metoffice.gov.uk> Co-authored-by: mspelman07 <99179165+mspelman07@users.noreply.github.com> Co-authored-by: Max <max.white@metoffice.gov.uk> Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> Co-authored-by: Ben Hooper <114418734+brhooper@users.noreply.github.com> Co-authored-by: Rachael Esler <44791931+rachaelesler@users.noreply.github.com>

* Fill value (#2129) * update save function to include fill value * update docstring * update docstring * Update tests for fill value * 2154 BUG: Convective cloud base and top can be unphysical (#2155) * Limits cloud base pressure and temperature to surface values if super-saturated * Updates checksums for revised KGO * Fix recalibration docstring (#2166) * Use np.interp instead of scipy interp1d * Revert "Use np.interp instead of scipy interp1d" This reverts commit 3ea6cd6. * fix docstring --------- Co-authored-by: Belinda Trotta <btrotta-bom@users.noreply.github.com> * Environment upgrade feature branch (#2167) * First cut changes in preparation for environment upgrade (#2124) * Avoid cubelists of cubelists. * Replace np.product with np.prod. * Replace np.int. * Replace np.NAN with np.nan. * Replace np.NAN with np.nan. * Replace np.NaN with np.nan. * Replace np.product with np.prod. * Replace assertRaisesRegexp with assertRaisesRegex and use collections.abc.Callable instead of collections.Callable. * Simplify test in test_flatten.py to avoid cubelist within a cubelist. * Replace Cube(None) with an alternative. * Unpin environments for testing. * Remove pygam version. * Update improver_a.yml * Remove non-essential dependencies from yml files and add pins. * Minor edits following review comments. * Modify docstring to better reflect the inputs provided. * Updates to cube_combiner for new environment (#2128) * Updates to cube_combiner for new environment * Update checksums * Mobt913: Environment upgrade - Metadata (#2137) * Fix failing tests following environment change. * Handle bug where num2date is returning cftime datetime instead of python datetime where it shouldn't * Mobt915: environment upgrade - Orographic enhancement (#2138) * Cast variable to float32 to resolve environment upgrade error * Fix pre-commit reformatting * Environment upgrade: expected value (#2139) * Modify expected scalar values in unit tests to have specific float32 type to pass more rigorous numpy assert_allclose test. * Update checksums * Mobt 914 env update nowcasting (#2142) * Ensures precision of output data are float32 * Reduces precision requirement of unit tests from 7 decimal places to 6. * Fixes unit tests that were passing the wrong arguments to the methods they were testing * Developer_tools: Updates metadata interpreter to print dict-like strings from CubeAttrsDict objects (#2134) * Updates metadata interpreter to print dict-like strings from CubeAttrsDict objects * Improves imports * Simplifies conversion of cube_attrs to dict * Environment upgrade: spot data (#2141) * Change expected results in neighbour selection tied cases due to slight variation in return from coordinate transform. * Change from using numpy float type to native float type when converting type of user provided percentiles. * Fix up acceptance test which collects warnings. Update checksums. * Mobt 916 environment upgrade regrid (#2144) * Resolve failings tests following environment upgrade * Fix pre_commit requirements * Remove print statment and remove lots of unnecessarily added trailing .0s. --------- Co-authored-by: benjamin.ayliffe <benjamin.ayliffe@metoffice.gov.uk> * Upgrade utilities unit tests (#2140) * Remove test for non-Cube inputs to a CubeList, as this is no longer possible. * Unpack list for use when slicing in cube_extraction. * Edits to test_temporal.py. * Correction to datatypes within temporal_interpolation.py. * Corrections to gradient_between_vertical_levels * GAM corrections copied from #2126 for completeness. * Edits, so that load unit tests pass, although we're no longer testing the lazy loading successfully, so this may need reconsideration. * Alter ordering of bounds in mathematical_operations.py due to underlying change in iris. * Retain intended indexing behaviour within neighbourhood_tools.py by converting list to tuple. * Changes to allow more tolerance for solar interpolation tests, where data type differences can impact precision. * Update load unit tests to override iris setting that prevents lazy loading for small files. * Revert changes to solar.py, which are not required, following allowing a greater tolerance in the test_solar_interpolation unit test. * Minor edit to test_load.py * Minor edit to return a python datetime from the iris_time_to_datetime function, rather than a cftime datetime object. * Simplification to use to_real_datetime method available in iris. * Mobt906 Calibration unit tests upgrades for new environment (#2127) * Fix tests in dz_rescaling. Added re.escape() call to regex pattern match. * Update IMPROVER choose() function as numpy.lib.index_tricks.ndindex() moved to numpy.ndindex(). * Fix faulty plugin call in improver_tests/calibration/test_init.py. * Fix tests in reliability_calibration. Added re.escape() call to regex pattern match. * Fix tests in calibration/utilities. Added re.escape() call to regex pattern match. * Modify rainforests_calibration/conftest.py so that both treelite and treelite_runtime are checked for before attempting to run relevant tests, rather than only treelite. * Change expected results for failing ensemble_calibration/test_EstimateCoefficientsForEnsembleCalibration.py tests. * Add treelite_runtime module dependency to environments containing treelite as a dependency. * Revert changes related to rainforest calibration unit tests and treelite_runtime. More work is required to properly handle upgrading from treelite 3.x to 4.y. * Change expected results for 2 tests in test_CalibratedForecastDistributionParameters.py. * Revert changes to tests which were failing due to Regex pattern matching issues with numpy types being displayed. Instead, map these objects to base Python types, e.g str instead of np.str. * Environment upgrade: generate ancillaries (#2143) * Make constant float64 type where it is used to preserve original results. * Changes to orographic smoothing coefficient generation to avoid type escalation to float64. * Formatting update. * enforce data type for cloud condensation level plugins (#2146) * Acceptance test batch 2 (#2148) * Enforce float32. * Enforce input to cube.collapsed is the same type as the output. * Get dtype without using .data. * Modification to enforce the datatype within the mode_aggregator method and add an assertion to the unit tests to show that the output dtype matches the input dtype. * Acceptance test batch 5 (#2149) * Minor edit to textural.py * Move setting of data type on the threshold values into the threshold plugin. * Enforce data type (#2147) * Update environment for rainforests calibration (#2136) This patch provides changes to the apply_rainforests_calibration plugin in order to support the IMPROVER repo environment update (see Github issue #150). Specifically, this patch sets lower bound versions for the treelite and lightgbm packages to support an up-to-date Python environment. Corresponding code changes for these version upgrades are included. Furthermore, I have added a few minor quality changes to the rainforests code - see below list. This changeset includes: * Set lower bound for treelite and lightgbm package versions * Add tl2cgen package (needed for treelite v4.0.0 and up) * Changes to rainforests calibration code to support updated package versions * Update docs to reflect new package versions * Quality of life changes to rainforests calibration code: * Ensure correct type annotations are used * Use custom exceptions where feasible * Use StrEnum for model name to constrain possible argument values * Improved error handling * Additional clarity in docstrings * Changes to rainforests calibration unit tests: * Update unit tests to support the above changes * Add additional rainforests calibration unit tests * Move some code that supports unit tests, to avoid duplication * Update rainforests calibration acceptance tests and checksums Relates to Github issue #150 * Resolve calendar issues in kgos, checksums, and a depracated warning (#2151) * Acceptance test batch 11 (#2150) * Resolve calendar and float dtype issues following environment upgrade * Repeat for weighted_blending * Mobt 943 acceptance tests batch 9 (#2152) * fixes mobt_943_acceptance_tests_batch_9 following environment changes - Update calendars on KGOs - Fix returning of float32 instead of in8 for categorical inputs - Remove surprising use of iris.cube.CubeList() in acceptance test, causing test failures * Pre-commit formatting fixes * Simplify typecasting from previous commit * Remove change of checksums so left until end of all acceptance testing * Update retrieval of package version. (#2161) Co-authored-by: Gavin Evans <gavin.evans@metoffice.gov.uk> * MOBT-290 Environment upgrade: modal categories (#2157) * Accommodate type overflow in modal aggregator whilst resetting the data type at the end of the process. * Style fix * Move dtype setting out of numpy.clip call as this causes a casting error. (#2158) * Enforce float32 type for accumulated precipitation in nowcast accumulate code. (#2159) * Remove timezone delocalisation from estimate-emos-from-table CLI to allow correct filtering of pyarrow parquet files. (#2156) * Update checksums for acceptance test data following all of the environment upgrade work. (#2160) * Cloud condensation level updates for the new environment (#2168) * update checksums for cloud condensation level * Limits cloud base pressure and temperature to surface values if super-saturated --------- Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> --------- Co-authored-by: gavinevans <gavin.evans@metoffice.gov.uk> Co-authored-by: mspelman07 <99179165+mspelman07@users.noreply.github.com> Co-authored-by: Max <max.white@metoffice.gov.uk> Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> Co-authored-by: Ben Hooper <114418734+brhooper@users.noreply.github.com> Co-authored-by: Rachael Esler <44791931+rachaelesler@users.noreply.github.com> * Environment upgrade tidy-up (#2171) * Remove references to python 3.6 and 3.7. Upgrade pre-commit CI environment to use python 3.12. Update readme environment building advice, though this may be at odds with the currently available version of improver in the conda library. Update the python version badge shown in the readme. * Test removal of non-commit-contributing author. * Fix readthedocs (#2172) * Fix to the readthedocs environment. * Use pip for cf_units as no versions beyong 2.0.1 available from conda-forge. * Use latest FRT option for cube combiner (#2174) * Add use_latest_frt option to the cube combiner. * Doc-string changes requested. * Added ruff flake8-bandit to pre-commit (#2145) * Removed bespoking of the wind gust metadata, e.g. typical gusts and extreme gusts (#2176) * Removed bespoking of the metadata, e.g. typical gusts and extreme gusts * Update to checksums following changes * Remove Max's erroneously added site-init file from this PR. --------- Co-authored-by: benjamin.ayliffe <benjamin.ayliffe@metoffice.gov.uk> * rebadge_forecasts_as_latest_cycle modified for use in the mix suite (#2177) * Modify rebadge_forecasts_as_latest_cycle function to look for and update a blend_time if it is present on the cubes being rebadged. This ensures that the blend time and forecast reference time cannot get out of step. * Add raises section to doc-string. * Bump actions/checkout from 4 to 5 (#2180) Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v4...v5) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add in ability to return midpoint of bounds (#2173) * Add in ability to return midpoint of bounds * small updates to comments * Fix empty array check (#2179) * HailFraction handles masked CTT data as "no convection" (#2181) * Extends HailFraction unit test to show that masked CTT data is not treated as "no convection" * Fixes code to explicitly handle masked or invalid convective cloud top temperature values as non-convective. * Updates unit test to show that the test succeeds if the convective cloud top data are a masked array or not. * Renames cct variable for better consistency * Mobt775: Add SAMOS training plugins and unit tests (#2126) * Create statistical.py. Add functionality to fit GAMs using pyGAM. * Create GAMPredict class. Rename FitGAM to GAMFit. Add doc-strings and type-hinting for both classes. Add unit tests for both classes. * Fix doc-string typos. Run black to check code formatting. * Move pygam imports in to Classes/tests to reduce depenendency on this package. * Rename ensemble_calibration to emos_calibration and update all references. * Create samos_calibration.py and create TrainGAMsForSAMOS class within it. * Add tests for TrainGAMsForSAMOS plugin. Modify TrainGAMsForSAMOS plugin and add calculate_cube_statistics method. * Extended calculate_cube_statistics method to handle rolling window calculation over time coordinate. Add tests for this method. * Create functions for converting between cube and dataframe representations for SAMOS. Move generic helper functions for SAMOS unit tests into their own file. Create TrainEMOSForSAMOS class. Create additional unit tests for TrainGAMsForSAMOS. Add scipy monkey patch to pygam imports to work around a known bug. * Improve samos_calibration tests helper functions. Refactor TrainEMOSForSAMOS to use 2 methods which include all functionality. Add unit tests for TrainEMOSForSAMOS. Minor update to CalculateClimateAnomalies. * Modify CalculateClimateAnomalies plugin to correctly calculate the reference_epoch coordinate bounds when the inputs have multiple time points. * Make tests for TrainGAMsForSAMOS and TrainEMOSForSAMOS simpler to understand. Improve test of TrainGAMsForSAMOS by predicting from the fitted GAM and comparing the predictions to expected output. * Improve doc-strings and type hints. Move test helper functions to relevant file where there are only used in one file. Improve argument names for some test helper functions. Run pre-commit to fix formatting errors. * Correct filepath for EMOS in documentation. * Formatting changes. * Improvements to doc-strings and other changes following first review. * Changes following review. Largest change is addition of calculate_statistic_by_rolling_window method to TrainGAMsForSAMOS class. * Changes following review. Make calculate_statistic_by_rolling_window more robust to missing time points and allow it to handle period diagnostics. Add new tests. * Start using collapse_realizations methods in improver.utilities.cube_manipulation.py instead of iris method. Remove use of cell methods in TrainGAMsForSAMOS as they are unnecessary. * Minor changes following review. * Mark tests requiring pygam as skippable if pygam is not available in the environment. * Add type hints for apply_aggregator method of TrainGAMsForSAMOS class. * Fix gh action Sphinx-Pytest-Coverage: updated intersphinx inventory URLs (#2186) * new url * Update conf.py * Update doc/source/conf.py Co-authored-by: bayliffe <benjamin.ayliffe@metoffice.gov.uk> * Update conf.py --------- Co-authored-by: bayliffe <benjamin.ayliffe@metoffice.gov.uk> * Bump actions/stale from 9 to 10 (#2184) Bumps [actions/stale](https://github.com/actions/stale) from 9 to 10. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](actions/stale@v9...v10) --- updated-dependencies: - dependency-name: actions/stale dependency-version: '10' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump actions/setup-python from 5 to 6 (#2185) Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6. - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](actions/setup-python@v5...v6) --- updated-dependencies: - dependency-name: actions/setup-python dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * EPPT-2305 contrails engine mixing ratio (#2125) * Adding initial class setup for CondensationTrailFormation, with mixing ratio calculation method. * Update to CondensationTrailFormation and adding tests * Changes to my temporary return to keep tests passing * Adding missing test check for mixing ratio values, and adding actual engine factors * Allowing passing of bespoke engine_contrail_factors * Docstring change to stop Sphinx CI error * Documentation changes from review * Removing height cube requirement and adjusting test cube creation to fully utilise set_up_variable_cube * Changes from second review * EPPT 2534 Refactors CondensationTrailFormation so that the same resul… (#2130) * EPPT 2534 Refactors CondensationTrailFormation so that the same results can be achieved with Numpy arrays and Iris Cubes. * Ruff * Removes reference to Cubes in plugin doc-string * Renames new method at reviewer suggestion * Adding first derivative of saturation vapour pressure table (#2132) * First working version of the new SVP derivative code * Removing print statements and final changes following testing * Adding unit tests for the new SaturatedVapourPressureTableDerivative class * Adding extended documentation * Correcting path to the extended documentation * Adding Derivative to the call so the correct function is used * removing blank line in the main doc string * Refactors the two svp generator plugins to reduce duplication * Updating cube name * Adding three more unit tests to check that an error message is returned if the input temperatures to the saturated vapour pressure calculation are out of bounds. * Ruff changes * Changes following Marcus's code review * Changing the svp to svp_derivative to make it more meaningful. Also, correcting the values in one of the unit tests now that derivative values above the triple point have changes very slightly following the removal of a pair of brackets in the n2 calculation. * forcing a pre-commit hook * Pre-commit changes --------- Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> * EPPT-2389: Calculation of localised svp (#2164) * Adding almost working method with confusing difference between test calculated versions * Forcing return shape of local vapour pressure * Fixing weird precision difference due to np.float32 and np.array(,dtype=np.float32) * Fixing half of problem with pressure levels test * Fixing axis problem for pressure levels by reshaping array when calculating local vapour pressure * Generalising pressure levels shape in local_vapour_pressure calculation to allow for any shape of temperature etc data to be parsed * Revert "EPPT-2389: Calculation of localised svp (#2164)" (#2169) This reverts commit 873c009. * EPPT-2510: Adding calculation of saturation vapour pressure derivative in air (#2165) * Code changes to add the SVP derivative corrected for in air * Adding a missing temperature term to the correction calculation * Updating expected values in unit tests and changing one of the imports so it correctly finds the function * Updating one of the ecpected values within the unit tests * EPPT-2389: Calculation of localised SVP (version 2) (#2170) * Adding almost working method with confusing difference between test calculated versions * Forcing return shape of local vapour pressure * Fixing weird precision difference due to np.float32 and np.array(,dtype=np.float32) * Fixing half of problem with pressure levels test * Fixing axis problem for pressure levels by reshaping array when calculating local vapour pressure * Generalising pressure levels shape in local_vapour_pressure calculation to allow for any shape of temperature etc data to be parsed * Fixing test that is broken by environment update --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: mspelman07 <99179165+mspelman07@users.noreply.github.com> Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> Co-authored-by: Belinda Trotta <belinda.trotta@bom.gov.au> Co-authored-by: Belinda Trotta <btrotta-bom@users.noreply.github.com> Co-authored-by: bayliffe <benjamin.ayliffe@metoffice.gov.uk> Co-authored-by: gavinevans <gavin.evans@metoffice.gov.uk> Co-authored-by: Max <max.white@metoffice.gov.uk> Co-authored-by: Ben Hooper <114418734+brhooper@users.noreply.github.com> Co-authored-by: Rachael Esler <44791931+rachaelesler@users.noreply.github.com> Co-authored-by: Carwyn Pelley <carwyn.pelley@metoffice.gov.uk> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: James Canvin <41655739+nivnac@users.noreply.github.com> Co-authored-by: Robert Neal <robert.neal@metoffice.gov.uk>

rachaelesler marked this pull request as draft June 24, 2025 00:40

rachaelesler requested a review from JenniferWhelan June 24, 2025 00:41

rachaelesler commented Jun 24, 2025

View reviewed changes

rachaelesler added don't merge yet BoM review required PRs opened by non-BoM developers that require a BoM review labels Jun 24, 2025

gavinevans reviewed Jun 24, 2025

View reviewed changes

rachaelesler marked this pull request as ready for review June 24, 2025 23:05

rachaelesler added Type:TechnicalDebt test and removed don't merge yet labels Jun 24, 2025

metoppv deleted a comment from rachaelesler Jun 26, 2025

rachaelesler marked this pull request as draft June 26, 2025 05:38

btrotta-bom requested changes Jun 26, 2025

View reviewed changes

rachaelesler marked this pull request as ready for review July 2, 2025 04:19

rachaelesler requested a review from btrotta-bom July 2, 2025 04:19

rachaelesler added 12 commits July 3, 2025 12:09

Add Pandas (for CI/CD warnings)

35d9eb6

Update CONTRIBUTING.md

59466fb

Fix logic for selecting which class to use

edd3792

PR changes: docs, test types, incorrect tests

f731cde

Use typing.Literal over StrEnum

33993a0

Add parametrised treelite test back

16e7e25

Remove TREELITE_ENABLED in tests

5fe9943

Allow use of default number of threads

ae9d003

More LightGBM error handling

7dfd36c

Update tests and clarify error message

86eb9fb

Fix acceptance tests

4dc9d7d

btrotta-bom reviewed Jul 3, 2025

View reviewed changes

rachaelesler force-pushed the new-uprev-rainforests branch from 8f01fc5 to 4dc9d7d Compare July 3, 2025 05:38

rachaelesler added 3 commits July 3, 2025 16:29

Minor comment change

2a7dc51

PR changes

1c89481

Update acceptance tests and checksums

14f9edd

rachaelesler requested a review from btrotta-bom July 9, 2025 00:47

rachaelesler mentioned this pull request Jul 9, 2025

Update data for Rainforests acceptance tests metoppv/improver_test_data#95

Merged

btrotta-bom approved these changes Jul 9, 2025

View reviewed changes

rachaelesler merged commit 68e54c8 into metoppv:mobt_800_env_upgrade_feature_branch Jul 9, 2025
5 of 8 checks passed

rachaelesler deleted the new-uprev-rainforests branch July 9, 2025 03:49


		Tests that treelite models are loaded correctly when:

		- all thresholds contain treelite model

Update environment for rainforests calibration #2136

Update environment for rainforests calibration #2136

Uh oh!

Conversation

rachaelesler commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

btrotta-bom left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rachaelesler Jun 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rachaelesler Jun 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jul 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

btrotta-bom left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rachaelesler commented Jun 24, 2025 •

edited

Loading

rachaelesler Jun 27, 2025 •

edited

Loading

rachaelesler Jun 27, 2025 •

edited

Loading

codecov bot commented Jul 3, 2025 •

edited

Loading