Skip to content

Commit b2f2ad9

Browse files
authored
Time dependent QDM (#226)
* Initializing WindowedQuantileDeltaMappingCorrection * feat: window_mask() method A masking for measurements outside the intended time window of interest. * feat: window_mask() method directly on QDM A masking for measurements outside the intended time window of interest. Building it directly in the standard QDM method. * refact: run_single including bias time index A requirements to estimate parameters by time window. * feat: Quantiles estimated within moving time windows This allows monthly quantiles or any other scales. * refactor: local_qdm running on time windows Expects the CDFs were estimated within time windows and apply the QDM correction respecting those. * feat: Method to guide periods along a year This is used to support applying filters and masks. * doc: Info on assuming nearest available CDF * clean: Removing WindowedQuantileDeltaMappingCorrection A different direction. Instead of that, I generatlized the standard QuantileDeltaMappingCorrection to deal with N periods, which N=1 is the same behavior of the original function. * refactor: Using named arguments Using named arguments allow more freedom on the functions signatures thus more freedom for refactoring. Here we initiated the requirement on time to be able to apply QDM on different time scales, such as seasonally. * Adding ipython to dev environment * test: Adjusting tests to run with new temporal QDM * cfg: Updating pixi.lock * doc: window_mask() * doc, style: _window_center() * style: * style: * fix: Use nearest window center * Replacing variable t by idt As requested by @grantbuster * refact: Renaming variables As requested by @grantbuster * refact, fix: window_mask doesn't need rouding In case of a resolution higher than daily, it would make sense using the window extent without rounding it. Also fix a sign when after 365. * test: Testing window_mask() * doc: Improving window_mask() documentation * doc: Updating documentation to reflect new dimension on time window Issue noticed by @grantbuster * refact: Renaming time to time_index As requested by @bnb32. * doc: Information on the reference attributes As requested by @grantbuster.
1 parent 85637d4 commit b2f2ad9

File tree

6 files changed

+1404
-781
lines changed

6 files changed

+1404
-781
lines changed

pixi.lock

+1,015-647
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

pyproject.toml

+1
Original file line numberDiff line numberDiff line change
@@ -291,3 +291,4 @@ pytest = ">=5.2"
291291
build = ">=0.6"
292292
twine = ">=5.0"
293293
ruff = ">=0.4"
294+
ipython = ">=8.0"

sup3r/bias/bias_transforms.py

+77-36
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,12 @@
11
# -*- coding: utf-8 -*-
22
"""Bias correction transformation functions."""
3+
34
import logging
45
import os
56
from warnings import warn
67

78
import numpy as np
9+
import pandas as pd
810
from rex import Resource
911
from rex.utilities.bc_utils import QuantileDeltaMapping
1012
from scipy.ndimage import gaussian_filter
@@ -13,7 +15,6 @@
1315

1416

1517
def _get_factors(lat_lon, ds, bias_fp, threshold=0.1):
16-
1718
with Resource(bias_fp) as res:
1819
lat = np.expand_dims(res['latitude'], axis=-1)
1920
lon = np.expand_dims(res['longitude'], axis=-1)
@@ -73,7 +74,7 @@ def get_spatial_bc_factors(lat_lon, feature_name, bias_fp, threshold=0.1):
7374
'adder': f'{feature_name}_adder'}
7475
out = _get_factors(lat_lon, ds, bias_fp, threshold)
7576

76-
return out["scalar"], out["adder"]
77+
return out['scalar'], out['adder']
7778

7879

7980
def get_spatial_bc_quantiles(lat_lon: np.array,
@@ -169,7 +170,7 @@ def get_spatial_bc_quantiles(lat_lon: np.array,
169170
with Resource(bias_fp) as res:
170171
cfg = res.global_attrs
171172

172-
return out["base"], out["bias"], out["bias_fut"], cfg
173+
return out['base'], out['bias'], out['bias_fut'], cfg
173174

174175

175176
def global_linear_bc(input, scalar, adder, out_range=None):
@@ -398,27 +399,31 @@ def monthly_local_linear_bc(input,
398399
return out
399400

400401

401-
def local_qdm_bc(data: np.array,
402-
lat_lon: np.array,
402+
def local_qdm_bc(data: np.ndarray,
403+
lat_lon: np.ndarray,
403404
base_dset: str,
404405
feature_name: str,
405406
bias_fp,
407+
time_index: pd.DatetimeIndex,
406408
lr_padded_slice=None,
407409
threshold=0.1,
408410
relative=True,
409-
no_trend=False):
411+
no_trend=False,
412+
):
410413
"""Bias correction using QDM
411414
412415
Apply QDM to correct bias on the given data. It assumes that the required
413416
statistical distributions were previously estimated and saved in
414417
``bias_fp``.
415418
419+
Assume CDF for the nearest day of year (doy) is representative.
420+
416421
Parameters
417422
----------
418423
data : np.ndarray
419424
Sup3r input data to be bias corrected, assumed to be 3D with shape
420425
(spatial, spatial, temporal) for a single feature.
421-
lat_lon : ndarray
426+
lat_lon : np.ndarray
422427
Array of latitudes and longitudes for the domain to bias correct
423428
(n_lats, n_lons, 2)
424429
base_dset :
@@ -428,6 +433,11 @@ def local_qdm_bc(data: np.array,
428433
Name of feature that is being corrected. Datasets with names
429434
"bias_{feature_name}_params" and "bias_fut_{feature_name}_params" will
430435
be retrieved.
436+
time_index : pd.DatetimeIndex
437+
DatetimeIndex object associated with the input data temporal axis
438+
(assumed 3rd axis e.g. axis=2). Note that if this method is called as
439+
part of a sup3r resolution forward pass, the time_index will be
440+
included automatically for the current chunk.
431441
bias_fp : str
432442
Filepath to statistical distributions file from the bias calc module.
433443
Must have datasets "bias_{feature_name}_params",
@@ -454,7 +464,9 @@ def local_qdm_bc(data: np.array,
454464
-------
455465
out : np.ndarray
456466
The input data corrected by QDM. Its shape is the same of the input
457-
(spatial, spatial, temporal)
467+
(spatial, spatial, time_window, temporal). The dimension time_window
468+
aligns with the number of time windows defined in a year, while
469+
temporal aligns with the time of the data.
458470
459471
See Also
460472
--------
@@ -467,13 +479,13 @@ def local_qdm_bc(data: np.array,
467479
be related to the dataset used to estimate
468480
"bias_fut_{feature_name}_params".
469481
470-
Keeping arguments consistent with `local_linear_bc()`, thus a 3D data
471-
(spatial, spatial, temporal), and lat_lon (n_lats, n_lons, [lat, lon]).
472-
But `QuantileDeltaMapping()`, from rex library, expects an array,
473-
(time, space), thus we need to re-organize our input to match that,
474-
and in the end bring it back to (spatial, spatial, temporal). This is
475-
still better than maintaining the same functionality consistent in two
476-
libraries.
482+
Keeping arguments as consistent as possible with `local_linear_bc()`, thus
483+
a 4D data (spatial, spatial, time_window, temporal), and lat_lon (n_lats,
484+
n_lons, [lat, lon]). But `QuantileDeltaMapping()`, from rex library,
485+
expects an array, (time, space), thus we need to re-organize our input to
486+
match that, and in the end bring it back to (spatial, spatial, time_window,
487+
temporal). This is still better than maintaining the same functionality
488+
consistent in two libraries.
477489
478490
Also, :class:`rex.utilities.bc_utils.QuantileDeltaMapping` expects params
479491
to be 2D (space, N-params).
@@ -488,35 +500,64 @@ def local_qdm_bc(data: np.array,
488500
>>> unbiased = local_qdm_bc(biased_array, lat_lon_array, "ghi", "rsds",
489501
... "./dist_params.hdf")
490502
"""
503+
# Confirm that the given time matches the expected data size
504+
assert (
505+
data.shape[2] == time_index.size
506+
), 'Time should align with data 3rd dimension'
507+
491508
base, bias, bias_fut, cfg = get_spatial_bc_quantiles(lat_lon,
492509
base_dset,
493510
feature_name,
494511
bias_fp,
495512
threshold)
513+
496514
if lr_padded_slice is not None:
497515
spatial_slice = (lr_padded_slice[0], lr_padded_slice[1])
498516
base = base[spatial_slice]
499517
bias = bias[spatial_slice]
500518
bias_fut = bias_fut[spatial_slice]
501519

502-
if no_trend:
503-
mf = None
504-
else:
505-
mf = bias_fut.reshape(-1, bias_fut.shape[-1])
506-
# The distributions are 3D (space, space, N-params)
507-
# Collapse 3D (space, space, N) into 2D (space**2, N)
508-
QDM = QuantileDeltaMapping(base.reshape(-1, base.shape[-1]),
509-
bias.reshape(-1, bias.shape[-1]),
510-
mf,
511-
dist=cfg['dist'],
512-
relative=relative,
513-
sampling=cfg["sampling"],
514-
log_base=cfg["log_base"])
515-
516-
# input 3D shape (spatial, spatial, temporal)
517-
# QDM expects input arr with shape (time, space)
518-
tmp = data.reshape(-1, data.shape[-1]).T
519-
# Apply QDM correction
520-
tmp = QDM(tmp)
521-
# Reorgnize array back from (time, space) to (spatial, spatial, temporal)
522-
return tmp.T.reshape(data.shape)
520+
output = np.full_like(data, np.nan)
521+
nearest_window_idx = [
522+
np.argmin(abs(d - cfg['time_window_center']))
523+
for d in time_index.day_of_year
524+
]
525+
for window_idx in set(nearest_window_idx):
526+
# Naming following the paper: observed historical
527+
oh = base[:, :, window_idx]
528+
# Modeled historical
529+
mh = bias[:, :, window_idx]
530+
# Modeled future
531+
mf = bias_fut[:, :, window_idx]
532+
533+
# This satisfies the rex's QDM design
534+
if no_trend:
535+
mf = None
536+
else:
537+
mf = mf.reshape(-1, mf.shape[-1])
538+
# The distributions at this point, after selected the respective
539+
# time window with `window_idx`, are 3D (space, space, N-params)
540+
# Collapse 3D (space, space, N) into 2D (space**2, N)
541+
QDM = QuantileDeltaMapping(oh.reshape(-1, oh.shape[-1]),
542+
mh.reshape(-1, mh.shape[-1]),
543+
mf,
544+
dist=cfg['dist'],
545+
relative=relative,
546+
sampling=cfg['sampling'],
547+
log_base=cfg['log_base'],
548+
)
549+
550+
subset_idx = nearest_window_idx == window_idx
551+
subset = data[:, :, subset_idx]
552+
# input 3D shape (spatial, spatial, temporal)
553+
# QDM expects input arr with shape (time, space)
554+
tmp = subset.reshape(-1, subset.shape[-1]).T
555+
# Apply QDM correction
556+
tmp = QDM(tmp)
557+
# Reorgnize array back from (time, space)
558+
# to (spatial, spatial, temporal)
559+
tmp = tmp.T.reshape(subset.shape)
560+
# Position output respecting original time axis sequence
561+
output[:, :, subset_idx] = tmp
562+
563+
return output

0 commit comments

Comments
 (0)