Skip to content

Indexing within indicators#934

Merged
aulemahal merged 18 commits intomasterfrom
indexing-indicator
Nov 26, 2021
Merged

Indexing within indicators#934
aulemahal merged 18 commits intomasterfrom
indexing-indicator

Conversation

@aulemahal
Copy link
Copy Markdown
Collaborator

@aulemahal aulemahal commented Nov 24, 2021

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
  • Tests for the changes have been added (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
  • HISTORY.rst has been updated (with summary of main changes)
    • Link to issue (:issue:number) and pull request (:pull:number) has been added.
  • bumpversion patch has been called on this branch
  • The relevant author information has been added to .zenodo.json

What kind of change does this PR introduce?

  • Creates another base Indicator class, ResamplingIndicatorWithIndexing, that injects the indexer kwargs, as it was done in some "stats" function. The indexing is done in the preprocess_and_check function, so the indice (compute) receives an input where values outside the select period are masked.
  • Expand generic.select_time with doy_bounds and date_bounds. The first takes in bounds as day-of-year integers and the second takes in dates in the month-day format (%m-%d).

Does this PR introduce a breaking change?

  • Indexing now masks values by default, instead of dropping them. There are issues when values are dropped, some indices do not expect those gaps. The principal culprit is units.rate2amount which converts by multiplying the rate by the timestep length, which is inferred from the coordinate itself. In case of a gap, the rate is assumed constant on the whole gap...
  • Most indicators now have the new **indexer in their signature.

Other information:

An indice may add an **indexer var_keyword and handle the time selecting itself. If the indicator is constructed from ResamplingIndicator (the one that doesn't inject this new param), the missing values handling will still understand the passed indexing parameter. This can be useful when the order between indexing and computing is important (like for #884 and many others).
This means that the order is currently "wrong" for many cases: i.e. indexing is done before computation. In a way, if it is made clear what the default order is, I think it's ok to have this caveat temporarily. But should we:

  1. Implement these cases now, in this PR.
  2. Let them be wrong for now and correct them later.
  3. Revert injecting **indexer everywhere and slowly implement it where it makes most sense afterwards.
    ?

- Added ResamplingIndicatorWithIndexing object
- Refac Indicator to keep trace of injected parameter's metadata
- Refac Indicator to remove explicit mentions of "indexer"
Comment thread xclim/core/indicator.py Outdated
mask = reduce(np.logical_or, miss)
# Reduce by or and broadcast to ensure the same length in time
# When indexing is used and there are no valid points in the last period, mask will not include it
mask = reduce(np.logical_or, miss).reindex_like(outs[0], fill_value=True)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where the last period contains no points after indexing, it doesn't appear in mask and so it was dropped from the result, which is a breaking change in a way.
However, this reindex_like trick adds complexity, but fixes the problem.

I would prefer a breaking change than this.

Ex: If you do something with a time series from Jan 1st to Dec 31st, then pass freq='AS=JUL' and indexer : month=2. The second period, is completely dropped from the mask as it contains no points. Before, it wasn't dropped, rather it was masked out.

Comment thread HISTORY.rst Outdated
Comment thread xclim/core/indicator.py Outdated
Comment thread xclim/core/indicator.py Outdated
Comment thread xclim/indices/generic.py Outdated
Comment thread xclim/indices/generic.py Outdated
Comment thread xclim/indices/generic.py Outdated
@coveralls
Copy link
Copy Markdown

coveralls commented Nov 26, 2021

Pull Request Test Coverage Report for Build 1508128835

  • 215 of 222 (96.85%) changed or added relevant lines in 12 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.05%) to 90.318%

Changes Missing Coverage Covered Lines Changed/Added Lines %
xclim/core/indicator.py 67 68 98.53%
xclim/indices/generic.py 37 43 86.05%
Totals Coverage Status
Change from base Build 1507765283: 0.05%
Covered Lines: 12789
Relevant Lines: 14160

💛 - Coveralls

Comment thread xclim/core/indicator.py Outdated
@aulemahal aulemahal merged commit abc13c4 into master Nov 26, 2021
@aulemahal aulemahal deleted the indexing-indicator branch November 26, 2021 16:10
raquelalegre added a commit to UCL/xclim that referenced this pull request Nov 30, 2021
…s/heat_index

* 'indices/heat_index' of github.com:UCL/xclim: (42 commits)
  Added french metadata for heat index.
  Apply changes to fraction indicator
  [sdba] Re-write of Extreme's adjustment (Ouranosinc#914)
  update history
  Update history
  [Ouranosinc#931] Fix indicator for Rxxp index
  Add missing type for `threshold_count`
  Add dev notes - add indexer example - other small changes
  run all available tests in one call for slow builds
  update makefile target black, only install coveralls in tox if coveralls is needed
  Indexing within indicators (Ouranosinc#934)
  Update history
  finish needs doctests
  Remove unnecessary keep_attrs
  Adapt to CF conventions 1.9 - remove gregorian calendar (Ouranosinc#935)
  Bump version: 0.31.2-beta → 0.31.3-beta
  update HISTORY.rst
  suggested corrections
  Apply suggestions from code review
  Resampling indicator (Ouranosinc#927)
  ...
raquelalegre added a commit to UCL/xclim that referenced this pull request Nov 30, 2021
…es/wetday_prop

* 'indices/wetday_prop' of github.com:UCL/xclim: (46 commits)
  Add moi-meme as 0.32 contributor
  do not redefine builtin next
  use array_almost_equal in tests
  Update history
  Improve quantile function
  Apply changes to fraction indicator
  [sdba] Re-write of Extreme's adjustment (Ouranosinc#914)
  update history
  Update history
  [Ouranosinc#931] Fix indicator for Rxxp index
  Add missing type for `threshold_count`
  Add dev notes - add indexer example - other small changes
  run all available tests in one call for slow builds
  update makefile target black, only install coveralls in tox if coveralls is needed
  Indexing within indicators (Ouranosinc#934)
  Update history
  finish needs doctests
  Remove unnecessary keep_attrs
  Adapt to CF conventions 1.9 - remove gregorian calendar (Ouranosinc#935)
  Bump version: 0.31.2-beta → 0.31.3-beta
  ...
@aulemahal aulemahal mentioned this pull request Dec 1, 2021
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Masking and indexing as argument of indicator's call

3 participants