Unrequested variable in extractions that span date of variable added to dataset #51

Closed
douglatornell opened this issue Sep 14, 2022 · 0 comments · Fixed by #52

Discovered during investigation of issue #43.

When an extraction date range includes the date on which a variable was added to a dataset, the added variable is included in the extracted dataset whether or not it is in the list of requested variables. Known occurrences of this are:

  • HRDPS-2.5km-operational.yaml profile for percentcloud variable added to dataset on 8-Jul-2017
  • HRDPS-2.5km-operational.yaml profile for variables LHTFL_surface, PRATE_surface, and RH_2maboveground added to dataset on 6-Dec-2018

This issue is demonstrated for the first of those cases by the extraction config:

dataset:
  model profile: HRDPS-2.5km-operational.yaml
  time base: hour
  variables group: surface fields

dask cluster: salish_cluster.yaml

start date: 2017-07-01
end date: 2017-07-31

extract variables:
  - precip

extracted dataset:
  name: HRDPS_hour_precip_surface
  description: Hour-averaged surface precip rate extracted from HRDPS 2.5km operational product
  dest dir: /tmp/

The extracted dataset contains the variable precip, as requested, but also percentcloud, with the latter having NaN values prior to 2017-07-08.
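
The triggering condition can be sketched in a few lines of Python. This is only an illustration, not code from the project: the `VARS_ADDED` table and the `vars_added_during` helper are hypothetical names, with the dates and variable names taken from the known occurrences listed above. The sketch assumes that an extraction starting exactly on the added date is not affected, because the variable is then present in every file of the range.

```python
from datetime import date

# Hypothetical lookup table built from the known occurrences in this issue
VARS_ADDED = {
    date(2017, 7, 8): ["percentcloud"],
    date(2018, 12, 6): ["LHTFL_surface", "PRATE_surface", "RH_2maboveground"],
}


def vars_added_during(start, end):
    """Return variables added after the start date and on or before the end date.

    Assumption: a range that starts on the added date is unaffected,
    hence the strict comparison against start.
    """
    added = []
    for added_date, names in sorted(VARS_ADDED.items()):
        if start < added_date <= end:
            added.extend(names)
    return added


# Date range from the extraction config above
affected = vars_added_during(date(2017, 7, 1), date(2017, 7, 31))
print(affected)
```

For the config above, the date range spans 2017-07-08, so percentcloud is the variable at risk of appearing unrequested.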

douglatornell added the bug label Sep 14, 2022
douglatornell added this to the v22.1 milestone Sep 14, 2022
douglatornell self-assigned this Sep 14, 2022
douglatornell added a commit that referenced this issue Sep 14, 2022
Use the 1st and last dataset paths to calculate a more complete set of all
variables in the dataset. That avoids the inclusion in the extracted dataset of
variables that were added to the dataset during the extraction time span.

Fixes issue #51
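
A minimal sketch of the idea in that commit message, under stated assumptions: it supposes that the unrequested variables are excluded by building a set of names to drop, and that the set was previously derived from a single dataset path. The function `calc_drop_vars`, the `list_vars` callable, the file names, and the `solar` variable are all hypothetical stand-ins, not the project's actual code or data.

```python
def calc_drop_vars(ds_paths, extract_vars, list_vars):
    """Return the names of variables to exclude from the extraction.

    list_vars(path) is a hypothetical callable standing in for opening
    a dataset file and listing its variable names.
    """
    # Union of the variables in the first and last files of the date range,
    # so that variables added part way through the range are also seen
    all_vars = set(list_vars(ds_paths[0])) | set(list_vars(ds_paths[-1]))
    return all_vars - set(extract_vars)


# Toy stand-in for dataset files; names and the "solar" variable are made up
files = {
    "2017-07-01.nc": {"precip", "solar"},
    "2017-07-31.nc": {"precip", "solar", "percentcloud"},
}
paths = sorted(files)

drop_vars = calc_drop_vars(paths, ["precip"], files.__getitem__)

# Using only the first file misses the variable that was added later
first_only = files[paths[0]] - {"precip"}
```

With only the first path, percentcloud never makes it into the drop set and so survives into the extracted dataset. Taking the union with the last path catches variables added during the time span, though not a variable that was both added and removed inside it, which is presumably why the commit message says "more complete" rather than "complete".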
douglatornell linked a pull request Sep 14, 2022 that will close this issue