Skip to content

Conversation

@johnomotani
Copy link
Contributor

While looking at #4009, I noticed that the combine_attrs kwarg to concat or merge only affects the global attrs of the Dataset, not the attrs of DataArray or Variable members of the Datasets being combined. I think this is a bug.

So far this PR adds tests that reproduce the issue in #4009, and the issue described above. Fixing should be fairly simple: for #4009 pass combine_attrs="drop" to combine_by_coords and combine_nested in open_mfdataset; for this issue insert the combine_attrs handling in an appropriate place - possibly in merge.unique_variable. I'll update with fixes when I get a chance.

Need to pass combine_attrs="drop", to allow attrs_file to set the attrs.
Comment on lines +2670 to +2683
def test_open_mfdataset_dataarray_attr_by_coords(self):
"""
Case when an attribute of a member DataArray differs across the multiple files
"""
with self.setup_files_and_datasets() as (files, [ds1, ds2]):
# Give the files an inconsistent attribute
for i, f in enumerate(files):
ds = open_dataset(f).load()
ds["v1"].attrs["test_dataarray_attr"] = i
ds.close()
ds.to_netcdf(f)

with xr.open_mfdataset(files, combine="by_coords", concat_dim="t") as ds:
assert ds["v1"].test_dataarray_attr == 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomNicholas do you think we should make this test pass (i.e. support combining attrs on DataArrays contained in the Dataset being loaded)?

Proposal for a fix:

  • Support open_mfdataset-style attribute combining by upgrading the combine_attrs argument to combine_nested/combine_by_coords (and probably the other functions that have a combine_attrs argument) so it can be passed one of the Datasets or DataArrays being combined/merged, and will then take the attrs from that.
  • In open_mfdataset use the attrs_file argument to get the Dataset to take the attrs from, and pass that to the combine_attrs argument of combine_nested/combine_by_coords.

@dcherian dcherian mentioned this pull request May 26, 2020
23 tasks
@johnomotani johnomotani mentioned this pull request Jun 24, 2020
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incoherencies between docs in open_mfdataset and combine_by_coords and its behaviour.

1 participant