Error when adding a DataArray to an existing Dataset with a MultiIndex #7921
Comments
Thanks for taking the time to file a bug report!
I agree this is confusing and seems like it should work.
@dalonsoa It would be great if you could provide an MCVE here. It makes it much easier to debug for interested parties.
Hi @kmuehlbauer, many thanks for asking for an MCVE because, to be honest, I'm not able to reproduce the error with the following code which, I think, represents the situation we have at hand. It runs from beginning to end without any problem using the same versions:
import xarray as xr
import numpy as np
da1 = xr.DataArray(
np.arange(48).reshape(2, 2, 3, 4),
coords=[
("v", [10, 20]),
("x", ["a", "b"]),
("y", [0, 1, 2]),
("z", ["alpha", "beta", "gamma", "delta"]),
],
)
da1 = da1.stack(w=("x", "z", "v"))
da2 = xr.zeros_like(da1.transpose("w", "y"))
da3 = xr.zeros_like(da1)
ds = xr.Dataset({"one": da1, "two": da2, "three": da3})
ds["four"] = xr.zeros_like(ds.one)
print(ds)
I'll investigate why my code is failing and this one is not. Might it be the way the MultiIndex is created? If anyone is interested, this is the line of the code I'm refactoring that is causing me trouble: https://github.com/SGIModel/MUSE_OS/blob/9fb62bc0c3b7adeb9ce89dce9cad4856e1082925/src/muse/examples.py#L193
@dalonsoa Thanks for coming back this fast. I also have no real clue where the problem lies. It might be how the MultiIndex is created, as you are suggesting. I've had a look at the tests over at your place to get an impression of how things are supposed to work. But there are too many fixtures to quickly adapt an MCVE from that, at least for someone who is not familiar with the code base. Would you be able to distill an MCVE from your test code?
Mmm... the code is rather convoluted (I'm trying to simplify it), but I'll try to put something simple together that uses parts of the original code and reproduces the error. Bear with me while I do that.
I have not forgotten about this. I've tracked where and how the timeslice MultiIndex is created and created another example that closely matches that (see below), but that one also works... The problem I have is that the process looks like:
1.
So I'll keep investigating what's going on in step 2 that makes things break down the line.
import pandas as pd
import xarray as xr
import numpy as np
timeslices = {
"all-year.all-week.night": 1460,
"all-year.all-week.morning": 1460,
"all-year.all-week.afternoon": 1460,
"all-year.all-week.early-peak": 1460,
"all-year.all-week.late-peak": 1460,
"all-year.all-week.evening": 1460,
}
level_names = ["month", "day", "hour"]
levels = [tuple(k.split(".")) for k in timeslices.keys()]
values = list(timeslices.values())
indices = pd.MultiIndex.from_tuples(levels, names=level_names)
timeslice = xr.DataArray(values, coords={"timeslice": indices}, dims="timeslice")
da1 = xr.DataArray(
np.arange(36).reshape(2, 3, 6),
coords=[
("x", ["a", "b"]),
("y", [0, 1, 2]),
timeslice.timeslice,
],
)
da2 = xr.zeros_like(da1.transpose("y", "x", ...))
da3 = xr.zeros_like(da1)
ds = xr.Dataset({"one": da1, "two": da2, "three": da3})
ds["four"] = xr.zeros_like(ds.one)
print(ds)
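As a side note to the examples above, here is a minimal sketch of how one might check whether the levels of the MultiIndex survive adding a new variable. The helper `multiindex_levels` is hypothetical (it is not part of xarray), the coordinate values are made up, and the MultiIndex is built with `stack`, so this is an assumption-laden stand-in and not the failing case from the original code.

```python
import numpy as np
import pandas as pd
import xarray as xr


def multiindex_levels(obj, dim):
    """Hypothetical helper: level names of a MultiIndex-backed dimension.

    Returns None when the index for `dim` is a plain pandas Index.
    """
    index = obj.indexes[dim]
    return list(index.names) if isinstance(index, pd.MultiIndex) else None


# Small stand-in data set; the MultiIndex is created with stack().
da = xr.DataArray(
    np.arange(12).reshape(2, 2, 3),
    coords=[
        ("x", ["a", "b"]),
        ("day", ["all-week", "weekend"]),
        ("hour", ["night", "morning", "evening"]),
    ],
).stack(timeslice=("day", "hour"))

ds = xr.Dataset({"one": da})
levels_before = multiindex_levels(ds, "timeslice")

# Add a new variable in place, as in the examples above.
ds["four"] = xr.zeros_like(ds["one"])
levels_after = multiindex_levels(ds, "timeslice")

print(levels_before, levels_after)
```

Running the same comparison on the real Dataset, before and after the assignment that fails, could help narrow down at which step a level goes missing.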
For reference, I've narrowed down the problem to this function. The manipulations going on there result in a
Ok, while trying to figure out what's wrong with my code above, I'm finding examples that have an odd behaviour or that fail, but for a different reason. Let's take the last example but where the
# as above until here
# ...
da1 = xr.DataArray(
np.arange(6).reshape(2, 3),
coords=[
("x", ["a", "b"]),
("y", [0, 1, 2]),
],
)
da1 = da1.expand_dims(dim={"timeslice": timeslice.timeslice})
print(da1)
This does not add the
To get the proper
da1 = da1.expand_dims(dim={"timeslice": timeslice.timeslice}).assign_coords(timeslice=timeslice.timeslice)
print(da1)
Resulting in:
One would think this should be a perfectly fine DataArray, but when I do either of these things:
ds = xr.Dataset({"one": da1})
ds["four"] = xr.zeros_like(ds.one)
or
Things fail with:
This is not the error I was originally reporting, but goes along the same lines of having a perfectly looking array with a
I will keep trying to reproduce the original error, but any suggestion of why this might be happening with an otherwise perfectly looking array will be helpful.
@benbovy, many thanks for the fix. I was on holiday. I'll check if the original issue was also fixed by this as soon as possible, but it is great that, if nothing else, at least part of it is sorted. I'll keep you posted in case it has not been fixed.
What is your issue?
This is a mixture between question, bug (potentially) and general issue, so feel free to label it accordingly.
Here is my question: what is the recommended approach to add a xr.DataArray to an existing xr.Dataset with a MultiIndex?
To give some more context, I have a xarray.Dataset called market with several variables and coordinates, one of them, timeslice, a MultiIndex. This is what it looks like:
Now, I want to add another variable, called supply, identical to exports but filled with zeros. In code that was working with xarray==2022.3.0 and pandas==1.4.4, I was simply doing:
And it worked totally fine. With the newest versions, xarray==2023.5.0 and pandas==2.0.2 under Python 3.10, this fails with:
I've tried variants like:
both failing with the same message.
I totally fail to see how this process is deleting a level of the MultiIndex, or modifying the indexes in any form. Probably it is because I don't understand the inner workings of xarray indexes.
The following works totally fine, but it is rather convoluted to have to create a brand new Dataset from scratch manually, in addition to being problematic if you really want to modify the Dataset in place (assign will have the same problem).
Resulting in:
Many thanks for your support!
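For readers comparing the two approaches mentioned in the question, the following is a minimal sketch of in-place item assignment versus `Dataset.assign`. The names `market`, `exports`, `supply` and `timeslice` mirror the issue, but the dimensions and values (including the `region` dimension) are assumptions made up for illustration, and the MultiIndex is built with `stack`. It does not reproduce the failure reported here.

```python
import numpy as np
import xarray as xr

# Stand-in for the `market` Dataset; shapes and labels are illustrative only.
exports = xr.DataArray(
    np.arange(8.0).reshape(2, 2, 2),
    coords=[
        ("region", ["r1", "r2"]),
        ("day", ["all-week", "weekend"]),
        ("hour", ["night", "morning"]),
    ],
).stack(timeslice=("day", "hour"))
market = xr.Dataset({"exports": exports})

# Non-mutating: assign() returns a new Dataset and leaves `market` untouched.
with_supply = market.assign(supply=xr.zeros_like(market["exports"]))

# In place: item assignment adds the variable to `market` itself.
market["supply"] = xr.zeros_like(market["exports"])

print(with_supply)
print(market)
```

The difference matters when other parts of the code hold a reference to the same Dataset: only the item assignment is visible through that reference.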