-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xarray / vtk integration #4470
Comments
I'm really excited about this effort, thanks @rabernat! One thing I'd like to do before I forget is mention a project from the Software Underground (SWUNG) to create a new Over there, we are working to create a set of low-level data structures for sharing spatial data between various subsurface modeling/analysis software (e.g. GemPy, SimPEG, PyGIMLi, Fatiando, Welly, and more) with close ties to PyVista (VTK) for 3D visualization. At present, we are using xarray as the base data container for structured data types. This morning, @rabernat gave me some great insight into how xarray can be used to contain unstructured data as well and so I will be investigating how to do that so we leverage xarray as the base data container all around in This would be a proof of concept for the round-trip data interoperability going between xarray and VTK leaving us with some good lessons on what the value would be of a direct interface between VTK and xarray for better handling of large data and leveraging Dask. Either way, it'd be awesome to have the |
Thank @banesullivan to bring us to the loop. Is there some example somewhere using xarray for unstructured data or that is exactly what are you going to try next? |
You can see an example of using xarray with structured curvilinear coordinates here:
And with unstructured data here:
The key concept is that xarray supports both dimensions coordinates and non-dimension coordinates. The dimension coordinates must be 1D, but the non-dimension coordinates can have any dimensionality. For a regular lat-lon grid, a variable might have dimensions like this
A structured curvilinear 2D grid might instead look like
with additional coordinate variables
which can be used for visualization (but not, currently, for indexing)
This is exactly what netCDF does to encode these data types. Anything that can go into a netCDF file can be represented in Xarray. You just don't get the full functionality in terms of label-based selection. That will hopefully change as we implement more flexible indexing (see https://github.com/pydata/xarray/projects/1). Another limitation of xarray is that it has no explicit notion of "cell bounds," other than recognizing these as coordinates (see #2844). Our xgcm package works around this limitation in some simple ways. |
A key point I forgot to make...if downstream packages or accessor implementers know how to do something useful with these extra coordinates, they are free to do so! The data are there...xarray just doesn't currently make much use of them. |
FWIW, we are working on the https://github.com/ESM-VFC/xoak package to easily index and select unstructured data in xarray datasets. This works well with multi-dimensional coordinates and xarray's advanced indexing features. Support for custom indexes and chunked (dask) coordinates is coming up too! |
Just a note that the only unstructured grid (triangular mesh) example I have is: I figured out how to make that notebook from the info at: The "earthsim" project was developed by the Holoviz team (@jbednar & co) funded by USACE when @dharhas was there. Would be cool to revive this. The Holoviz team and USACE might not have been aware of the UGRID conventions when they developed that code, so currently it's a bit awkward to go from a UGRID-compliant NetCDF dataset to visualization with Holoviz (as you can see from the Hurricane Ike notebook). That would be low-hanging fruit for any future effort. |
The same dataset as in that Visualizing_Meshes notebook is also shown in https://examples.pyviz.org/bay_trimesh, which is more self-contained and may be a better starting point. Also note that VTK and pyvista are well supported by Panel, as illustrated in various examples at https://panel.holoviz.org/gallery. It would be great to update those examples to include xarray data sources to more fully capture the pipeline from data to display. I'm not sure what the specific pain points are with UGRID or what specifically is being requested at holoviz-topics/EarthSim#326, but we'd be happy to have examples either in EarthSim or at examples.pyviz.org that start with UGRID datasets and go via Xarray to either 2D or VTK-based 3D visualizations. |
I just saw this very cool tweet about ipyvista / iris integration and it reminded me of this thread. Are there any clear steps we can take to help advance the vtk / pyvista / xarray integration further? |
Ryan what would you start with? I had a use case this week. |
And again mesh to data array and back, by variety? |
I think a first step would be to make a data accessor between xarray and VTK/PyVista in accordance with https://xarray.pydata.org/en/stable/internals/extending-xarray.html This keeps coming up again and again for me. @RichardScottOZ, if you're interested in kicking this off, I'd be happy to chat and help implement this and then outline some future direction for this sort of "integration" |
Yes, that sounds good. Definitely be useful to put this in some workflows. |
I use this sort of thing a lot, anyway. https://banesullivan.com/pyvista/examples/raster.html#sphx-glr-pyvista-examples-raster-py |
So reversing that would require storing the metadata etc. and whatever the meshgrid reversal is? |
e.g. for a Uniform Grid data = result['density'].reshape(result.dimensions[2],result.dimensions[1],result.dimensions[0])
XC = [result.bounds[0]+200*i for i in range(result.dimensions[0])]
YC = [result.bounds[2]+200*i for i in range(result.dimensions[1])]
ZC = [result.bounds[4]+200*i for i in range(result.dimensions[2])]
da = xr.DataArray(data=data, dims=["z", "y","x"], coords={"z": ZC, "y": YC, "x": XC}) |
I am very interested in this sort of functionality as an xarray accessor. If I can help in any way, please let me know. Ideally this work would come in very handy to visualize Oxygen Minimum Zones in the global ocean as isosurfaces of a 3D oxygen array. |
Ok, so a discussion:- |
Just have to find some time to start. |
Following up to mention that I've developed This is very much in its early stages and I’m hoping to take it quite a bit further. But first, I’d love to gather feedback and find how it works or doesn’t work for the types of data folks have. At present, this accessor is specifically for rectilinear-style grids but if you have some other types of data, do share so that I can work towards supporting it in (there is limited support for |
do you want to add this to the ecosystem page in the docs? |
I'd love to! Thanks for the suggestion @keewis! I'll track this in pyvista/pyvista-xarray#21 to contribute after I make my next round of improvements to |
I just had a great chat with @aashish24 and @banesullivan of Kitware about how we could improve interoperability between xarray and the VTK stack They also made me aware of pyvista, which looks very cool. As a user of both tools, I can see it would be great if I could quickly drop into VTK from xarray for advanced 3D visualization.
A low-hanging fruit would be to simply be able to round-trip data between vtk and xarray in memory, much like we do with pandas. This should be doable because vtk already has a netCDF file reader. Rather than reading the data from a file, vtk could initialize its objects from an xarray dataset which, in principle, should contain all the same data / metadata
Beyond this, there are many possibilities for deeper integration around the treatment of finite-volume cells, structured / unstructured meshes, etc. Possibly related to #4222.
I just thought I would open this issue to track the general topic of xarray / vtk integration.
The text was updated successfully, but these errors were encountered: