Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

return type of cell_centers #83

Open
keewis opened this issue Oct 24, 2024 · 1 comment
Open

return type of cell_centers #83

keewis opened this issue Oct 24, 2024 · 1 comment

Comments

@keewis
Copy link
Collaborator

keewis commented Oct 24, 2024

Right now, Dataset.dggs.cell_centers returns a Dataset containing the geographical coordinates. However, since we return a Dataset we have to use the lengthy ds.pipe(lambda ds: ds.merge(ds.dggs.cell_centers())) to assign the result to the original object (and this won't even work for DataArray).

Another option would be ds.pipe(lambda ds: ds.assign_coords(ds.dggs.cell_centers().coords)), which at least works for DataArray objects, but otherwise is slightly longer than the code with merge.

Ideally, we'd allow something like this:

ds.assign_coords(lambda ds: ds.dggs.cell_centers())

For this to work, ds.dggs.cell_centers() would have to return a Coordinates object (or a dict, or we make the geographic coordinates data variables), and assign_coords would have to accept a Callable[[Dataset | DataArray], dict | Coordinates].

What do you think, @benbovy?

@benbovy
Copy link
Member

benbovy commented Oct 24, 2024

My 2 cents:

I think it would be nice if we can have a consistent API for all cell geometry extraction methods, i.e., something like this:

# return cell centers as (shapely) point geometries
ds.dggs.cell_centers() -> xr.DataArray

# return cell centers as lat/lon coordinates 
ds.dggs.cell_centers_latlon() -> tuple[xr.DataArray, xr.DataArray]

# return cell boundaries as (shapely) polygon geometries
ds.dggs.cell_boundaries() -> xr.DataArray
  • always use DataArray as the xarray object return type (maybe within tuple or dict)
  • cell_xxx() would always return shapely geometries
  • cell_centers_latlon() as an additional but common special case
    • _latlon is also used in other xdggs method names
    • a tuple of DataArrays seems the simplest and the most intuitive here to me. This is what I would expect from most Python scientific libraries. A dictionary repeats the information that we already have in the DataArrays (i.e., their names) and we may not want to be too opinionated about the names of the lat-lon coordinates.

Personally I wouldn't mind doing an extra step to assign lat-lon coordinates:

cell_lat, cell_lon = ds.dggs.cell_centers_latlon()
ds.assign_coords(latitude=cell_lat, longitude=cell_lon)

If we want a convenient, functional-style friendly way to achieve this, maybe assign_coords could also accept Iterable[DataArray]?

# reuse the names of the two DataArrays returned by ds.dggs.cell_centers_latlon()
ds.assign_coords(ds.dggs.cell_centers_latlon())

Actually, if we can assign coordinates like above the dggs.assign_latlon_coords() method may not add a lot of value so I'd be +1 to remove it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants