Skip to content

Conversation

@Illviljan
Copy link
Contributor

@Illviljan Illviljan commented Mar 13, 2021

@pep8speaks
Copy link

pep8speaks commented Mar 13, 2021

Hello @Illviljan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-04-15 01:45:12 UTC

@Illviljan Illviljan marked this pull request as ready for review March 14, 2021 06:56
Copy link
Collaborator

@mathause mathause left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Clever test! I think someone actually working on the backends should merge.

@Illviljan
Copy link
Contributor Author

Maybe @aurghs or @alexamici can take a look as well?

@mathause
Copy link
Collaborator

mathause commented Apr 8, 2021

@alexamici ok if we merge this?

@alexamici
Copy link
Collaborator

@Illviljan sorry for being late to the party.

The first comment is that I see the point of the feature at a theoretical level, but I'm at the third external backend that I'm writing and I didn't miss it. Adding an entrypoint is very easy and works seamlessly during development.

Do you have in mind any use case where this is more convenient than adding an entrypoint?

WRT the implementation, at the request of @shoyer and @jhamman, we ask backend developers to inherit from BackendEntrypoint, for the sake of consistency I would enforce it here as well, instead to use duck-typing.

@Illviljan
Copy link
Contributor Author

The first comment is that I see the point of the feature at a theoretical level, but I'm at the third external backend that I'm writing and I didn't miss it. Adding an entrypoint is very easy and works seamlessly during development.

Do you have in mind any use case where this is more convenient than adding an entrypoint?

I simply want to do:

from custom_backend import engine

ds = xr.load_dataset(filename, engine=engine)

That is much simpler than having to figure out what the How to register a backend is talking about. Because I'm a user who doesn't have any grand dreams (yet?) of creating public backend modules I therefore don't see the point in having to do all this extra paperwork.

... we ask backend developers to inherit from BackendEntrypoint, for the sake of consistency I would enforce it here as well, instead to use duck-typing.

That's not how I read the docs. If this is how we actually want it then some words in it should be replaced with "must" and "requires".

But I don't think that should be such a hard requirement when a user insists on using a custom engine this way. I see it as an advanced option where it's the users responsibility to make sure that the engine class has

  • the open_dataset method
  • the open_dataset_parameters attribute
  • the guess_can_open method

Which is very simple to do, see the test for an example. Subclassing using BackendEntrypoint adds complexity and readability issues so I want to at least feel a little motivated to use it but there's nothing fancy going on there, is there any plans?

@alexamici
Copy link
Collaborator

The first comment is that I see the point of the feature at a theoretical level, but I'm at the third external backend that I'm writing and I didn't miss it. Adding an entrypoint is very easy and works seamlessly during development.
Do you have in mind any use case where this is more convenient than adding an entrypoint?

I simply want to do:

from custom_backend import engine

ds = xr.load_dataset(filename, engine=engine)

That is much simpler than having to figure out what the How to register a backend is talking about. Because I'm a user who doesn't have any grand dreams (yet?) of creating public backend modules I therefore don't see the point in having to do all this extra paperwork.

Well, based on the amount of complexity a developer needs to master in order to write the backend, I would consider the registration to be a relatively trivial bit, especially because the syntax is the same as for the well known console_script.

On one hand I judge the feature to be relatively simple to support in the long long term, on the other hand I still feel that its benefit will be be quite marginal. Therefore I'm still a mild -1.

... we ask backend developers to inherit from BackendEntrypoint, for the sake of consistency I would enforce it here as well, instead to use duck-typing.

That's not how I read the docs. If this is how we actually want it then some words in it should be replaced with "must" and "requires".

But I don't think that should be such a hard requirement when a user insists on using a custom engine this way. I see it as an advanced option where it's the users responsibility to make sure that the engine class has

  • the open_dataset method
  • the open_dataset_parameters attribute
  • the guess_can_open method

Which is very simple to do, see the test for an example. Subclassing using BackendEntrypoint adds complexity and readability issues so I want to at least feel a little motivated to use it but there's nothing fancy going on there, is there any plans?

On this one I can side with you. The initial proposal from @aurghs and myself was to use a dict 😄 .

This one is a higher lever design decisione, I think @jhamman and @shoyer need to weight in.

@Illviljan
Copy link
Contributor Author

Well, based on the amount of complexity a developer needs to master in order to write the backend, I would consider the registration to be a relatively trivial bit, especially because the syntax is the same as for the well known console_script.

Making a backend doesn't have to be super difficult either depending if you already have a nice 3rd party module you can thinly wrap to return a Dataset instead of whatever is the default

It's funny how different our backgrounds are, I don't think I've had to deal with console_scripts.

@alexamici
Copy link
Collaborator

alexamici commented Apr 8, 2021

Making a backend doesn't have to be super difficult either depending if you already have a nice 3rd party module you can thinly wrap to return a Dataset instead of whatever is the default

Absolutely agree: https://github.com/corteva/rioxarray/blob/master/rioxarray/xarray_plugin.py

(the PR took a grand total of 48 hours from open to merge: corteva/rioxarray#281)

It's funny how different our backgrounds are, I don't think I've had to deal with console_scripts.

$ grep -r console_scripts develop/ | wc -l
      23

😅

@aurghs
Copy link
Collaborator

aurghs commented Apr 9, 2021

Making a backend doesn't have to be super difficult either depending if you already have a nice 3rd party module you can thinly wrap to return a Dataset instead of whatever is the default

I agree. Adding a plugin is not really very difficult, but in some cases could be discouraging especially if you are just exploring how the backends work.

@alexamici alexamici requested review from jhamman and shoyer April 12, 2021 08:46
@aurghs
Copy link
Collaborator

aurghs commented Apr 14, 2021

@Illviljan I see your point, but the subclassing doesn't add too much complexity and for consistency would be better to add a check on the class.
After that, I think we can merge it.

@shoyer
Copy link
Member

shoyer commented Apr 14, 2021

My perspective:

  • Generally I like this PR. This is a nice, more explicit alternative to registering an entry-point.
  • I do think we should require using a subclass of Xarray's BackendEntrypoint . Subclassing is a standard way to express interfaces in Python. It is less error prone than supporting a dict, for example, which can more easily have misspelled names.

@kmuehlbauer
Copy link
Contributor

I really like to see this getting merged and I think subclassing isn't a big of a problem. Thanks for working on this.

@Illviljan Illviljan closed this Apr 15, 2021
@Illviljan Illviljan reopened this Apr 15, 2021
@shoyer shoyer merged commit a3ad872 into pydata:master Apr 15, 2021
@shoyer
Copy link
Member

shoyer commented Apr 15, 2021

thanks!

@Illviljan Illviljan deleted the Illviljan-simpler_backend_load branch May 18, 2021 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Simplify adding custom backends

8 participants