ExperimentAxisQuery uses the thread pool from ContextBase #184

Merged
ebezzi merged 8 commits into main from ebezzi/threadpool-context on Feb 12, 2024

Conversation

ebezzi
Member

@ebezzi ebezzi commented Jan 10, 2024

The only requirement for somacore is that it should contain a threadpool.
"""

_threadpool: futures.ThreadPoolExecutor
Member

I'm a bit confused about which values are required and which are optional (i.e., could be None).

As coded in query.py, it appears that all objects must have a context, but a context may optionally have a threadpool, i.e., _threadpool may be None. But this code declares the type as non-optional.

Suggest clarifying what is optional and what is required, and having the types and tests match.

Member Author

Good call, I switched the context to be Optional (as it is required by the definition of Experiment). For the threadpool, it can be either. The current implementation forces the threadpool to be not None, but I think it makes sense to leave that Optional as well, in case any other implementation prefers not to provide a thread pool. Regardless, the ExperimentAxisQuery code has a fallback path that manages its own threadpool in case either the context or the pool is missing.
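
The arrangement described in this reply can be sketched as follows. This is a minimal illustration, not the somacore code: `ExperimentLike`, `Query`, `SimpleContext`, and `SimpleExperiment` are hypothetical names, and the attribute is spelled `_threadpool` only because that is its name at this point in the review.

```python
from concurrent import futures
from typing import Optional, Protocol


class ContextBase(Protocol):
    """A context may carry a thread pool, but is not required to."""

    _threadpool: Optional[futures.ThreadPoolExecutor]


class ExperimentLike(Protocol):
    """Hypothetical stand-in for the experiment protocol; context is optional."""

    context: Optional[ContextBase]


class Query:
    """Hypothetical stand-in for the fallback logic in ExperimentAxisQuery."""

    def __init__(self, experiment: ExperimentLike) -> None:
        self.experiment = experiment
        self._own_pool: Optional[futures.ThreadPoolExecutor] = None

    def threadpool(self) -> futures.ThreadPoolExecutor:
        context = self.experiment.context
        if context is not None and context._threadpool is not None:
            return context._threadpool
        # Fallback: create a pool just in time and reuse it afterwards.
        if self._own_pool is None:
            self._own_pool = futures.ThreadPoolExecutor()
        return self._own_pool


class SimpleContext:
    """Hypothetical concrete context used only to exercise the sketch."""

    def __init__(self, pool: Optional[futures.ThreadPoolExecutor] = None) -> None:
        self._threadpool = pool


class SimpleExperiment:
    """Hypothetical concrete experiment used only to exercise the sketch."""

    def __init__(self, context: Optional[SimpleContext] = None) -> None:
        self.context = context
```

With both levels typed as Optional, the query has to handle three cases: no context, a context without a pool, and a context with a pool. Only the last one returns the shared pool.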

"""
Returns the threadpool provided by the experiment's context.
If not available, creates a thread pool just in time."""
if self.experiment.context._threadpool is not None:
Member

The Experiment-ish protocol types _threadpool as a required value, i.e., one that won't be None. Which is correct?

Member Author

See the comment above: I made all of them Optional.

Member

@bkmartinjr bkmartinjr left a comment

One typing nit - otherwise looks good.

@ebezzi ebezzi requested a review from bkmartinjr January 10, 2024 21:48
@thetorpedodog
Contributor

If we're going to do this, we should also change the places where it is accepted in parameters / returned from accessors to be this base context protocol.

I also feel like the threadpool member should be public (i.e., un-prefixed), since it is intended for semi-public consumption (within SOMA libraries, but still).

@ebezzi
Member Author

ebezzi commented Jan 10, 2024

> If we're going to do this, we should also change the places where it is accepted in parameters / returned from accessors to be this base context protocol.
>
> I also feel like the threadpool member should be public (i.e., un-prefixed), since it is intended for semi-public consumption (within SOMA libraries, but still).

Good points. I implemented both. Let me know if you know other usages I missed, and if you want a different naming/location for ContextBase.

Contributor

@thetorpedodog thetorpedodog left a comment

Thought about this a bit more and realized some annoying caveats (which exist for legitimate reasons) that would come from the original way I suggested doing this.

@@ -10,6 +10,7 @@
 from typing_extensions import LiteralString, Self

 from . import options
+from .types import ContextBase
Contributor

Minor nit: in somacore we import modules, not things inside modules, so this should be from . import types, with types.ContextBase used below.

@@ -24,7 +25,7 @@ def open(
     uri: str,
     mode: options.OpenMode = "r",
     *,
-    context: Optional[Any] = None,
+    context: Optional[ContextBase] = None,
Contributor

I was thinking about this a little further and it turns out there is a deeper wrinkle to this: we wouldn’t be able to override

class SOMAObject(...):
  @classmethod
  def open(..., context: Optional[ContextBase] = None) -> Self: ...

with

class ImplObject(somacore.SOMAObject):
  @classmethod
  def open(..., context: Optional[ImplContext] = None) -> Self: ...
  # not allowed, because `ImplContext` is a narrowing of `ContextBase`,
  # which would violate the Liskov principle:
  # https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
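
To make the problem concrete, here is a small runnable sketch of why narrowing a parameter type is unsafe. All names are hypothetical (`setting` and `open_any` in particular), and `open` returns a string only to keep the example short. A caller that only knows about the base class can legitimately pass a plain ContextBase, which the narrowed override cannot handle:

```python
from typing import Optional, Type


class ContextBase:
    """Hypothetical base context with no implementation-specific members."""


class ImplContext(ContextBase):
    """Hypothetical implementation context with one extra member."""

    def __init__(self, setting: str) -> None:
        self.setting = setting


class SOMAObject:
    @classmethod
    def open(cls, uri: str, context: Optional[ContextBase] = None) -> str:
        return f"{uri} (base)"


class ImplObject(SOMAObject):
    @classmethod
    def open(  # type: ignore[override]
        cls, uri: str, context: Optional[ImplContext] = None
    ) -> str:
        # Relies on a member that only ImplContext has.
        setting = context.setting if context is not None else "default"
        return f"{uri} ({setting})"


def open_any(kind: Type[SOMAObject], uri: str, context: ContextBase) -> str:
    # Valid against SOMAObject.open, which accepts any ContextBase.
    return kind.open(uri, context=context)
```

Calling `open_any(ImplObject, "uri", ContextBase())` satisfies the base-class signature but raises AttributeError at runtime, which is the failure the incompatible-override check exists to prevent.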

We could do a bunch of chicanery with another entry in the list of generics, so it would look something like:

_CtxT = TypeVar("_CtxT", bound=ContextBase)
class SOMAObject(Generic[_CtxT]):
  def open(..., context: Optional[_CtxT] = None) -> Self: ...

# elsewhere:

class ImplObject(somacore.SOMAObject[ImplContext]):
  def open(..., context: Optional[ImplContext] = None) -> Self: ...

but that would also require adding a _CtxT to every SOMA object, which would be particularly unwieldy on the already highly-genericised Measurement type:

class Measurement(
  collection.BaseCollection[_RootSO],
  Generic[_DF, _NDColl, _DenseNDColl, _SparseNDColl, _RootSO],
):

so instead, I think the easier thing to do is to opt out of type-checking for this parameter when being passed in, which, while imperfect, will allow it to be overridden with type information in the implementation. So the end result looks like:

class SOMAObject(metaclass=abc.ABCMeta):
  @classmethod
  def open(..., context: Optional[Any] = None) -> Self:
    ...
  # Ignore type-checking the specific type allowed in `open`

  @property
  def context(self) -> Optional[types.ContextBase]: ...
  # Returning a subclass is OK, though.

Then, at the implementation, we can say:

class ImplObject(SOMAObject):
  @classmethod
  def open(..., context: Optional[ImplContext] = None) -> Self: ...
  # Allowed since the superclass `context` parameter type is unchecked.

  @property
  def context(self) -> Optional[ImplContext]: ...
  # Allowed since you can narrow returned types.
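
A runnable version of this pattern might look like the following. It is a sketch under assumptions, not the somacore implementation: the constructor, the `label` member, and the string return annotations (used in place of `Self` to avoid the typing_extensions dependency) are all illustrative.

```python
import abc
from concurrent import futures
from typing import Any, Optional


class ContextBase:
    """Hypothetical base context; only promises an optional thread pool."""

    threadpool: Optional[futures.ThreadPoolExecutor] = None


class ImplContext(ContextBase):
    """Hypothetical implementation-specific context."""

    def __init__(self, label: str) -> None:
        self.label = label


class SOMAObject(metaclass=abc.ABCMeta):
    def __init__(self, uri: str, context: Optional[ContextBase]) -> None:
        self.uri = uri
        self._context = context

    @classmethod
    def open(cls, uri: str, *, context: Optional[Any] = None) -> "SOMAObject":
        # Parameter typed as Any so subclasses may declare a narrower type.
        return cls(uri, context)

    @property
    def context(self) -> Optional[ContextBase]:
        return self._context


class ImplObject(SOMAObject):
    @classmethod
    def open(
        cls, uri: str, *, context: Optional[ImplContext] = None
    ) -> "ImplObject":
        return cls(uri, context)

    @property
    def context(self) -> Optional[ImplContext]:
        # Narrowing a return type is a compatible override.
        ctx = self._context
        assert ctx is None or isinstance(ctx, ImplContext)
        return ctx
```

The trade-off is that the base class no longer checks what is passed as `context`, while callers of the implementation still get a precisely typed parameter and accessor.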

Comment on lines 55 to 57
def exists(
  cls, uri: str, *, context: Optional[ContextBase] = None
) -> Literal[False]:
Contributor

In light of the above, this can be left as context: Any = None (i.e. no changes to this file at all).

Comment on lines 595 to 596
self.experiment.context is not None
and self.experiment.context.threadpool is not None
Contributor

nit: since context and context.threadpool will be truthy if they are set at all, this can be

context = self.experiment.context
if context and context.threadpool:
  return context.threadpool

without the need for is not None.
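
This shorthand relies on Python's truth-value rules: an object is truthy unless its class defines `__bool__` returning False or `__len__` returning 0. `concurrent.futures.ThreadPoolExecutor` defines neither, so a pool that is set is always truthy. The sketch below illustrates this with hypothetical names (`Context`, `pick_pool`, `SizedContext`); the last class shows the one case where the two forms would differ.

```python
from concurrent import futures
from typing import Optional


class Context:
    """Hypothetical context holding an optional thread pool."""

    def __init__(
        self, threadpool: Optional[futures.ThreadPoolExecutor] = None
    ) -> None:
        self.threadpool = threadpool


def pick_pool(context: Optional[Context]) -> Optional[futures.ThreadPoolExecutor]:
    # Truthiness check; for plain objects it matches two `is not None` tests.
    if context and context.threadpool:
        return context.threadpool
    return None


class SizedContext(Context):
    """Hypothetical context that reports itself as an empty container."""

    def __len__(self) -> int:
        return 0
```

For a context class like `SizedContext`, the truthiness form would skip a pool that is set, whereas `is not None` would not. For ordinary context classes the two forms agree.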

Comment on lines 83 to 84
If a threadpool is specified as part of the context, it will be used by the
implementer. Otherwise, the implementer will use its own threadpool.
Contributor

"it will be used by experiment queries", maybe? The implementer would be the one providing it.

Contributor

@thetorpedodog thetorpedodog left a comment

Looks good! Thanks for the bump.

Member

@bkmartinjr bkmartinjr left a comment

LGTM

@ebezzi ebezzi merged commit 9b282eb into main Feb 12, 2024
6 checks passed
@ebezzi ebezzi deleted the ebezzi/threadpool-context branch February 12, 2024 17:55
3 participants