
performance of is_type #86

Open
ggoretkin-bdai opened this issue Jun 20, 2023 · 10 comments
@ggoretkin-bdai

I am interested in reducing the performance penalty of using Plum in my application versus dispatching "manually" with an if/elif isinstance chain. Profiling (flamegraph below) reveals that a majority of time is spent in is_type:

plum/plum/parametric.py

Lines 282 to 298 in b38a162

def is_type(x):
    """Check whether `x` is a type or a type hint.

    Under the hood, this attempts to construct a :class:`beartype.door.TypeHint`
    from `x`. If successful, then `x` is deemed a type or type hint.

    Args:
        x (object): Object to check.

    Returns:
        bool: Whether `x` is a type or a type hint.
    """
    try:
        TypeHint(x)
        return True
    except BeartypeDoorNonpepException:
        return False

[flamegraph image]

For my specific microbenchmark, is_type is only ever called with ONE value, a plum.parametric.Val[example_py_multiple_dispatch.zoo.MyType]().

Any ideas for improving the performance? Are there any correctness issues with memoizing the result of is_type? I see that there is memoization happening at lower levels via https://github.com/beartype/beartype/blob/49560b8d4d788d279a02283d8da995895440554c/beartype/_util/cache/utilcachecall.py .

@wesselb (Member) commented Jun 26, 2023

Hey @ggoretkin-bdai!

If you would like the best performance, the simplest way is to only use so-called faithful types. I'm actually thinking that Val is a faithful type, so simply setting Val.__faithful__ = True could solve this without correctness issues?!

Here's a small benchmark:

In [20]: from plum import dispatch, Val, clear_all_cache

In [21]: @dispatch
    ...: def f(x: Val[1]):
    ...:     pass
    ...:

In [22]: x = Val(1)

In [23]: %timeit f(x)  # Not faithful...
5.12 µs ± 106 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [24]: Val.__faithful__ = True

In [25]: clear_all_cache()

In [26]: %timeit f(x)  # Faithful!
554 ns ± 9.35 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Would you be able to try this on your end, to see if this works and things still run correctly?

@ggoretkin-bdai (Author)

That would be great! I will give this a shot and report back.

For now, I have a few questions. typing.Literal is an immediate example of a non-faithful type, and Val seems similar to typing.Literal. But the definition of "faithful type" does not seem to give a definite answer for either Literal or Val. I understand the definition to be:

def is_faithful(type_, for_all_x):
    for x in for_all_x:
        if isinstance(x, type_) == issubclass(type(x), type_):
            pass
        else:
            return False
    return True

In [22]: is_faithful(int, [1, 42, 1234])
Out[22]: True


In [23]: is_faithful(typing.Literal[1], [1, 42, 1234])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 1
----> 1 is_faithful(typing.Literal[1], [1, 42, 1234])

Cell In[21], line 3, in is_faithful(type_, for_all_x)
      1 def is_faithful(type_, for_all_x):
      2     for x in for_all_x:
----> 3         if isinstance(x, type_) == issubclass(type(x), type_):
      4             pass
      5         else:

File /usr/lib/python3.10/typing.py:994, in _BaseGenericAlias.__instancecheck__(self, obj)
    993 def __instancecheck__(self, obj):
--> 994     return self.__subclasscheck__(type(obj))

File /usr/lib/python3.10/typing.py:997, in _BaseGenericAlias.__subclasscheck__(self, cls)
    996 def __subclasscheck__(self, cls):
--> 997     raise TypeError("Subscripted generics cannot be used with"
    998                     " class and instance checks")

TypeError: Subscripted generics cannot be used with class and instance checks

In [24]: is_faithful(plum.Val(1), [1, 42, 1234])
plum.parametric.Val[1]()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[24], line 1
----> 1 is_faithful(plum.Val(1), [1, 42, 1234])

Cell In[21], line 3, in is_faithful(type_, for_all_x)
      1 def is_faithful(type_, for_all_x):
      2     for x in for_all_x:
----> 3         if isinstance(x, type_) == issubclass(type(x), type_):
      4             pass
      5         else:

TypeError: isinstance() arg 2 must be a type, a tuple of types, or a union

@wesselb (Member) commented Jun 26, 2023

Perhaps the definition is better understood in terms of natural language.

Consider, for example, Literal[1]. Then isinstance(x, Literal[1]) if and only if x == 1. (Note that isinstance(x, Literal[1]) won't actually run, because isinstance doesn't work with generics, but let's suppose that it would.) On the other hand, for x = 1, we have type(x) == int, so issubclass(type(x), Literal[1]) == issubclass(int, Literal[1]), which is false, because int is more general than Literal[1]. (Again, note that issubclass(int, Literal[1]) won't actually run, but let's suppose that it would check whether int is a subtype of Literal[1].) Therefore, isinstance(x, Literal[1]) and issubclass(type(x), Literal[1]) are unequal for x = 1, so Literal[1] is unfaithful.

What faithfulness measures is whether taking the type of x retains enough information to still check whether isinstance(x, Literal[1]) or not. Since type(x) == int, if we know type(x), we only know that x is an integer. However, to check whether isinstance(x, Literal[1]) or not, we need to know specifically that x is 1, not just that x is an integer! That is what goes wrong in the above paragraph.
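As a minimal stdlib-only sketch of this point (using typing.get_args to emulate the conceptual isinstance check for Literal, since the real isinstance call raises TypeError):

```python
from typing import Literal, get_args

def matches_literal(x, lit) -> bool:
    # Conceptual "isinstance(x, Literal[...])": requires the value x itself.
    return x in get_args(lit)

print(matches_literal(1, Literal[1]))   # True: the value 1 matches
print(matches_literal(42, Literal[1]))  # False
# Yet both values have the same type, so type(x) alone cannot decide:
print(type(1) is type(42))              # True
```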

The case of Val is different, because it uses Plum's parametric machinery under the hood. In that case, we have that type(Val(1)) == Val[1]. Hence, if x = Val(1) and we know just type(x), then we know that x is of the type Val[1], so x must have been Val(1). Therefore, Val is a faithful type. Val, however, is not a free lunch: constructing Val(1) is relatively expensive compared to a lightweight wrapper object.
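The parametric mechanism can be illustrated with a hypothetical Val-like factory (a sketch, not Plum's actual implementation): each parameter gets its own class, so type(x) alone recovers the parameter and no information is lost.

```python
_val_classes = {}

def ValLike(p):
    """Return a distinct (memoized) class for parameter p."""
    if p not in _val_classes:
        _val_classes[p] = type(f"ValLike[{p!r}]", (), {"parameter": p})
    return _val_classes[p]

x = ValLike(1)()
print(type(x) is ValLike(1))  # True: the class itself encodes the parameter
print(type(x).parameter)      # 1: recoverable from type(x) alone
```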

@ggoretkin-bdai (Author)

Thanks for the explanation, @wesselb .

I gave your suggestion a try, and I did not notice a significant performance difference. I extracted a minimal example here:
https://github.com/ggoretkin-bdai/investigate-plum-performance/blob/2a9675cf340fe4f1f3adec42924cb154d453fe96/test/test_zoo.py#L8-L10

In [1]: from plum import Val

In [2]: Val.__faithful__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 Val.__faithful__

AttributeError: type object 'Val' has no attribute '__faithful__'

In [3]: %run test/test_zoo.py

In [4]: %timeit workload_multiple_dispatch()
1.27 s ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: Val.__faithful__ = True

In [6]: %timeit workload_multiple_dispatch()
1.23 s ± 4.79 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit workload_elif_chain()
34.1 ms ± 975 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

@wesselb (Member) commented Jun 27, 2023

Hey @ggoretkin-bdai! Ah... I think the benchmark I posted above is incorrect. By default, Val is actually assumed to be faithful. That's my bad! What appears to be going wrong is that Val is pretty expensive to construct:

In [1]: from plum import dispatch, Val, clear_all_cache

In [2]: @dispatch
   ...: def f(x: Val[1]):
   ...:     pass
   ...:

In [3]: x = Val(1)

In [4]: %timeit f(x)
571 ns ± 7.19 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [5]: %timeit f(Val(1))
38.2 µs ± 411 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [6]: %timeit Val(1)
36.5 µs ± 539 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Currently, parametric isn't optimised for performance, so it might be possible to make constructing Val(1) faster, if that is what you would be after. Alternatively, you could use caching, in this way:

In [7]: val = {1: Val(1)}

In [8]: %timeit f(val[1])
663 ns ± 36 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

@ggoretkin-bdai (Author) commented Jun 27, 2023

The ultimate use case I (currently) care about is the one in the benchmark: defining convert methods. If there were a way to dispatch directly on the type via something like https://docs.julialang.org/en/v1/base/base/#Core.Type , that would feel the most natural to me, and would not require constructing a Val. However, this would be a new feature in Plum, and it might have other performance implications.

@wesselb (Member) commented Jun 28, 2023

How about something like this?

In [1]: from plum import dispatch, Val

In [2]: _cache = {}

In [3]: def convert(t, x):
   ...:     try:
   ...:         val = _cache[t]
   ...:     except KeyError:
   ...:         _cache[t] = Val(t)
   ...:         val = _cache[t]
   ...:     return _convert(val, x)
   ...:

In [4]: class A:
   ...:     def __init__(self, x):
   ...:         self.x = x
   ...:

In [5]: class B:
   ...:     def __init__(self, x):
   ...:         self.x = x
   ...:

In [6]: @dispatch
   ...: def _convert(t: Val[A], x: B):
   ...:     return A(x.x)
   ...:

In [7]: @dispatch
   ...: def _convert(t: Val[B], x: A):
   ...:     return B(x.x)
   ...:

In [8]: convert(A, B(1))
Out[8]: <__main__.A at 0x7f5cc6728b50>

In [9]: %timeit convert(A, B(1))
1.88 µs ± 65.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [10]: %timeit A(B(1).x)
592 ns ± 26 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

You might be able to squeeze out a little more performance, but this looks pretty good to me already.

@ggoretkin-bdai (Author) commented Jun 28, 2023

With the caching you suggest, for the benchmark I was using, Plum is now only 7x slower, versus 50x slower: ggoretkin-bdai/investigate-plum-performance@e07f658

A couple of comments:

  • Should Plum perform this caching internally? Why not memoize is_type?
  • Independently, I probably should use the existing plum.convert functionality, which side-steps the issue of Val creation entirely. This improves performance to a 4x penalty! ggoretkin-bdai/investigate-plum-performance@bf27b3a

I am now trying to see how I can avoid annotating the from/to types twice:

@conversion_method(type_from=SpatialMath_SE3, type_to=BD_SE3Pose) # type: ignore[no-redef]
def convert_whatever(from_: SpatialMath_SE3) -> BD_SE3Pose:  # noqa: F811
    return BD_SE3Pose(from_.data5)

I need them annotated in the type hints, but would like to avoid annotating them in the decorator, as is achieved via:

signature = extract_signature(f, precedence=precedence)

signature (:class:.signature.Signature, optional): Signature. If it is not given, it will be derived from f.

@ggoretkin-bdai (Author) commented Jun 28, 2023

Here is an attempt that appears to work:
ggoretkin-bdai/investigate-plum-performance@3af609f

by defining

def conversion_method_from_signature(f):
    """Decorator to add a conversion method to convert an object from one
    type to another.

    Like `plum.conversion_method` but the arguments:
        type_from (type): Type to convert from.
        type_to (type): Type to convert to.
    are extracted from the type annotations
    """
    signature = plum.extract_signature(f)
    [type_from] = signature.types
    type_to = signature.return_type
    plum.promotion.add_conversion_method(type_from, type_to, f)
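For illustration, the annotation extraction that plum.extract_signature performs here can be approximated with the standard library alone (extract_from_to is a hypothetical helper, not part of Plum):

```python
import typing

def extract_from_to(f):
    """Pull (type_from, type_to) out of f's type annotations."""
    hints = typing.get_type_hints(f)
    type_to = hints.pop("return")
    [type_from] = hints.values()  # assumes exactly one argument
    return type_from, type_to

def to_str(x: int) -> str:
    return str(x)

print(extract_from_to(to_str))  # (<class 'int'>, <class 'str'>)
```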

@wesselb (Member) commented Jun 28, 2023

@ggoretkin-bdai, to answer your questions:

  • Yes, ideally more caching should happen to optimise performance. I think that is_type can be safely cached, though I should think this over more carefully. If you'd like to submit a PR, that would be more than welcome. :) Otherwise, I'll put it on the list of TODOs.

  • Great!! 4x is getting closer. plum.convert definitely isn't yet optimised for performance, so perhaps something can still be squeezed there.
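The memoization mentioned in the first point could be sketched along these lines (a hypothetical sketch, using a stand-in predicate for the TypeHint construction in is_type; unhashable type hints need an uncached fallback):

```python
import functools

def _expensive_check(x) -> bool:
    # Stand-in for attempting TypeHint(x) in plum's is_type.
    return isinstance(x, type)

@functools.lru_cache(maxsize=None)
def _cached_check(x) -> bool:
    return _expensive_check(x)

def is_type_memoized(x) -> bool:
    try:
        return _cached_check(x)
    except TypeError:  # unhashable argument: fall back to the uncached path
        return _expensive_check(x)

print(is_type_memoized(int))  # True
print(is_type_memoized([]))   # False (unhashable, uses the fallback)
```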

Your conversion_method_from_signature looks great. :) That's exactly what I would've suggested.
