This issue is for discussing approaches to improving performance of pattern matching for built-in types.
The current approach is inefficient. Details can be found in this Note; the gist is that pattern matching over lists is currently implemented as (pseudocode)
matchList :: [a] -> b -> (a -> [a] -> b) -> b
matchList xs z f = chooseList xs (\_ -> z) (\_ -> f (head xs) (tail xs)) ()
Here chooseList, head and tail are all built-in functions (chooseList returns either its second or its third argument depending on whether its first argument is null or not, respectively). Therefore in the nil case we end up performing one builtin call, and in the cons case we end up performing three builtin calls. Note how we also have to pass a unit argument around in order not to always evaluate all branches: Plutus is strict, hence chooseList xs z (f (head xs) (tail xs)) would have the wrong semantics.
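The pseudocode above can be turned into a small runnable Haskell sketch. This is only a model, not the Plutus implementation: the built-ins are replaced by ordinary functions, headList and tailList are renamed here to avoid clashing with the Prelude, and chooseList forces both branches with seq purely as an assumption to imitate strict application in lazy Haskell.

```haskell
-- Stand-ins for the built-in functions (assumed for illustration only).

-- Forces both branches before choosing one, imitating strict application.
chooseList :: [a] -> b -> b -> b
chooseList xs z f = z `seq` f `seq` (if null xs then z else f)

headList :: [a] -> a
headList (x : _) = x
headList []      = error "headList: empty list"

tailList :: [a] -> [a]
tailList (_ : xs) = xs
tailList []       = error "tailList: empty list"

-- The encoding from the issue: both branches are delayed behind a lambda,
-- so forcing them is harmless, and only the chosen one is applied to ().
-- Nil costs one builtin call (chooseList); cons costs three
-- (chooseList, headList, tailList).
matchList :: [a] -> b -> (a -> [a] -> b) -> b
matchList xs z f =
  chooseList xs (\_ -> z) (\_ -> f (headList xs) (tailList xs)) ()

-- The naive encoding without the unit argument: the cons branch is forced
-- even when xs is empty, so a branch that inspects the head fails on [].
matchListNaive :: [a] -> b -> (a -> [a] -> b) -> b
matchListNaive xs z f = chooseList xs z (f (headList xs) (tailList xs))
```

Loaded in GHCi, matchList [1, 2, 3] 0 (\x xs -> x + length xs) picks the cons branch, while the same call on an empty list returns the default 0.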
There have been two proposals on how to implement faster pattern matching for built-in types:
- allow expressing pattern matching built-in functions directly, implemented in [Builtins] Add support for pattern matching builtins #5486
- piggy-back on the pattern matching machinery that we use for sums-of-products, implemented in [Builtins] Add 'ListToConstr' and 'DataToConstr' #5704
Both PRs are documented, see the initial comment on each PR if you want to understand the details of design and implementation.
Both approaches work and give us comparable speedup on built-in-list-specific benchmarks.
This is what we get for (1):
This is what we get for (2):
It may appear that (2) is faster. However, the benchmarking machine can apparently be off by 6% and probably even more (see this slack discussion), so we can't really rely on these results. It's still clear that both approaches give us a meaningful improvement of about the same scale.
If we compare the two approaches analytically, this is what we find:
- (1) has to keep unit arguments around in order for pattern matching not to force all the branches at once and only force the picked one, while (2), being backed by SOPs, has no such restriction since case is specifically designed to be a proper pattern matching construct (unlike function application, which is strict in Plutus). E.g. for (1) we have the following diff:
while for (2) it's
I.e. (2) wins for this one.
- in (1) matchList is literally a single built-in function call, while in (2) it's a built-in function call + a case:
which affects not only performance, but also size, which is an even scarcer resource.
I.e. (1) wins for this one.
- (1) requires us to amend the builtins machinery in a way that allows for returning an applied function. Not only is this a much larger change (propagating through many parts of the builtins machinery, including tests) compared to what (2) requires, it also introduces a slowdown for all existing builtins: handling exactly one output of a built-in function is a bit faster than handling any non-zero number of them, because one doesn't need to case on the result to figure out whether it contains a single output or several. (If a builtin returns f x y, we simply return all of those terms separately and let the evaluator handle reducing the application. We have to do that, because each evaluator deals with reducing applications differently.)
I.e. (2) wins for this one.
- the whole issue we've been discussing here is improving performance of pattern matching for built-in types. Extending the builtins machinery seems very appropriate for that, particularly with something as straightforward as "allow for returning a function application", while teaching builtins about SOPs and hardcoding a specific SOP representation right into some built-in functions feels weird -- why do builtins and SOPs have to be intertwined this way?
Maybe it's fine, but I believe (1) wins for this one.
- as per Michael's comment, SOPs aren't supported in all versions of Plutus, so we need to figure out what to do for the early ones that don't support them.
I.e. (1) wins for this one.
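To make the trade-offs in the list above easier to picture, here is a toy Haskell model of the two shapes. It is a sketch under stated assumptions, not the code from either PR: Value, BuiltinResult, reduce, apply, caseOf, pickHead and the constructor tags (0 for nil, 1 for cons) are all hypothetical names and choices made for illustration. Because Haskell is lazy, the model also does not show the unit-delayed branches that (1) needs under strict evaluation.

```haskell
-- A tiny untyped value universe (hypothetical, not the Plutus types).
data Value
  = VInt Integer
  | VList [Value]
  | VFun (Value -> Value)
  | VConstr Int [Value] -- SOP value: constructor tag plus fields

-- What a builtin hands back to the evaluator in this model.
data BuiltinResult
  = Single Value          -- exactly one output
  | Applied Value [Value] -- approach (1): head and arguments, left unapplied

apply :: Value -> Value -> Value
apply (VFun g) v = g v
apply _        _ = error "apply: not a function"

-- Evaluator side: it has to case on the result before it can proceed,
-- which is the extra step the issue attributes to approach (1).
reduce :: BuiltinResult -> Value
reduce (Single v)       = v
reduce (Applied f args) = foldl apply f args

-- Approach (1): pattern matching is a single builtin call.
matchListBuiltin :: Value -> Value -> Value -> BuiltinResult
matchListBuiltin (VList [])       z _ = Single z
matchListBuiltin (VList (x : xs)) _ f = Applied f [x, VList xs]
matchListBuiltin _                _ _ = error "matchListBuiltin: not a list"

-- Approach (2): the builtin only converts the list to a constr value...
listToConstr :: Value -> BuiltinResult
listToConstr (VList [])       = Single (VConstr 0 [])
listToConstr (VList (x : xs)) = Single (VConstr 1 [x, VList xs])
listToConstr _                = error "listToConstr: not a list"

-- ...and a separate case step selects a branch by tag and feeds it the fields.
caseOf :: Value -> [Value] -> Value
caseOf (VConstr tag fields) branches = foldl apply (branches !! tag) fields
caseOf _                    _        = error "caseOf: not a constr"

-- Builtin call + case, as in approach (2).
matchViaCase :: Value -> Value -> Value -> Value
matchViaCase v z f = caseOf (reduce (listToConstr v)) [z, f]

-- Example cons branch that returns the head of the list.
pickHead :: Value
pickHead = VFun (\x -> VFun (\_ -> x))

asInt :: Value -> Integer
asInt (VInt n) = n
asInt _        = error "asInt: not an integer"
```

In this model both routes give the same answer for the same inputs. The difference is where the work happens: matchListBuiltin needs the Applied constructor and the extra case in reduce, while listToConstr only ever returns Single but needs the additional caseOf step.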
Overall, there's no clear winner. The performance comparison is inconclusive, with both approaches delivering meaningful improvement, although my very subjective feeling is that (2) slightly wins when it comes to performance. On the other hand, (1) makes builtins more expressive: maybe we'll end up needing to return function applications from builtins for some other reason eventually anyway?
I'm personally torn between the two options. Which one should we choose?






