[Builtins] Add pattern matching builtins #6530

Merged

effectfully
Contributor

This PR is forked from #5486; it is squashed and leaves out a lot of the noise in the comments.

The main change is replacing

data BuiltinRuntime val
    = BuiltinCostedResult ExBudgetStream ~(BuiltinResult val)
    | <...>

with

data BuiltinRuntime val
    = BuiltinCostedResult ExBudgetStream ~(BuiltinResult (HeadSpine val))
    | <...>

where HeadSpine is a fancy way of saying NonEmpty:

-- | A non-empty spine. Isomorphic to 'NonEmpty', except is strict and is defined as a single
-- recursive data type.
data Spine a
    = SpineLast a
    | SpineCons a (Spine a)

-- | The head-spine form of an iterated application. Provides O(1) access to the head of the
-- application. Isomorphic to @NonEmpty@, except is strict and the no-spine case is made a separate
-- constructor for performance reasons (it only takes a single pattern match to access the head when
-- there's no spine this way, while otherwise we'd also need to match on the spine to ensure that
-- it's empty -- and the no-spine case is by far the most common one, hence we want to optimize it).
data HeadSpine a
    = HeadOnly a
    | HeadSpine a (Spine a)

(We define a separate type because we want strictness. You don't see any bangs because the types live in a module with StrictData enabled.)
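To make the "fancy NonEmpty" remark concrete, here is a minimal standalone sketch of the isomorphism. It copies the two types from above but uses explicit bangs instead of StrictData; the conversion functions `spineToNonEmpty`, `toNonEmpty` and `fromNonEmpty` are illustrative names that are not part of this PR.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Simplified copies of the PR's types, with explicit bangs in place of StrictData.
data Spine a
    = SpineLast !a
    | SpineCons !a !(Spine a)
    deriving (Show, Eq)

data HeadSpine a
    = HeadOnly !a
    | HeadSpine !a !(Spine a)
    deriving (Show, Eq)

-- | Convert a spine to an ordinary non-empty list (hypothetical helper).
spineToNonEmpty :: Spine a -> NonEmpty a
spineToNonEmpty (SpineLast x)    = x :| []
spineToNonEmpty (SpineCons x xs) = case spineToNonEmpty xs of
    y :| ys -> x :| (y : ys)

-- | One direction of the isomorphism: the head first, then the spine, if any.
toNonEmpty :: HeadSpine a -> NonEmpty a
toNonEmpty (HeadOnly f)     = f :| []
toNonEmpty (HeadSpine f xs) = case spineToNonEmpty xs of
    y :| ys -> f :| (y : ys)

-- | The other direction: a singleton becomes 'HeadOnly', anything longer gets a spine.
fromNonEmpty :: NonEmpty a -> HeadSpine a
fromNonEmpty (f :| [])       = HeadOnly f
fromNonEmpty (f :| (x : xs)) = HeadSpine f (go x xs)
  where
    go y []       = SpineLast y
    go y (z : zs) = SpineCons y (go z zs)

main :: IO ()
main = do
    print (toNonEmpty (HeadOnly 'f'))
    print (fromNonEmpty ('f' :| "xy"))
```

Accessing the head of a `HeadOnly` value takes one pattern match, which is the common case the separate constructor is meant to optimize.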

The idea is that a builtin application can return a function applied to a bunch of arguments, which is exactly what we need to be able to express caseList

caseList xs0 f z = case xs0 of
   []   -> z
   x:xs -> f x xs

as a builtin:

-- | Take a function and a list of arguments and apply the former to the latter.
headSpine :: Opaque val asToB -> [val] -> Opaque (HeadSpine val) b
headSpine (Opaque f) = Opaque . \case
    []      -> HeadOnly f
    x0 : xs ->
        -- It's critical to use 'foldr' here, so that deforestation kicks in.
        -- See Note [Definition of foldl'] in "GHC.List" and related Notes around for an explanation
        -- of the trick.
        HeadSpine f $ foldr (\x2 r x1 -> SpineCons x1 $ r x2) SpineLast xs x0

instance uni ~ DefaultUni => ToBuiltinMeaning uni DefaultFun where
    <...>
    toBuiltinMeaning _ver CaseList =
        let caseListDenotation
                :: Opaque val (LastArg a b)
                -> Opaque val (a -> [a] -> b)
                -> SomeConstant uni [a]
                -> BuiltinResult (Opaque (HeadSpine val) b)
            caseListDenotation z f (SomeConstant (Some (ValueOf uniListA xs0))) = do
                case uniListA of
                    DefaultUniList uniA -> pure $ case xs0 of
                        []     -> headSpine z []                                             -- [1]
                        x : xs -> headSpine f [fromValueOf uniA x, fromValueOf uniListA xs]  -- [2]
                    _ ->
                        -- See Note [Structural vs operational errors within builtins].
                        throwing _StructuralUnliftingError "Expected a list but got something else"
            {-# INLINE caseListDenotation #-}
        in makeBuiltinMeaning
            caseListDenotation
            (runCostingFunThreeArguments . unimplementedCostingFun)

Being able to express [1] (representing z) and [2] (representing f x xs) is precisely what this PR enables.
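The `foldr` in `headSpine` can be hard to read at first. The sketch below reproduces it without the `Opaque` wrappers and compares it with a plainly recursive version; `headSpineDirect` and `spineFrom` are hypothetical names introduced only for this comparison, and the types are simplified copies of the ones above.

```haskell
-- Simplified copies of the PR's types (explicit bangs instead of StrictData).
data Spine a = SpineLast !a | SpineCons !a !(Spine a) deriving (Show, Eq)
data HeadSpine a = HeadOnly !a | HeadSpine !a !(Spine a) deriving (Show, Eq)

-- | Same shape as the PR's 'headSpine', minus the 'Opaque' wrappers.
headSpine :: a -> [a] -> HeadSpine a
headSpine f []        = HeadOnly f
headSpine f (x0 : xs) =
    -- The fold builds a function that awaits the previous element;
    -- feeding it @x0@ starts the chain, and 'SpineLast' closes it.
    HeadSpine f (foldr (\x2 r x1 -> SpineCons x1 (r x2)) SpineLast xs x0)

-- | A directly recursive version that should produce the same result.
headSpineDirect :: a -> [a] -> HeadSpine a
headSpineDirect f []        = HeadOnly f
headSpineDirect f (x0 : xs) = HeadSpine f (spineFrom x0 xs)
  where
    spineFrom y []       = SpineLast y
    spineFrom y (z : zs) = SpineCons y (spineFrom z zs)

main :: IO ()
main = do
    print (headSpine "z" [])             -- corresponds to case [1]
    print (headSpine "f" ["x", "xs"])    -- corresponds to case [2]
    print (headSpine "f" ["x", "xs"] == headSpineDirect "f" ["x", "xs"])
```

The recursive version is easier to follow, but per the comment in the PR the `foldr` form is what lets deforestation kick in.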

Adding support for the new functionality to the CEK machine is trivial. All we need is a way to push a Spine of arguments onto the context:

    -- | Push arguments onto the stack. The first argument will be the most recent entry.
    pushArgs
        :: Spine (CekValue uni fun ann)
        -> Context uni fun ann
        -> Context uni fun ann
    pushArgs args ctx = foldr FrameAwaitFunValue ctx args

and a HeadSpine version of returnCek:

    -- | Evaluate a 'HeadSpine' by pushing the arguments (if any) onto the stack and proceeding with
    -- the returning phase of the CEK machine.
    returnCekHeadSpine
        :: Context uni fun ann
        -> HeadSpine (CekValue uni fun ann)
        -> CekM uni fun s (Term NamedDeBruijn uni fun ())
    returnCekHeadSpine ctx (HeadOnly  x)    = returnCek ctx x
    returnCekHeadSpine ctx (HeadSpine f xs) = returnCek (pushArgs xs ctx) f

Then replacing

                BuiltinSuccess x ->
                    returnCek ctx x

with

                BuiltinSuccess fXs ->
                    returnCekHeadSpine ctx fXs

(and similarly for BuiltinSuccessWithLogs) will do the trick.
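The following toy model sketches how these pieces fit together. It is not the real CEK machine: `Value`, `Context`, `returnToy`, `caseListToy`, `headPlusLength` and `runToy` are made-up stand-ins, and only `pushArgs` and the `HeadSpine` dispatch mirror the definitions above.

```haskell
data Spine a = SpineLast a | SpineCons a (Spine a)
data HeadSpine a = HeadOnly a | HeadSpine a (Spine a)

instance Foldable Spine where
    foldr f z (SpineLast x)    = f x z
    foldr f z (SpineCons x xs) = f x (foldr f z xs)

-- | Toy values: integers, integer lists and host-level functions.
data Value = VInt Integer | VList [Integer] | VFun (Value -> Value)

-- | Toy context: a stack of arguments, each awaiting a function value.
data Context = NoFrame | FrameAwaitFunValue Value Context

-- | Push arguments onto the stack. The first argument will be the most recent entry.
pushArgs :: Spine Value -> Context -> Context
pushArgs args ctx = foldr FrameAwaitFunValue ctx args

-- | Toy returning phase: feed the returned value to the frames on the stack.
returnToy :: Context -> Value -> Either String Value
returnToy NoFrame v                             = Right v
returnToy (FrameAwaitFunValue arg ctx) (VFun f) = returnToy ctx (f arg)
returnToy (FrameAwaitFunValue _ _) _            = Left "applied a non-function"

-- | The 'HeadSpine' version of the returning phase.
returnHeadSpine :: Context -> HeadSpine Value -> Either String Value
returnHeadSpine ctx (HeadOnly x)     = returnToy ctx x
returnHeadSpine ctx (HeadSpine f xs) = returnToy (pushArgs xs ctx) f

-- | A toy caseList builtin: returns @z@ or the iterated application @f x xs@.
caseListToy :: Value -> Value -> [Integer] -> HeadSpine Value
caseListToy z _ []       = HeadOnly z
caseListToy _ f (x : xs) = HeadSpine f (SpineCons (VInt x) (SpineLast (VList xs)))

-- | A curried two-argument function: the head plus the length of the tail.
headPlusLength :: Value
headPlusLength = VFun $ \x -> VFun $ \xs -> case (x, xs) of
    (VInt n, VList rest) -> VInt (n + fromIntegral (length rest))
    _                    -> VInt (-1) -- toy fallback; a real machine would fail here

runToy :: [Integer] -> Either String Integer
runToy xs = case returnHeadSpine NoFrame (caseListToy (VInt 0) headPlusLength xs) of
    Right (VInt n) -> Right n
    Right _        -> Left "not an integer"
    Left err       -> Left err

main :: IO ()
main = mapM_ (print . runToy) [[], [10, 20, 30]]
```

The builtin itself never applies `f`; it only describes the application, and the machine carries it out by pushing frames, which is why no extra machinery is needed beyond `pushArgs`.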

We used to define caseList in terms of IfThenElse, NullList and either HeadList or TailList depending on the result of NullList, i.e. three builtin calls in both the worst and the best case. Then we introduced ChooseList, which replaced both IfThenElse and NullList in caseList, bringing the total number of builtin calls down to 2 in all cases. This turned out to have a substantial impact on performance (see the review on IntersectMBO#4119). This PR brings the total number of builtin calls per caseList invocation down to 1: the CaseList builtin itself.
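For orientation, the three generations of caseList can be sketched on ordinary Haskell lists. The lowercase functions below are local stand-ins named after the builtins, not the Plutus definitions, and the sketch only shows the shape of each formulation; it makes no attempt to count builtin calls or model costing.

```haskell
-- Local stand-ins for the builtins, defined on ordinary Haskell lists.
ifThenElse :: Bool -> a -> a -> a
ifThenElse b t e = if b then t else e

nullList :: [a] -> Bool
nullList = null

headList :: [a] -> a
headList (x : _) = x
headList []      = error "headList: empty list"

tailList :: [a] -> [a]
tailList (_ : xs) = xs
tailList []       = error "tailList: empty list"

chooseList :: [a] -> b -> b -> b
chooseList xs onNil onCons = case xs of
    [] -> onNil
    _  -> onCons

-- | Oldest shape: 'ifThenElse' and 'nullList' pick a delayed branch, which is then forced.
caseListViaNull :: r -> (a -> [a] -> r) -> [a] -> r
caseListViaNull z f xs =
    ifThenElse (nullList xs) (\() -> z) (\() -> f (headList xs) (tailList xs)) ()

-- | Middle shape: 'chooseList' replaces the 'ifThenElse' / 'nullList' pair.
caseListViaChoose :: r -> (a -> [a] -> r) -> [a] -> r
caseListViaChoose z f xs =
    chooseList xs (\() -> z) (\() -> f (headList xs) (tailList xs)) ()

-- | Newest shape: a single primitive that does the whole match.
caseListDirect :: r -> (a -> [a] -> r) -> [a] -> r
caseListDirect z _ []       = z
caseListDirect _ f (x : xs) = f x xs

main :: IO ()
main = mapM_ check [[], [1], [1, 2, 3 :: Int]]
  where
    f x rest = x + length rest
    check xs = print
        ( caseListViaNull 0 f xs
        , caseListViaChoose 0 f xs
        , caseListDirect 0 f xs
        )
```

A comparison of this kind, run against the real builtins, is also one possible shape for an equivalence test between the old and new definitions.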

Comment on lines -1010 to +1083
(`$fUnsafeFromDataList_$cunsafeFromBuiltinData`
{TxInInfo}
`$fUnsafeFromDataScriptContext_$cunsafeFromBuiltinData`
(headList {data} args))
(`$fUnsafeFromDataList_$cunsafeFromBuiltinData`
{TxInInfo}
`$fUnsafeFromDataScriptContext_$cunsafeFromBuiltinData`
(headList {data} l))
(`$fUnsafeFromDataList_$cunsafeFromBuiltinData`
{TxOut}
`$fUnsafeFromDataTxOut_$cunsafeFromBuiltinData`
(headList {data} l))
(let
!d : data = headList {data} args
in
go (unListData d))
(let
!d : data = headList {data} l
in
go (unListData d))
(let
!d : data = headList {data} l
in
go (unListData d))
Contributor Author

@effectfully, Oct 1, 2024

We end up having slightly more PIR code sometimes, but I suppose the reason is that we get more stuff inlined as you can see here. All those go are different optimized instantiations of $fUnsafeFromDataScriptContext_$cunsafeFromBuiltinData, so it's not surprising that we get more PIR code.

({cpu: 13100114647000
| mem: 13100000559240})
Contributor Author

This is due to unimplementedCostingFun.

Member

Can you put in an approximate cost model and see how these numbers look? I assume the cost model will be in a similar range to chooseList's? Even if it is a bad guess, it won't be more meaningless than these numbers!

Contributor Author

Whoops, I've done it already, just forgot to mention it. Here are the numbers:

+++ b/plutus-benchmark/lists/test/Lookup/9.6/match-builtin-list-50.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 4254600144
-| mem: 18263632})
\ No newline at end of file
+({cpu: 2884053694
+| mem: 12257832})

+++ b/plutus-benchmark/lists/test/Sum/9.6/left-fold-built-in.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 123874594
-| mem: 533932})
\ No newline at end of file
+({cpu: 86545294
+| mem: 397232})

+++ b/plutus-benchmark/lists/test/Sum/9.6/left-fold-data.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 279116232
-| mem: 1124130})
\ No newline at end of file
+({cpu: 215640257
+| mem: 912434})

+++ b/plutus-benchmark/lists/test/Sum/9.6/right-fold-built-in.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 128674594
-| mem: 563932})
\ No newline at end of file
+({cpu: 91345294
+| mem: 427232})

+++ b/plutus-benchmark/marlowe/test/semantics/9.6/f2a8fd2014922f0d8e01541205d47e9bb2d4e54333bdd408cbe7c47c55e73ae4.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 689404030
-| mem: 2815274})
\ No newline at end of file
+({cpu: 654073429
+| mem: 2668846})

+++ b/plutus-benchmark/script-contexts/test/9.6/checkScriptContext2-20.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 281562223
-| mem: 1042586})
\ No newline at end of file
+({cpu: 268649443
+| mem: 1019846})
\ No newline at end of file

+++ b/plutus-benchmark/script-contexts/test/9.6/checkScriptContext2-4.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 84515135
-| mem: 323994})
\ No newline at end of file
+({cpu: 80537379
+| mem: 310726})

+++ b/plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/currencySymbolValueOf.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 23159162
-| mem: 65580})
\ No newline at end of file
+({cpu: 18013406
+| mem: 45012})

+++ b/plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq1.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 620731320
-| mem: 1881010})
\ No newline at end of file
+({cpu: 527159505
+| mem: 1514810})

+++ b/plutus-tx-plugin/test/Budget/9.6/builtinListIndexing.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 8401207
-| mem: 33930})
\ No newline at end of file
+({cpu: 6452329
+| mem: 27546})

+++ b/plutus-tx-plugin/test/Budget/9.6/map2-budget.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 127361368
-| mem: 398526})
\ No newline at end of file
+({cpu: 105442279
+| mem: 312734})

Basically, the numbers reflect the speedup percentages very well, as one would expect.

I assume we don't want to commit the approximate cost model, given that it would increase the chances of us accidentally enabling the feature for users prematurely.
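To read the golden diffs above as relative savings, a small helper like the following can be used. It is only a convenience sketch: `reductionPercent` and `cpuBudgets` are hypothetical names, and the table contains just a few of the CPU figures copied from the diffs.

```haskell
import Text.Printf (printf)

-- | Percentage by which @after@ is smaller than @before@.
reductionPercent :: Integer -> Integer -> Double
reductionPercent before after =
    100 * fromIntegral (before - after) / fromIntegral before

-- | (benchmark, cpu before, cpu after), copied from the golden diffs above.
cpuBudgets :: [(String, Integer, Integer)]
cpuBudgets =
    [ ("match-builtin-list-50", 4254600144, 2884053694)
    , ("left-fold-built-in",    123874594,  86545294)
    , ("currencySymbolValueOf", 23159162,   18013406)
    , ("builtinListIndexing",   8401207,    6452329)
    ]

main :: IO ()
main = mapM_ report cpuBudgets
  where
    report (name, before, after) =
        printf "%-24s %5.1f%%\n" name (reductionPercent before after)
```

The same function applies unchanged to the memory figures.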

@@ -43,7 +43,7 @@ import Data.ByteString (ByteString)
-- > (fI : integer -> r)
-- > (fB : bytestring -> r) ->
-- > fix {data} {r} \(rec : data -> r) (d : data) ->
-- > caseData
-- > matchData
Contributor Author

Aiken folks asked us to make caseList and caseData accept the list/data as the last argument (which makes a lot of sense optimization-wise), so I did that and renamed all existing occurrences of caseList/caseData to matchList/matchData for consistency (we already have those, and they accept the list/data as the first argument). I'd much prefer to do it the opposite way, but it's too late at this point, so consistency and backwards compatibility win.
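As a rough illustration of the two argument orders, here is a sketch on ordinary Haskell lists. These are not the Plutus definitions; `sumList` is a made-up consumer, and the sketch makes no claim about which order evaluates faster.

```haskell
-- | List-last order, as in the new builtin.
caseList :: r -> (a -> [a] -> r) -> [a] -> r
caseList z _ []       = z
caseList _ f (x : xs) = f x xs

-- | List-first order, as in the existing matchList; here simply defined via 'caseList'.
matchList :: [a] -> r -> (a -> [a] -> r) -> r
matchList xs z f = caseList z f xs

-- | With the list last, a consumer can be written without naming the list.
sumList :: [Integer] -> Integer
sumList = caseList 0 (\x xs -> x + sumList xs)

main :: IO ()
main = do
    print (sumList [1, 2, 3])
    print (matchList "abc" 0 (\_ rest -> length rest))
```

The two functions differ only in where the scrutinee goes, so each can be recovered from the other by reordering arguments.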

KnownTypeAst PLC.TyName DefaultUni a =>
TypeRep a ->
PLC.Term PLC.TyName PLC.Name DefaultUni PLC.DefaultFun ()
smallTerm tr0 = go (toTypeAst tr0) tr0 where
Contributor Author

Now that we have builtins accepting functions, we have to generate functions for various tests.

@@ -0,0 +1 @@
all a. list a -> (all r. r -> (a -> list a -> r) -> r)
Contributor Author

Hate those parens, but we'll probably never have the time to implement the idea.

Contributor Author

The elaborator needed surprisingly few tweaks to keep track of the current context.

275
Contributor Author

Not bad.

go t = fail $ "Failed to decode builtin tag, got: " ++ show t

size _ n = n + builtinTagWidth

{- Note [Legacy pattern matching on built-in types]
Contributor Author

This is an old Note. I just moved it to the bottom of the file, so that we still have an explanation for all the chooseList etc. stuff while not wasting a lot of precious space in the middle of the file.

@@ -183,7 +205,7 @@ testCosts
:: BuiltinSemanticsVariant DefaultFun
-> BuiltinsRuntime DefaultFun Term
-> DefaultFun
-> Assertion
-> TestTree
Contributor Author

We should probably add type checking to that test, because I accidentally generated a bunch of ill-typed code and the test was happy to swallow that.

@effectfully
Contributor Author

Latest benchmarking results (before the benchmarking machine caught a flu) are here.

@effectfully effectfully force-pushed the effectfully/builtins/add-pattern-matching-builtins branch from 15f6399 to 1d1f1b5 Compare October 1, 2024 21:28
@effectfully effectfully force-pushed the effectfully/builtins/add-pattern-matching-builtins branch from 1d1f1b5 to 0777db3 Compare October 1, 2024 23:21
Contributor

@ana-pantilie left a comment

I assume the costing is all messed up because we haven't introduced a cost model yet, right? So that will come in a future PR?

I didn't understand all of the details in the code, but I think I now finally understand the big picture and how we can use the new evaluator. Thank you for this!

Approving because it makes sense to me, but I am not very familiar with the code so maybe someone more knowledgeable should also approve.

@effectfully
Contributor Author

> I assume the costing is all messed up because we haven't introduced a cost model yet, right? So that will come in a future PR?

Yes and yes.

Member

@zliu41 left a comment

How is correctness tested, e.g., verifying that CaseList behaves the same as the equivalent term defined via ChooseList?

3157
Member

Any idea of the reason for the increased sizes?

Contributor Author

My guess is that the new definitions are more inliner-friendly, which creates some inlining and monomorphization opportunities and thus increases code size; see this comment.

@effectfully effectfully force-pushed the effectfully/builtins/add-pattern-matching-builtins branch from fe85db0 to f173b1f Compare October 3, 2024 14:57
Member

@zliu41 left a comment

Looks good, except that clarification is needed for the testing question above.

@effectfully effectfully merged commit 557a2c7 into master Oct 18, 2024
8 checks passed
@effectfully effectfully deleted the effectfully/builtins/add-pattern-matching-builtins branch October 18, 2024 13:35
v0d1ch pushed a commit to v0d1ch/plutus that referenced this pull request Dec 6, 2024