-
Notifications
You must be signed in to change notification settings - Fork 483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Builtins] Add pattern matching builtins #6530
[Builtins] Add pattern matching builtins #6530
Conversation
21b50a0
to
15f6399
Compare
(`$fUnsafeFromDataList_$cunsafeFromBuiltinData` | ||
{TxInInfo} | ||
`$fUnsafeFromDataScriptContext_$cunsafeFromBuiltinData` | ||
(headList {data} args)) | ||
(`$fUnsafeFromDataList_$cunsafeFromBuiltinData` | ||
{TxInInfo} | ||
`$fUnsafeFromDataScriptContext_$cunsafeFromBuiltinData` | ||
(headList {data} l)) | ||
(`$fUnsafeFromDataList_$cunsafeFromBuiltinData` | ||
{TxOut} | ||
`$fUnsafeFromDataTxOut_$cunsafeFromBuiltinData` | ||
(headList {data} l)) | ||
(let | ||
!d : data = headList {data} args | ||
in | ||
go (unListData d)) | ||
(let | ||
!d : data = headList {data} l | ||
in | ||
go (unListData d)) | ||
(let | ||
!d : data = headList {data} l | ||
in | ||
go (unListData d)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We end up having slightly more PIR code sometimes, but I suppose the reason is that we get more stuff inlined as you can see here. All those go
are different optimized instantiations of $fUnsafeFromDataScriptContext_$cunsafeFromBuiltinData
, so it's not surprising that we get more PIR code.
({cpu: 13100114647000 | ||
| mem: 13100000559240}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is due to unimplementedCostingFun
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put in an approximate cost model, and see how these numbers look? I assume the cost model will be in a similar range to chooseList
s? Even if it is a bad guess, it won't be more meaningless than these numbers!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, I've done it already, just forgot to mention it. Here are the numbers:
+++ b/plutus-benchmark/lists/test/Lookup/9.6/match-builtin-list-50.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 4254600144
-| mem: 18263632})
\ No newline at end of file
+({cpu: 2884053694
+| mem: 12257832})
+++ b/plutus-benchmark/lists/test/Sum/9.6/left-fold-built-in.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 123874594
-| mem: 533932})
\ No newline at end of file
+({cpu: 86545294
+| mem: 397232})
+++ b/plutus-benchmark/lists/test/Sum/9.6/left-fold-data.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 279116232
-| mem: 1124130})
\ No newline at end of file
+({cpu: 215640257
+| mem: 912434})
+++ b/plutus-benchmark/lists/test/Sum/9.6/right-fold-built-in.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 128674594
-| mem: 563932})
\ No newline at end of file
+({cpu: 91345294
+| mem: 427232})
+++ b/plutus-benchmark/marlowe/test/semantics/9.6/f2a8fd2014922f0d8e01541205d47e9bb2d4e54333bdd408cbe7c47c55e73ae4.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 689404030
-| mem: 2815274})
\ No newline at end of file
+({cpu: 654073429
+| mem: 2668846})
+++ b/plutus-benchmark/script-contexts/test/9.6/checkScriptContext2-20.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 281562223
-| mem: 1042586})
\ No newline at end of file
+({cpu: 268649443
+| mem: 1019846})
\ No newline at end of
+++ b/plutus-benchmark/script-contexts/test/9.6/checkScriptContext2-4.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 84515135
-| mem: 323994})
\ No newline at end of file
+({cpu: 80537379
+| mem: 310726})
+++ b/plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/currencySymbolValueOf.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 23159162
-| mem: 65580})
\ No newline at end of file
+({cpu: 18013406
+| mem: 45012})
+++ b/plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq1.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 620731320
-| mem: 1881010})
\ No newline at end of file
+({cpu: 527159505
+| mem: 1514810})
+++ b/plutus-tx-plugin/test/Budget/9.6/builtinListIndexing.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 8401207
-| mem: 33930})
\ No newline at end of file
+({cpu: 6452329
+| mem: 27546})
+++ b/plutus-tx-plugin/test/Budget/9.6/map2-budget.budget.golden
@@ -1,2 +1,2 @@
-({cpu: 127361368
-| mem: 398526})
\ No newline at end of file
+({cpu: 105442279
+| mem: 312734})
Basically, reflect the speedup percentages very well, as one'd expect.
I assume we don't want to commit the approximate cost model, given that it'll increase the changes of us accidentally enabling the feature for the users prematurely.
@@ -43,7 +43,7 @@ import Data.ByteString (ByteString) | |||
-- > (fI : integer -> r) | |||
-- > (fB : bytestring -> r) -> | |||
-- > fix {data} {r} \(rec : data -> r) (d : data) -> | |||
-- > caseData | |||
-- > matchData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aiken folks asked us to make caseList
and caseData
accept list/data as the last argument (which makes a lot of sense optimization-wise), so I made that and renamed all existing occurrences of caseList
/caseData
to matchList
/matchData
for consistency (as we already have those and they accept list/data as the first argument). I'd much prefer to do it the opposite way, but it's too late at this point, so consistency and backwards compatibility win.
KnownTypeAst PLC.TyName DefaultUni a => | ||
TypeRep a -> | ||
PLC.Term PLC.TyName PLC.Name DefaultUni PLC.DefaultFun () | ||
smallTerm tr0 = go (toTypeAst tr0) tr0 where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have builtins accepting functions, we have to generate functions for various tests.
@@ -0,0 +1 @@ | |||
all a. list a -> (all r. r -> (a -> list a -> r) -> r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hate those parens, but we'll probably never have the time to implement the idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The elaborator needed surprisingly few tweaks to keep track of the current context.
275 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not bad.
go t = fail $ "Failed to decode builtin tag, got: " ++ show t | ||
|
||
size _ n = n + builtinTagWidth | ||
|
||
{- Note [Legacy pattern matching on built-in types] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an old Note, I just moved it to the bottom of the file, so that we still have an explanation for all the chooseList
etc stuff while not wasting a lot of precious space in the middle of the file.
@@ -183,7 +205,7 @@ testCosts | |||
:: BuiltinSemanticsVariant DefaultFun | |||
-> BuiltinsRuntime DefaultFun Term | |||
-> DefaultFun | |||
-> Assertion | |||
-> TestTree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably add type checking to that test, because I accidentally generated a bunch of ill-typed code and the test was happy to swallow that.
plutus-core/untyped-plutus-core/src/UntypedPlutusCore/Transform/CaseOfCase.hs
Show resolved
Hide resolved
Latest benchmarking results (before the benchmarking machine caught a flu) are here. |
15f6399
to
1d1f1b5
Compare
1d1f1b5
to
0777db3
Compare
plutus-core/untyped-plutus-core/src/UntypedPlutusCore/Evaluation/Machine/Cek/Internal.hs
Outdated
Show resolved
Hide resolved
plutus-core/untyped-plutus-core/src/UntypedPlutusCore/Evaluation/Machine/Cek/Internal.hs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the costing is all messed up because we haven't introduced a cost model yet, right? So that will come in a future PR?
I didn't understand all of the details in the code, but I think I now finally understand the big picture and how we can use the new evaluator. Thank you for this!
Approving because it makes sense to me, but I am not very familiar with the code so maybe someone more knowledgeable should also approve.
Yes and yes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is correctness tested, e.g., verifying that CaseList
behaves the same as the equivalent term defined via ChooseList
?
3157 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea of the reason for the increased sizes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that it's that the new definitions are more inliner-friendly, which creates some inlining and monomorphization opportunities, which increases code size, see this comment.
plutus-core/untyped-plutus-core/src/UntypedPlutusCore/Evaluation/Machine/Cek/Internal.hs
Show resolved
Hide resolved
fe85db0
to
f173b1f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except that clarification needed for the testing question above.
…to effectfully/builtins/add-pattern-matching-builtins
The main change is replacing ```haskell data BuiltinRuntime val = BuiltinCostedResult ExBudgetStream ~(BuiltinResult val) | <...> ``` with ```haskell data BuiltinRuntime val = BuiltinCostedResult ExBudgetStream ~(BuiltinResult (HeadSpine val)) | <...> ``` where `HeadSpine` is a fancy way of saying `NonEmpty`: ```haskell -- | A non-empty spine. Isomorphic to 'NonEmpty', except is strict and is defined as a single -- recursive data type. data Spine a = SpineLast a | SpineCons a (Spine a) -- | The head-spine form of an iterated application. Provides O(1) access to the head of the -- application. Isomorphic to @nonempty@, except is strict and the no-spine case is made a separate -- constructor for performance reasons (it only takes a single pattern match to access the head when -- there's no spine this way, while otherwise we'd also need to match on the spine to ensure that -- it's empty -- and the no-spine case is by far the most common one, hence we want to optimize it). data HeadSpine a = HeadOnly a | HeadSpine a (Spine a) ``` (we define a separate type, because we want strictness, and you don't see any bangs, because it's in a module with `StrictData` enabled). The idea is that a builtin application can return a function applied to a bunch of arguments, which is exactly what we need to be able to express `caseList` ```haskell caseList xs0 f z = case xs0 of [] -> z x:xs -> f x xs ``` as a builtin: ```haskell -- | Take a function and a list of arguments and apply the former to the latter. headSpine :: Opaque val asToB -> [val] -> Opaque (HeadSpine val) b headSpine (Opaque f) = Opaque . \case [] -> HeadOnly f x0 : xs -> -- It's critical to use 'foldr' here, so that deforestation kicks in. -- See Note [Definition of foldl'] in "GHC.List" and related Notes around for an explanation -- of the trick. HeadSpine f $ foldr (\x2 r x1 -> SpineCons x1 $ r x2) SpineLast xs x0 instance uni ~ DefaultUni => ToBuiltinMeaning uni DefaultFun where <...> toBuiltinMeaning _ver CaseList = let caseListDenotation :: Opaque val (LastArg a b) -> Opaque val (a -> [a] -> b) -> SomeConstant uni [a] -> BuiltinResult (Opaque (HeadSpine val) b) caseListDenotation z f (SomeConstant (Some (ValueOf uniListA xs0))) = do case uniListA of DefaultUniList uniA -> pure $ case xs0 of [] -> headSpine z [] -- [1] x : xs -> headSpine f [fromValueOf uniA x, fromValueOf uniListA xs] -- [2] _ -> -- See Note [Structural vs operational errors within builtins]. throwing _StructuralUnliftingError "Expected a list but got something else" {-# INLINE caseListDenotation #-} in makeBuiltinMeaning caseListDenotation (runCostingFunThreeArguments . unimplementedCostingFun) ``` Being able to express [1] (representing `z`) and [2] (representing `f x xs`) is precisely what this PR enables. Adding support for the new functionality to the CEK machine is trivial. All we need is a way to push a `Spine` of arguments onto the context: ```haskell -- | Push arguments onto the stack. The first argument will be the most recent entry. pushArgs :: Spine (CekValue uni fun ann) -> Context uni fun ann -> Context uni fun ann pushArgs args ctx = foldr FrameAwaitFunValue ctx args ``` and a `HeadSpine` version of `returnCek`: ```haskell -- | Evaluate a 'HeadSpine' by pushing the arguments (if any) onto the stack and proceeding with -- the returning phase of the CEK machine. returnCekHeadSpine :: Context uni fun ann -> HeadSpine (CekValue uni fun ann) -> CekM uni fun s (Term NamedDeBruijn uni fun ()) returnCekHeadSpine ctx (HeadOnly x) = returnCek ctx x returnCekHeadSpine ctx (HeadSpine f xs) = returnCek (pushArgs xs ctx) f ``` Then replacing ```haskell BuiltinSuccess x -> returnCek ctx x ``` with ```haskell BuiltinSuccess fXs -> returnCekHeadSpine ctx fXs ``` (and similarly for `BuiltinSuccessWithLogs`) will do the trick. We used to define `caseList` in terms of `IfThenElse`, `NullList` and either `HeadList` or `TailList` depending on the result of `NullList`, i.e. three builtin calls in the worst and in the best case. Then we introduced `ChooseList`, which replaced both `IfThenElse` and `NullList` in `caseList` thus bringing total amount of builtin calls down to 2 in all cases. This turned out to have a [substantial](IntersectMBO#4119 (review)) impact on performance. This PR allows us to bring total number of builtin calls per `caseList` invokation down to 1 -- the `CaseList` builtin itself.
This PR is forked from #5486, it's squashed and without a lot of noise in the comments.
The main change is replacing
with
where
HeadSpine
is a fancy way of sayingNonEmpty
:(we define a separate type, because we want strictness, and you don't see any bangs, because it's in a module with
StrictData
enabled).The idea is that a builtin application can return a function applied to a bunch of arguments, which is exactly what we need to be able to express
caseList
as a builtin:
Being able to express [1] (representing
z
) and [2] (representingf x xs
) is precisely what this PR enables.Adding support for the new functionality to the CEK machine is trivial. All we need is a way to push a
Spine
of arguments onto the context:and a
HeadSpine
version ofreturnCek
:Then replacing
with
(and similarly for
BuiltinSuccessWithLogs
) will do the trick.We used to define
caseList
in terms ofIfThenElse
,NullList
and eitherHeadList
orTailList
depending on the result ofNullList
, i.e. three builtin calls in the worst and in the best case. Then we introducedChooseList
, which replaced bothIfThenElse
andNullList
incaseList
thus bringing total amount of builtin calls down to 2 in all cases. This turned out to have a substantial impact on performance. This PR allows us to bring total number of builtin calls percaseList
invokation down to 1 -- theCaseList
builtin itself.