[Builtins] Monomorphize 'readKnown' #4307

Merged

Conversation

effectfully
Contributor

Do not look here yet.

@effectfully
Contributor Author

/benchmark

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and '916535235' (PR)

Script c8c5183 9165352 Change
auction_1-1 383.8 μs 326.2 μs -15.0%
auction_1-2 1.153 ms 1.073 ms -6.9%
auction_1-3 1.155 ms 1.078 ms -6.7%
auction_1-4 508.2 μs 430.1 μs -15.4%
auction_2-1 384.9 μs 329.4 μs -14.4%
auction_2-2 1.152 ms 1.071 ms -7.0%
auction_2-3 1.452 ms 1.351 ms -7.0%
auction_2-4 1.158 ms 1.072 ms -7.4%
auction_2-5 510.0 μs 428.7 μs -15.9%
crowdfunding-success-1 457.1 μs 386.8 μs -15.4%
crowdfunding-success-2 456.7 μs 386.7 μs -15.3%
crowdfunding-success-3 456.9 μs 388.3 μs -15.0%
currency-1 469.8 μs 421.6 μs -10.3%
escrow-redeem_1-1 712.9 μs 629.0 μs -11.8%
escrow-redeem_1-2 712.8 μs 628.9 μs -11.8%
escrow-redeem_2-1 821.1 μs 737.4 μs -10.2%
escrow-redeem_2-2 819.3 μs 738.6 μs -9.8%
escrow-redeem_2-3 820.5 μs 737.6 μs -10.1%
escrow-refund-1 338.6 μs 289.8 μs -14.4%
future-increase-margin-1 469.9 μs 421.0 μs -10.4%
future-increase-margin-2 1.036 ms 941.6 μs -9.1%
future-increase-margin-3 1.036 ms 939.6 μs -9.3%
future-increase-margin-4 947.0 μs 873.1 μs -7.8%
future-increase-margin-5 1.391 ms 1.300 ms -6.5%
future-pay-out-1 469.6 μs 420.7 μs -10.4%
future-pay-out-2 1.035 ms 945.3 μs -8.7%
future-pay-out-3 1.037 ms 940.7 μs -9.3%
future-pay-out-4 1.394 ms 1.295 ms -7.1%
future-settle-early-1 471.0 μs 419.8 μs -10.9%
future-settle-early-2 1.037 ms 943.9 μs -9.0%
future-settle-early-3 1.035 ms 943.6 μs -8.8%
future-settle-early-4 1.095 ms 1.018 ms -7.0%
game-sm-success_1-1 788.9 μs 702.5 μs -11.0%
game-sm-success_1-2 430.0 μs 366.6 μs -14.7%
game-sm-success_1-3 1.165 ms 1.076 ms -7.6%
game-sm-success_1-4 506.5 μs 429.2 μs -15.3%
game-sm-success_2-1 788.2 μs 701.7 μs -11.0%
game-sm-success_2-2 429.6 μs 365.1 μs -15.0%
game-sm-success_2-3 1.164 ms 1.074 ms -7.7%
game-sm-success_2-4 503.4 μs 428.9 μs -14.8%
game-sm-success_2-5 1.165 ms 1.076 ms -7.6%
game-sm-success_2-6 504.9 μs 427.2 μs -15.4%
multisig-sm-1 792.5 μs 713.2 μs -10.0%
multisig-sm-2 777.8 μs 699.4 μs -10.1%
multisig-sm-3 788.8 μs 708.9 μs -10.1%
multisig-sm-4 794.9 μs 718.1 μs -9.7%
multisig-sm-5 1.034 ms 964.8 μs -6.7%
multisig-sm-6 798.3 μs 715.4 μs -10.4%
multisig-sm-7 781.4 μs 701.4 μs -10.2%
multisig-sm-8 791.3 μs 708.2 μs -10.5%
multisig-sm-9 798.5 μs 714.8 μs -10.5%
multisig-sm-10 1.034 ms 960.3 μs -7.1%
ping-pong-1 652.6 μs 582.0 μs -10.8%
ping-pong-2 649.7 μs 582.5 μs -10.3%
ping-pong_2-1 415.6 μs 361.4 μs -13.0%
prism-1 358.3 μs 303.1 μs -15.4%
prism-2 847.3 μs 765.3 μs -9.7%
prism-3 730.6 μs 648.1 μs -11.3%
pubkey-1 304.9 μs 260.3 μs -14.6%
stablecoin_1-1 1.622 ms 1.490 ms -8.1%
stablecoin_1-2 422.1 μs 358.4 μs -15.1%
stablecoin_1-3 1.848 ms 1.687 ms -8.7%
stablecoin_1-4 448.2 μs 380.1 μs -15.2%
stablecoin_1-5 2.347 ms 2.126 ms -9.4%
stablecoin_1-6 553.3 μs 473.2 μs -14.5%
stablecoin_2-1 1.627 ms 1.494 ms -8.2%
stablecoin_2-2 422.8 μs 359.5 μs -15.0%
stablecoin_2-3 1.850 ms 1.692 ms -8.5%
stablecoin_2-4 448.5 μs 381.1 μs -15.0%
token-account-1 367.8 μs 327.5 μs -11.0%
token-account-2 649.7 μs 580.3 μs -10.7%
uniswap-1 735.0 μs 681.5 μs -7.3%
uniswap-2 447.0 μs 396.3 μs -11.3%
uniswap-3 2.971 ms 2.726 ms -8.2%
uniswap-4 733.1 μs 625.2 μs -14.7%
uniswap-5 2.145 ms 1.939 ms -9.6%
uniswap-6 696.7 μs 594.2 μs -14.7%
vesting-1 666.1 μs 603.4 μs -9.4%
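The "Change" column is not defined anywhere in this thread, but the rows appear consistent with the relative difference of the PR time against the base time, rounded to one decimal place. A minimal sketch of that arithmetic, assuming both times are first converted to the same unit (the function names are hypothetical and not part of the benchmark tooling):

```haskell
-- Assumed formula: relative change of the PR time against the base time,
-- in percent. Both arguments must be in the same unit (e.g. microseconds).
relativeChangePct :: Double -> Double -> Double
relativeChangePct base pr = (pr - base) / base * 100

-- Round to one decimal place, matching how the table displays the value.
roundTo1 :: Double -> Double
roundTo1 x = fromIntegral (round (x * 10) :: Integer) / 10
```

For the first row above, `roundTo1 (relativeChangePct 383.8 326.2)` evaluates to `-15.0`. Rows that mix units (for example 1.036 ms against 941.6 μs) need the conversion done before calling the function.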

@effectfully
Contributor Author

WAT

@effectfully
Contributor Author

/benchmark

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and '916535235' (PR)

Script c8c5183 9165352 Change
auction_1-1 384.3 μs 326.4 μs -15.1%
auction_1-2 1.154 ms 1.073 ms -7.0%
auction_1-3 1.156 ms 1.075 ms -7.0%
auction_1-4 509.7 μs 427.1 μs -16.2%
auction_2-1 386.1 μs 326.6 μs -15.4%
auction_2-2 1.154 ms 1.073 ms -7.0%
auction_2-3 1.447 ms 1.354 ms -6.4%
auction_2-4 1.156 ms 1.075 ms -7.0%
auction_2-5 509.5 μs 428.0 μs -16.0%
crowdfunding-success-1 456.0 μs 388.5 μs -14.8%
crowdfunding-success-2 456.1 μs 386.2 μs -15.3%
crowdfunding-success-3 455.7 μs 386.6 μs -15.2%
currency-1 468.8 μs 418.2 μs -10.8%
escrow-redeem_1-1 714.8 μs 627.4 μs -12.2%
escrow-redeem_1-2 713.7 μs 628.9 μs -11.9%
escrow-redeem_2-1 822.7 μs 734.1 μs -10.8%
escrow-redeem_2-2 823.8 μs 733.0 μs -11.0%
escrow-redeem_2-3 822.1 μs 733.9 μs -10.7%
escrow-refund-1 339.4 μs 289.2 μs -14.8%
future-increase-margin-1 472.3 μs 422.8 μs -10.5%
future-increase-margin-2 1.040 ms 944.4 μs -9.2%
future-increase-margin-3 1.036 ms 942.6 μs -9.0%
future-increase-margin-4 945.9 μs 871.7 μs -7.8%
future-increase-margin-5 1.387 ms 1.299 ms -6.3%
future-pay-out-1 468.7 μs 419.5 μs -10.5%
future-pay-out-2 1.038 ms 941.5 μs -9.3%
future-pay-out-3 1.040 ms 941.1 μs -9.5%
future-pay-out-4 1.385 ms 1.296 ms -6.4%
future-settle-early-1 468.3 μs 419.8 μs -10.4%
future-settle-early-2 1.039 ms 943.5 μs -9.2%
future-settle-early-3 1.040 ms 939.7 μs -9.6%
future-settle-early-4 1.097 ms 1.018 ms -7.2%
game-sm-success_1-1 789.8 μs 702.2 μs -11.1%
game-sm-success_1-2 433.5 μs 365.1 μs -15.8%
game-sm-success_1-3 1.169 ms 1.078 ms -7.8%
game-sm-success_1-4 507.3 μs 428.1 μs -15.6%
game-sm-success_2-1 788.3 μs 700.1 μs -11.2%
game-sm-success_2-2 431.4 μs 363.2 μs -15.8%
game-sm-success_2-3 1.167 ms 1.071 ms -8.2%
game-sm-success_2-4 506.0 μs 425.4 μs -15.9%
game-sm-success_2-5 1.165 ms 1.075 ms -7.7%
game-sm-success_2-6 505.3 μs 427.7 μs -15.4%
multisig-sm-1 793.3 μs 712.0 μs -10.2%
multisig-sm-2 777.8 μs 696.9 μs -10.4%
multisig-sm-3 789.8 μs 705.4 μs -10.7%
multisig-sm-4 801.5 μs 711.8 μs -11.2%
multisig-sm-5 1.034 ms 956.8 μs -7.5%
multisig-sm-6 794.8 μs 713.6 μs -10.2%
multisig-sm-7 780.3 μs 699.9 μs -10.3%
multisig-sm-8 788.0 μs 709.8 μs -9.9%
multisig-sm-9 799.8 μs 716.2 μs -10.5%
multisig-sm-10 1.039 ms 957.3 μs -7.9%
ping-pong-1 656.0 μs 581.6 μs -11.3%
ping-pong-2 654.6 μs 581.2 μs -11.2%
ping-pong_2-1 419.6 μs 360.7 μs -14.0%
prism-1 361.4 μs 302.5 μs -16.3%
prism-2 852.3 μs 763.7 μs -10.4%
prism-3 730.8 μs 646.1 μs -11.6%
pubkey-1 306.4 μs 259.0 μs -15.5%
stablecoin_1-1 1.620 ms 1.484 ms -8.4%
stablecoin_1-2 423.9 μs 356.4 μs -15.9%
stablecoin_1-3 1.851 ms 1.686 ms -8.9%
stablecoin_1-4 451.1 μs 379.4 μs -15.9%
stablecoin_1-5 2.350 ms 2.131 ms -9.3%
stablecoin_1-6 557.2 μs 469.6 μs -15.7%
stablecoin_2-1 1.623 ms 1.485 ms -8.5%
stablecoin_2-2 424.2 μs 356.2 μs -16.0%
stablecoin_2-3 1.861 ms 1.681 ms -9.7%
stablecoin_2-4 451.7 μs 378.6 μs -16.2%
token-account-1 368.8 μs 326.2 μs -11.6%
token-account-2 653.8 μs 578.8 μs -11.5%
uniswap-1 738.3 μs 680.9 μs -7.8%
uniswap-2 448.3 μs 394.9 μs -11.9%
uniswap-3 2.979 ms 2.732 ms -8.3%
uniswap-4 736.2 μs 623.3 μs -15.3%
uniswap-5 2.142 ms 1.938 ms -9.5%
uniswap-6 700.5 μs 595.2 μs -15.0%
vesting-1 668.5 μs 602.2 μs -9.9%

@effectfully
Contributor Author

/benchmark

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and '916535235' (PR)

Script c8c5183 9165352 Change
auction_1-1 384.2 μs 325.9 μs -15.2%
auction_1-2 1.154 ms 1.067 ms -7.5%
auction_1-3 1.161 ms 1.072 ms -7.7%
auction_1-4 509.0 μs 428.1 μs -15.9%
auction_2-1 385.2 μs 328.6 μs -14.7%
auction_2-2 1.153 ms 1.074 ms -6.9%
auction_2-3 1.447 ms 1.356 ms -6.3%
auction_2-4 1.158 ms 1.076 ms -7.1%
auction_2-5 510.8 μs 429.6 μs -15.9%
crowdfunding-success-1 457.4 μs 387.2 μs -15.3%
crowdfunding-success-2 458.0 μs 386.7 μs -15.6%
crowdfunding-success-3 458.3 μs 386.5 μs -15.7%
currency-1 471.6 μs 420.7 μs -10.8%
escrow-redeem_1-1 713.8 μs 631.3 μs -11.6%
escrow-redeem_1-2 712.8 μs 628.8 μs -11.8%
escrow-redeem_2-1 822.8 μs 734.7 μs -10.7%
escrow-redeem_2-2 818.9 μs 735.6 μs -10.2%
escrow-redeem_2-3 818.2 μs 737.4 μs -9.9%
escrow-refund-1 337.4 μs 288.1 μs -14.6%
future-increase-margin-1 468.6 μs 422.3 μs -9.9%
future-increase-margin-2 1.036 ms 942.8 μs -9.0%
future-increase-margin-3 1.036 ms 942.3 μs -9.0%
future-increase-margin-4 946.0 μs 869.0 μs -8.1%
future-increase-margin-5 1.387 ms 1.297 ms -6.5%
future-pay-out-1 470.1 μs 419.9 μs -10.7%
future-pay-out-2 1.038 ms 945.0 μs -9.0%
future-pay-out-3 1.034 ms 944.1 μs -8.7%
future-pay-out-4 1.389 ms 1.297 ms -6.6%
future-settle-early-1 469.9 μs 419.3 μs -10.8%
future-settle-early-2 1.039 ms 945.9 μs -9.0%
future-settle-early-3 1.037 ms 944.3 μs -8.9%
future-settle-early-4 1.094 ms 1.022 ms -6.6%
game-sm-success_1-1 788.7 μs 701.8 μs -11.0%
game-sm-success_1-2 430.6 μs 365.0 μs -15.2%
game-sm-success_1-3 1.163 ms 1.074 ms -7.7%
game-sm-success_1-4 506.2 μs 427.4 μs -15.6%
game-sm-success_2-1 787.4 μs 703.0 μs -10.7%
game-sm-success_2-2 430.2 μs 364.3 μs -15.3%
game-sm-success_2-3 1.166 ms 1.072 ms -8.1%
game-sm-success_2-4 505.0 μs 427.9 μs -15.3%
game-sm-success_2-5 1.162 ms 1.074 ms -7.6%
game-sm-success_2-6 505.2 μs 426.2 μs -15.6%
multisig-sm-1 796.9 μs 711.2 μs -10.8%
multisig-sm-2 780.4 μs 698.1 μs -10.5%
multisig-sm-3 790.1 μs 706.6 μs -10.6%
multisig-sm-4 796.6 μs 716.0 μs -10.1%
multisig-sm-5 1.032 ms 961.1 μs -6.9%
multisig-sm-6 795.4 μs 713.3 μs -10.3%
multisig-sm-7 781.2 μs 699.2 μs -10.5%
multisig-sm-8 792.7 μs 708.6 μs -10.6%
multisig-sm-9 801.2 μs 715.5 μs -10.7%
multisig-sm-10 1.037 ms 957.4 μs -7.7%
ping-pong-1 652.7 μs 581.4 μs -10.9%
ping-pong-2 651.5 μs 582.6 μs -10.6%
ping-pong_2-1 416.3 μs 360.6 μs -13.4%
prism-1 359.4 μs 302.8 μs -15.7%
prism-2 849.8 μs 764.6 μs -10.0%
prism-3 730.7 μs 646.8 μs -11.5%
pubkey-1 305.6 μs 259.2 μs -15.2%
stablecoin_1-1 1.624 ms 1.486 ms -8.5%
stablecoin_1-2 423.4 μs 357.1 μs -15.7%
stablecoin_1-3 1.852 ms 1.690 ms -8.7%
stablecoin_1-4 448.9 μs 378.8 μs -15.6%
stablecoin_1-5 2.354 ms 2.125 ms -9.7%
stablecoin_1-6 555.8 μs 471.0 μs -15.3%
stablecoin_2-1 1.629 ms 1.485 ms -8.8%
stablecoin_2-2 424.0 μs 356.1 μs -16.0%
stablecoin_2-3 1.851 ms 1.692 ms -8.6%
stablecoin_2-4 449.4 μs 380.6 μs -15.3%
token-account-1 367.7 μs 329.0 μs -10.5%
token-account-2 650.1 μs 581.3 μs -10.6%
uniswap-1 735.3 μs 681.8 μs -7.3%
uniswap-2 447.1 μs 396.7 μs -11.3%
uniswap-3 2.974 ms 2.741 ms -7.8%
uniswap-4 732.7 μs 623.4 μs -14.9%
uniswap-5 2.138 ms 1.936 ms -9.4%
uniswap-6 697.1 μs 595.3 μs -14.6%
vesting-1 666.4 μs 602.2 μs -9.6%

@effectfully force-pushed the effectfully/builtins/performance/monomorphize-readKnown branch from 9165352 to 1db9fbb on January 3, 2022 01:40

@effectfully
Contributor Author

/benchmark

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and '1db9fbbd2' (PR)

Script c8c5183 1db9fbb Change
auction_1-1 384.1 μs 323.2 μs -15.9%
auction_1-2 1.151 ms 1.066 ms -7.4%
auction_1-3 1.159 ms 1.067 ms -7.9%
auction_1-4 510.8 μs 424.4 μs -16.9%
auction_2-1 386.3 μs 323.8 μs -16.2%
auction_2-2 1.158 ms 1.069 ms -7.7%
auction_2-3 1.445 ms 1.355 ms -6.2%
auction_2-4 1.160 ms 1.071 ms -7.7%
auction_2-5 508.1 μs 424.7 μs -16.4%
crowdfunding-success-1 455.2 μs 381.6 μs -16.2%
crowdfunding-success-2 454.8 μs 380.8 μs -16.3%
crowdfunding-success-3 454.9 μs 380.5 μs -16.4%
currency-1 468.4 μs 415.1 μs -11.4%
escrow-redeem_1-1 712.6 μs 623.1 μs -12.6%
escrow-redeem_1-2 712.9 μs 622.6 μs -12.7%
escrow-redeem_2-1 821.5 μs 661.8 μs -19.4%
escrow-redeem_2-2 819.9 μs 665.2 μs -18.9%
escrow-redeem_2-3 819.3 μs 663.5 μs -19.0%
escrow-refund-1 337.8 μs 286.8 μs -15.1%
future-increase-margin-1 469.3 μs 415.6 μs -11.4%
future-increase-margin-2 1.034 ms 941.0 μs -9.0%
future-increase-margin-3 1.038 ms 940.7 μs -9.4%
future-increase-margin-4 949.8 μs 873.5 μs -8.0%
future-increase-margin-5 1.393 ms 1.301 ms -6.6%
future-pay-out-1 470.4 μs 416.6 μs -11.4%
future-pay-out-2 1.038 ms 941.2 μs -9.3%
future-pay-out-3 1.036 ms 939.0 μs -9.4%
future-pay-out-4 1.384 ms 1.297 ms -6.3%
future-settle-early-1 468.9 μs 415.0 μs -11.5%
future-settle-early-2 1.032 ms 939.9 μs -8.9%
future-settle-early-3 1.034 ms 939.1 μs -9.2%
future-settle-early-4 1.094 ms 1.015 ms -7.2%
game-sm-success_1-1 787.4 μs 697.5 μs -11.4%
game-sm-success_1-2 430.5 μs 361.5 μs -16.0%
game-sm-success_1-3 1.165 ms 1.078 ms -7.5%
game-sm-success_1-4 506.2 μs 424.9 μs -16.1%
game-sm-success_2-1 787.2 μs 700.1 μs -11.1%
game-sm-success_2-2 430.3 μs 362.9 μs -15.7%
game-sm-success_2-3 1.167 ms 1.078 ms -7.6%
game-sm-success_2-4 506.5 μs 422.5 μs -16.6%
game-sm-success_2-5 1.170 ms 1.075 ms -8.1%
game-sm-success_2-6 506.8 μs 422.4 μs -16.7%
multisig-sm-1 797.0 μs 706.2 μs -11.4%
multisig-sm-2 780.8 μs 697.1 μs -10.7%
multisig-sm-3 789.8 μs 704.9 μs -10.7%
multisig-sm-4 798.9 μs 708.5 μs -11.3%
multisig-sm-5 1.033 ms 962.4 μs -6.8%
multisig-sm-6 794.5 μs 711.4 μs -10.5%
multisig-sm-7 779.5 μs 695.4 μs -10.8%
multisig-sm-8 787.7 μs 703.3 μs -10.7%
multisig-sm-9 795.1 μs 712.8 μs -10.4%
multisig-sm-10 1.030 ms 960.0 μs -6.8%
ping-pong-1 650.8 μs 580.1 μs -10.9%
ping-pong-2 651.0 μs 579.3 μs -11.0%
ping-pong_2-1 415.8 μs 357.9 μs -13.9%
prism-1 358.0 μs 299.5 μs -16.3%
prism-2 849.3 μs 764.8 μs -9.9%
prism-3 728.6 μs 643.6 μs -11.7%
pubkey-1 306.8 μs 257.1 μs -16.2%
stablecoin_1-1 1.623 ms 1.494 ms -7.9%
stablecoin_1-2 422.9 μs 354.3 μs -16.2%
stablecoin_1-3 1.854 ms 1.696 ms -8.5%
stablecoin_1-4 448.9 μs 376.8 μs -16.1%
stablecoin_1-5 2.352 ms 2.147 ms -8.7%
stablecoin_1-6 555.0 μs 466.9 μs -15.9%
stablecoin_2-1 1.622 ms 1.489 ms -8.2%
stablecoin_2-2 422.2 μs 352.9 μs -16.4%
stablecoin_2-3 1.845 ms 1.688 ms -8.5%
stablecoin_2-4 447.7 μs 374.8 μs -16.3%
token-account-1 369.0 μs 321.6 μs -12.8%
token-account-2 653.1 μs 575.1 μs -11.9%
uniswap-1 737.5 μs 677.5 μs -8.1%
uniswap-2 449.6 μs 388.0 μs -13.7%
uniswap-3 2.976 ms 2.725 ms -8.4%
uniswap-4 734.9 μs 617.1 μs -16.0%
uniswap-5 2.149 ms 1.931 ms -10.1%
uniswap-6 700.1 μs 586.0 μs -16.3%
vesting-1 670.5 μs 598.6 μs -10.7%

@effectfully
Contributor Author

I'm quite puzzled as to how this branch is so ridiculously (and consistently) faster than the baseline. Below is the comparison of changes in generated Core that I was able to recognize.

In Cek.Internal:

master:

  case ((readKnown
           ($dKnownTypeIn_a2DXx `cast` <Co:8>)
           $dMonadError_s2MgS
           lvl3_r2Qd1
           (poly_$dAsEvaluationFailure_r2Qfe
            `cast` <Co:6>)
           (Just argTerm_s2MgX)
           w1_s2P4Y)
        `cast` <Co:9>)
         w2_s2P4Z
  of
  { (# ipv6_XHui, ipv7_XHuk #) ->

this PR:

  case readKnown
         ($dKnownTypeIn_a2CU1 `cast` <Co:8>)
         lvl3_r2Pc5
         (poly_$dAsEvaluationFailure_r2Pei
          `cast` <Co:6>)
         (Just argTerm_s2LfM)
         w1_s2O3N
  of {
    Left err_a2C84 ->
      jump exit6_X9P w2_s2O3O err_a2C84;
    Right x_a2C85 ->

In Default.Universe:

master:

-- RHS size: {terms: 88, types: 288, coercions: 55, joins: 0/0}
$fKnownTypeInDefaultUnitermInteger_$creadKnown
  = \ @ term_XMqM
      $dKnownBuiltinTypeIn_XMqO
      @ err_aMn8
      @ cause_aMn9
      @ m_aMna
      eta_B3
      eta1_B2
      _
      eta3_X3
      eta4_X7e ->
      case eq_sel
             ($p1(%,%) ($p1KnownBuiltinTypeIn $dKnownBuiltinTypeIn_XMqO))
      of co_aMOV
      { __DEFAULT ->
      case $dKnownBuiltinTypeIn_XMqO of
      { C:KnownBuiltinTypeIn ww1_s1bFg ww2_s1bFq ww3_s1bFr ww4_s1bFs ->
      case ww1_s1bFg of { (ww6_s1bFj, ww7_s1bFo) ->
      case ww6_s1bFj of { Eq# ww9_s1bFm ->
      case $p1MonadError eta_B3 of
      { C:Monad ww11_s1bFx ww12_s1bFy ww13_s1bFz ww14_s1bFA ->
      ww12_s1bFy
        ((($p1(%,%) ww7_s1bFo) `cast` <Co:2>)
           eta_B3 eta1_B2 eta3_X3 eta4_X7e)
        (\ v_Xd ->
           case v_Xd `cast` <Co:5> of { ValueOf uniAct_ahyQ x_ahyR ->
           case (ww3_s1bFr `cast` <Co:4>)
                  uniAct_ahyQ (ww4_s1bFs `cast` <Co:6>)
           of {
             Nothing ->
               throwError
                 eta_B3
                 <...>
             Just ds_dk8z ->
               case ds_dk8z of { Refl co1_aiT7 ->
               (pure ww11_s1bFx x_ahyR) `cast` <Co:4>
               }
           }
           })
      }
      }
      }
      }
      }

this PR:

-- RHS size: {terms: 83, types: 262, coercions: 59, joins: 0/0}
$fKnownTypeInDefaultUnitermInteger_$creadKnown
  = \ @ term_XLRG
      $dKnownBuiltinTypeIn_XLRI
      @ err_aLO1
      @ cause_aLO2
      eta_B2
      _
      eta2_X3
      eta3_X7d ->
      case eq_sel
             ($p1(%,%) ($p1KnownBuiltinTypeIn $dKnownBuiltinTypeIn_XLRI))
      of co_aMex
      { __DEFAULT ->
      case $dKnownBuiltinTypeIn_XLRI of
      { C:KnownBuiltinTypeIn ww1_s1b9l ww2_s1b9v ww3_s1b9w ww4_s1b9x ->
      case ww1_s1b9l of { (ww6_s1b9o, ww7_s1b9t) ->
      case ww6_s1b9o of { Eq# ww9_s1b9r ->
      case (($p1(%,%) ww7_s1b9t) `cast` <Co:2>)
             $fMonadErroreEither eta_B2 eta2_X3 eta3_X7d
      of {
        Left l_ikdU -> Left l_ikdU;
        Right r_ikdW ->
          case r_ikdW `cast` <Co:5> of { ValueOf uniAct_agR2 x_agR3 ->
          case (ww3_s1b9w `cast` <Co:4>)
                 uniAct_agR2 (ww4_s1b9x `cast` <Co:6>)
          of {
            Nothing ->
              Left <...>
            Just ds_djIl ->
              case ds_djIl of { Refl co1_aiuT -> (Right x_agR3) `cast` <Co:8> }
          }
          }
      }
      }
      }
      }
      }

And that's it.

Like yeah, this PR clearly compiles better than master, but how come it gives us a speedup of up to 19% in the entire evaluation time?

Anyway, I don't want to spend more time digging through Core, so I'll write some docs, ask for reviews and merge this.
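To make the Core comparison above easier to follow, here is a minimal source-level sketch of the kind of change the PR title describes: a reader that is polymorphic over any `MonadError` versus one fixed to `Either`. The types `Value` and `ReadError` and both function names are hypothetical stand-ins, not the real Plutus definitions, and the sketch assumes the `mtl` package is available.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.Except (MonadError, throwError)

-- Hypothetical stand-ins for illustration; these are not the real Plutus types.
data Value = VInt Integer | VBool Bool
  deriving (Show, Eq)

newtype ReadError = ReadError String
  deriving (Show, Eq)

-- Polymorphic version: works in any monad with a MonadError instance.
-- Unless GHC specializes it, the caller passes the class dictionary at
-- runtime and 'pure'/'throwError' are looked up through that dictionary.
readIntPoly :: MonadError ReadError m => Value -> m Integer
readIntPoly (VInt i) = pure i
readIntPoly v        = throwError (ReadError ("not an integer: " ++ show v))

-- Monomorphic version: the result is a plain 'Either', so success and
-- failure are ordinary constructors that the caller can pattern match on.
readIntMono :: Value -> Either ReadError Integer
readIntMono (VInt i) = Right i
readIntMono v        = Left (ReadError ("not an integer: " ++ show v))
```

The second shape corresponds to what the "this PR" Core shows: `Left`/`Right` built directly and a `case` on the result, with no `$p1MonadError` dictionary projection. Whether that alone accounts for the measured speedup is exactly the open question in this thread.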

@effectfully
Contributor Author

One possible explanation would be that we somehow end up forcing a call to dischargeCekValue that is not supposed to be forced. However, I was unable to spot that in the generated Core, and empirically it's not the case either.

Contributor

@michaelpj michaelpj left a comment

My best hypothesis is that the second diff you show is the key one. In particular, it stops us from doing anything with the ST state token, except passing it to the continuation. Maybe that matters!

coerceVia _ = coerce
{-# INLINE coerceVia #-}

coerceArg :: Coercible a b => (a -> r) -> b -> r
Contributor

you hate eta-expansion so much? :p

Contributor Author

I very much do when it comes to performance-critical code. It's pretty common to call coerce over the entire function instead of just its argument, because that does not create any closures and in some cases makes the whole definition more sharing-friendly. I'll make a comment.
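A small sketch of the two styles being contrasted here, under the assumption that `coerceArg` is defined by coercing the whole function (the signature is the one quoted in the diff above; `coerceArgEta`, `Wrapped`, and `increment` are hypothetical names added for illustration):

```haskell
import Data.Coerce (Coercible, coerce)

-- Hypothetical newtype used only to exercise the helpers below.
newtype Wrapped = Wrapped Int

-- Eta-expanded style: wraps the function in a new lambda and coerces the
-- argument on each call. If this is not inlined, the lambda is a closure.
coerceArgEta :: Coercible a b => (a -> r) -> b -> r
coerceArgEta f = \x -> f (coerce x)

-- Point-free style: coerces the function itself. 'coerce' has no runtime
-- cost, so no extra lambda is introduced around the original function.
coerceArg :: Coercible a b => (a -> r) -> b -> r
coerceArg = coerce
{-# INLINE coerceArg #-}

-- Usage: apply an Int function to a Wrapped value without unwrapping by hand.
increment :: Wrapped -> Int
increment = coerceArg (+ (1 :: Int))
```

Both helpers compute the same results, for example `increment (Wrapped 2)` is `3`; the difference is only in what GHC has to build and optimize away.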

let runtime' = BuiltinRuntime schB (f x) . exF $ toExMemory arg
res <- evalBuiltinApp fun term' env runtime'
returnCek unbudgetedSteps ctx res
TypeSchemeArrow _ schB -> case readKnown (Just argTerm) arg of
Contributor

I'd be curious to see what happens if you do it this way:

do
  x <- liftEither $ readKnown ...
  ...

That might force us back into ST (if it doesn't get optimized away, which it probably will), which might be enlightening. And if it does get optimized away, it's arguably better style.
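The two call-site styles under discussion can be sketched as follows. This is only an illustration: `parsePositive`, `doubleViaCase`, and `doubleViaLift` are hypothetical names, the error type is simplified to `String`, and the sketch assumes `liftEither` from the `mtl` package.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.Except (MonadError, liftEither, throwError)

-- Hypothetical monomorphic reader standing in for 'readKnown'.
parsePositive :: Int -> Either String Int
parsePositive n
  | n > 0     = Right n
  | otherwise = Left ("not positive: " ++ show n)

-- Style used in the diff: match on the 'Either' explicitly and rethrow.
doubleViaCase :: MonadError String m => Int -> m Int
doubleViaCase n = case parsePositive n of
  Left err -> throwError err
  Right x  -> pure (x * 2)

-- Suggested style: lift the 'Either' into the surrounding monad and bind.
doubleViaLift :: MonadError String m => Int -> m Int
doubleViaLift n = do
  x <- liftEither (parsePositive n)
  pure (x * 2)
```

The two functions return the same results in any `MonadError String` monad; what may differ is the Core GHC generates for them, which is the point of the experiment proposed above.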

Contributor Author

Great suggestion, thank you.

@michaelpj
Contributor

But actually, I can't think why it would matter. We replace one pattern match (to get the state token) with another one (to take apart the Either and call throwError). So it seems like no win. Maybe it's just that readKnown gets a faster implementation of >>=.

@effectfully force-pushed the effectfully/builtins/performance/monomorphize-readKnown branch from 1db9fbb to cc30b67 on January 4, 2022 17:32

@effectfully
Contributor Author

/benchmark

@effectfully force-pushed the effectfully/builtins/performance/monomorphize-readKnown branch from cc30b67 to 3d1f229 on January 4, 2022 17:51

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and 'cc30b671b' (PR)

Script c8c5183 cc30b67 Change
auction_1-1 384.2 μs 330.4 μs -14.0%
auction_1-2 1.153 ms 1.094 ms -5.1%
auction_1-3 1.159 ms 1.096 ms -5.4%
auction_1-4 508.3 μs 434.3 μs -14.6%
auction_2-1 387.5 μs 331.9 μs -14.3%
auction_2-2 1.158 ms 1.092 ms -5.7%
auction_2-3 1.450 ms 1.376 ms -5.1%
auction_2-4 1.161 ms 1.094 ms -5.8%
auction_2-5 509.4 μs 433.0 μs -15.0%
crowdfunding-success-1 456.9 μs 390.0 μs -14.6%
crowdfunding-success-2 457.4 μs 389.0 μs -15.0%
crowdfunding-success-3 458.5 μs 382.6 μs -16.6%
currency-1 469.3 μs 415.1 μs -11.5%
escrow-redeem_1-1 711.9 μs 624.9 μs -12.2%
escrow-redeem_1-2 712.7 μs 626.1 μs -12.2%
escrow-redeem_2-1 820.2 μs 667.6 μs -18.6%
escrow-redeem_2-2 819.4 μs 665.7 μs -18.8%
escrow-redeem_2-3 821.0 μs 665.6 μs -18.9%
escrow-refund-1 337.6 μs 287.6 μs -14.8%
future-increase-margin-1 469.0 μs 416.2 μs -11.3%
future-increase-margin-2 1.034 ms 940.4 μs -9.1%
future-increase-margin-3 1.038 ms 936.7 μs -9.8%
future-increase-margin-4 947.4 μs 874.7 μs -7.7%
future-increase-margin-5 1.389 ms 1.306 ms -6.0%
future-pay-out-1 470.0 μs 417.4 μs -11.2%
future-pay-out-2 1.038 ms 942.0 μs -9.2%
future-pay-out-3 1.041 ms 937.6 μs -9.9%
future-pay-out-4 1.398 ms 1.301 ms -6.9%
future-settle-early-1 471.4 μs 417.3 μs -11.5%
future-settle-early-2 1.038 ms 944.5 μs -9.0%
future-settle-early-3 1.037 ms 942.2 μs -9.1%
future-settle-early-4 1.093 ms 1.020 ms -6.7%
game-sm-success_1-1 788.0 μs 700.1 μs -11.2%
game-sm-success_1-2 430.4 μs 361.6 μs -16.0%
game-sm-success_1-3 1.169 ms 1.075 ms -8.0%
game-sm-success_1-4 507.7 μs 423.1 μs -16.7%
game-sm-success_2-1 788.3 μs 697.8 μs -11.5%
game-sm-success_2-2 431.3 μs 362.5 μs -16.0%
game-sm-success_2-3 1.166 ms 1.078 ms -7.5%
game-sm-success_2-4 505.2 μs 422.6 μs -16.3%
game-sm-success_2-5 1.168 ms 1.075 ms -8.0%
game-sm-success_2-6 504.3 μs 422.9 μs -16.1%
multisig-sm-1 795.3 μs 711.8 μs -10.5%
multisig-sm-2 779.9 μs 696.4 μs -10.7%
multisig-sm-3 790.7 μs 706.3 μs -10.7%
multisig-sm-4 799.1 μs 713.1 μs -10.8%
multisig-sm-5 1.037 ms 961.8 μs -7.3%
multisig-sm-6 797.8 μs 712.8 μs -10.7%
multisig-sm-7 780.0 μs 696.3 μs -10.7%
multisig-sm-8 793.1 μs 704.5 μs -11.2%
multisig-sm-9 799.8 μs 713.3 μs -10.8%
multisig-sm-10 1.038 ms 958.6 μs -7.6%
ping-pong-1 652.2 μs 579.7 μs -11.1%
ping-pong-2 650.9 μs 581.3 μs -10.7%
ping-pong_2-1 416.1 μs 357.9 μs -14.0%
prism-1 358.4 μs 298.1 μs -16.8%
prism-2 851.7 μs 760.0 μs -10.8%
prism-3 731.9 μs 639.9 μs -12.6%
pubkey-1 306.5 μs 255.7 μs -16.6%
stablecoin_1-1 1.625 ms 1.489 ms -8.4%
stablecoin_1-2 422.1 μs 352.5 μs -16.5%
stablecoin_1-3 1.856 ms 1.693 ms -8.8%
stablecoin_1-4 452.0 μs 376.4 μs -16.7%
stablecoin_1-5 2.370 ms 2.134 ms -10.0%
stablecoin_1-6 556.3 μs 466.9 μs -16.1%
stablecoin_2-1 1.622 ms 1.487 ms -8.3%
stablecoin_2-2 422.5 μs 353.4 μs -16.4%
stablecoin_2-3 1.848 ms 1.698 ms -8.1%
stablecoin_2-4 447.5 μs 376.1 μs -16.0%
token-account-1 367.0 μs 323.7 μs -11.8%
token-account-2 647.6 μs 577.4 μs -10.8%
uniswap-1 736.1 μs 680.5 μs -7.6%
uniswap-2 448.4 μs 389.4 μs -13.2%
uniswap-3 2.971 ms 2.728 ms -8.2%
uniswap-4 729.8 μs 616.7 μs -15.5%
uniswap-5 2.133 ms 1.934 ms -9.3%
uniswap-6 700.5 μs 586.9 μs -16.2%
vesting-1 669.0 μs 598.3 μs -10.6%

@effectfully
Contributor Author

/benchmark

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and '2a500e25b' (PR)

Script c8c5183 2a500e2 Change
auction_1-1 383.7 μs 322.9 μs -15.8%
auction_1-2 1.152 ms 1.067 ms -7.4%
auction_1-3 1.155 ms 1.072 ms -7.2%
auction_1-4 510.6 μs 425.9 μs -16.6%
auction_2-1 386.1 μs 325.4 μs -15.7%
auction_2-2 1.149 ms 1.072 ms -6.7%
auction_2-3 1.447 ms 1.352 ms -6.6%
auction_2-4 1.160 ms 1.071 ms -7.7%
auction_2-5 510.4 μs 424.3 μs -16.9%
crowdfunding-success-1 456.4 μs 381.3 μs -16.5%
crowdfunding-success-2 456.7 μs 379.7 μs -16.9%
crowdfunding-success-3 457.2 μs 380.4 μs -16.8%
currency-1 469.8 μs 415.2 μs -11.6%
escrow-redeem_1-1 711.2 μs 623.4 μs -12.3%
escrow-redeem_1-2 712.2 μs 625.8 μs -12.1%
escrow-redeem_2-1 820.1 μs 663.9 μs -19.0%
escrow-redeem_2-2 818.6 μs 663.1 μs -19.0%
escrow-redeem_2-3 819.9 μs 662.9 μs -19.1%
escrow-refund-1 339.1 μs 285.8 μs -15.7%
future-increase-margin-1 471.2 μs 417.5 μs -11.4%
future-increase-margin-2 1.039 ms 943.1 μs -9.2%
future-increase-margin-3 1.038 ms 940.2 μs -9.4%
future-increase-margin-4 948.8 μs 873.1 μs -8.0%
future-increase-margin-5 1.391 ms 1.304 ms -6.3%
future-pay-out-1 469.7 μs 417.2 μs -11.2%
future-pay-out-2 1.036 ms 941.7 μs -9.1%
future-pay-out-3 1.034 ms 937.9 μs -9.3%
future-pay-out-4 1.389 ms 1.298 ms -6.6%
future-settle-early-1 469.7 μs 416.9 μs -11.2%
future-settle-early-2 1.037 ms 941.2 μs -9.2%
future-settle-early-3 1.037 ms 938.1 μs -9.5%
future-settle-early-4 1.098 ms 1.014 ms -7.7%
game-sm-success_1-1 792.1 μs 699.6 μs -11.7%
game-sm-success_1-2 433.2 μs 362.7 μs -16.3%
game-sm-success_1-3 1.168 ms 1.081 ms -7.4%
game-sm-success_1-4 507.0 μs 424.0 μs -16.4%
game-sm-success_2-1 789.8 μs 698.1 μs -11.6%
game-sm-success_2-2 432.1 μs 361.8 μs -16.3%
game-sm-success_2-3 1.168 ms 1.077 ms -7.8%
game-sm-success_2-4 508.0 μs 421.8 μs -17.0%
game-sm-success_2-5 1.169 ms 1.075 ms -8.0%
game-sm-success_2-6 507.7 μs 421.5 μs -17.0%
multisig-sm-1 796.4 μs 707.7 μs -11.1%
multisig-sm-2 777.9 μs 695.6 μs -10.6%
multisig-sm-3 792.6 μs 702.3 μs -11.4%
multisig-sm-4 796.3 μs 709.3 μs -10.9%
multisig-sm-5 1.037 ms 955.7 μs -7.8%
multisig-sm-6 795.6 μs 707.3 μs -11.1%
multisig-sm-7 782.0 μs 693.1 μs -11.4%
multisig-sm-8 790.6 μs 702.6 μs -11.1%
multisig-sm-9 797.5 μs 710.7 μs -10.9%
multisig-sm-10 1.036 ms 960.8 μs -7.3%
ping-pong-1 652.4 μs 581.1 μs -10.9%
ping-pong-2 655.3 μs 580.9 μs -11.4%
ping-pong_2-1 418.5 μs 358.5 μs -14.3%
prism-1 360.4 μs 298.9 μs -17.1%
prism-2 852.5 μs 760.7 μs -10.8%
prism-3 731.5 μs 641.3 μs -12.3%
pubkey-1 307.1 μs 257.7 μs -16.1%
stablecoin_1-1 1.621 ms 1.484 ms -8.5%
stablecoin_1-2 423.0 μs 352.3 μs -16.7%
stablecoin_1-3 1.847 ms 1.689 ms -8.6%
stablecoin_1-4 450.8 μs 374.0 μs -17.0%
stablecoin_1-5 2.352 ms 2.128 ms -9.5%
stablecoin_1-6 557.6 μs 463.6 μs -16.9%
stablecoin_2-1 1.622 ms 1.481 ms -8.7%
stablecoin_2-2 421.8 μs 352.4 μs -16.5%
stablecoin_2-3 1.854 ms 1.694 ms -8.6%
stablecoin_2-4 450.4 μs 374.7 μs -16.8%
token-account-1 368.3 μs 322.5 μs -12.4%
token-account-2 653.1 μs 573.9 μs -12.1%
uniswap-1 737.9 μs 676.3 μs -8.3%
uniswap-2 448.4 μs 388.0 μs -13.5%
uniswap-3 2.978 ms 2.732 ms -8.3%
uniswap-4 733.6 μs 616.4 μs -16.0%
uniswap-5 2.135 ms 1.935 ms -9.4%
uniswap-6 698.7 μs 588.5 μs -15.8%
vesting-1 668.1 μs 599.4 μs -10.3%

@effectfully
Contributor Author

/benchmark

@iohk-devops

Comparing benchmark results of 'c8c5183f7' (base) and 'eb1706059' (PR)

Script c8c5183 eb17060 Change
auction_1-1 384.4 μs 323.5 μs -15.8%
auction_1-2 1.157 ms 1.061 ms -8.3%
auction_1-3 1.161 ms 1.067 ms -8.1%
auction_1-4 508.3 μs 424.2 μs -16.5%
auction_2-1 384.9 μs 326.0 μs -15.3%
auction_2-2 1.156 ms 1.070 ms -7.4%
auction_2-3 1.441 ms 1.353 ms -6.1%
auction_2-4 1.156 ms 1.072 ms -7.3%
auction_2-5 507.9 μs 425.2 μs -16.3%
crowdfunding-success-1 454.9 μs 381.5 μs -16.1%
crowdfunding-success-2 454.9 μs 381.1 μs -16.2%
crowdfunding-success-3 456.1 μs 381.8 μs -16.3%
currency-1 469.6 μs 415.8 μs -11.5%
escrow-redeem_1-1 712.9 μs 622.4 μs -12.7%
escrow-redeem_1-2 713.5 μs 622.8 μs -12.7%
escrow-redeem_2-1 821.3 μs 662.1 μs -19.4%
escrow-redeem_2-2 818.6 μs 662.9 μs -19.0%
escrow-redeem_2-3 821.8 μs 662.5 μs -19.4%
escrow-refund-1 336.4 μs 287.6 μs -14.5%
future-increase-margin-1 468.6 μs 416.1 μs -11.2%
future-increase-margin-2 1.041 ms 938.4 μs -9.9%
future-increase-margin-3 1.039 ms 942.4 μs -9.3%
future-increase-margin-4 951.9 μs 874.7 μs -8.1%
future-increase-margin-5 1.389 ms 1.303 ms -6.2%
future-pay-out-1 470.3 μs 417.3 μs -11.3%
future-pay-out-2 1.036 ms 938.6 μs -9.4%
future-pay-out-3 1.035 ms 935.4 μs -9.6%
future-pay-out-4 1.385 ms 1.299 ms -6.2%
future-settle-early-1 470.1 μs 416.3 μs -11.4%
future-settle-early-2 1.036 ms 938.9 μs -9.4%
future-settle-early-3 1.034 ms 939.2 μs -9.2%
future-settle-early-4 1.094 ms 1.013 ms -7.4%
game-sm-success_1-1 790.1 μs 698.9 μs -11.5%
game-sm-success_1-2 432.7 μs 361.6 μs -16.4%
game-sm-success_1-3 1.169 ms 1.075 ms -8.0%
game-sm-success_1-4 508.0 μs 424.5 μs -16.4%
game-sm-success_2-1 791.0 μs 698.5 μs -11.7%
game-sm-success_2-2 431.1 μs 362.5 μs -15.9%
game-sm-success_2-3 1.172 ms 1.078 ms -8.0%
game-sm-success_2-4 507.3 μs 423.7 μs -16.5%
game-sm-success_2-5 1.170 ms 1.076 ms -8.0%
game-sm-success_2-6 506.1 μs 422.4 μs -16.5%
multisig-sm-1 796.4 μs 707.6 μs -11.2%
multisig-sm-2 779.4 μs 693.7 μs -11.0%
multisig-sm-3 791.0 μs 703.8 μs -11.0%
multisig-sm-4 799.6 μs 709.0 μs -11.3%
multisig-sm-5 1.037 ms 956.0 μs -7.8%
multisig-sm-6 795.0 μs 709.7 μs -10.7%
multisig-sm-7 779.2 μs 696.2 μs -10.7%
multisig-sm-8 792.0 μs 703.7 μs -11.1%
multisig-sm-9 799.8 μs 710.2 μs -11.2%
multisig-sm-10 1.038 ms 958.1 μs -7.7%
ping-pong-1 652.2 μs 579.7 μs -11.1%
ping-pong-2 651.2 μs 580.2 μs -10.9%
ping-pong_2-1 416.6 μs 357.1 μs -14.3%
prism-1 360.2 μs 297.4 μs -17.4%
prism-2 855.1 μs 759.7 μs -11.2%
prism-3 732.4 μs 641.0 μs -12.5%
pubkey-1 306.5 μs 259.5 μs -15.3%
stablecoin_1-1 1.625 ms 1.494 ms -8.1%
stablecoin_1-2 424.0 μs 354.3 μs -16.4%
stablecoin_1-3 1.852 ms 1.694 ms -8.5%
stablecoin_1-4 449.3 μs 375.4 μs -16.4%
stablecoin_1-5 2.352 ms 2.135 ms -9.2%
stablecoin_1-6 555.3 μs 465.8 μs -16.1%
stablecoin_2-1 1.625 ms 1.485 ms -8.6%
stablecoin_2-2 423.1 μs 352.3 μs -16.7%
stablecoin_2-3 1.851 ms 1.684 ms -9.0%
stablecoin_2-4 448.9 μs 373.8 μs -16.7%
token-account-1 369.2 μs 322.2 μs -12.7%
token-account-2 653.7 μs 575.2 μs -12.0%
uniswap-1 738.4 μs 676.8 μs -8.3%
uniswap-2 449.5 μs 388.1 μs -13.7%
uniswap-3 2.983 ms 2.718 ms -8.9%
uniswap-4 734.1 μs 613.9 μs -16.4%
uniswap-5 2.151 ms 1.927 ms -10.4%
uniswap-6 697.3 μs 586.6 μs -15.9%
vesting-1 667.0 μs 600.2 μs -10.0%

@effectfully
Contributor Author

It seemed like liftEither was slower for certain contracts, but after rerunning the benchmarks I got the same results as without liftEither. Given that I've also added some docs, I'm pressing "merge".

@effectfully merged commit c876b15 into master on Jan 5, 2022
@effectfully deleted the effectfully/builtins/performance/monomorphize-readKnown branch on January 5, 2022 00:09
@effectfully requested a review from kwxm on January 5, 2022 00:33

@effectfully
Contributor Author

Oops, @kwxm, sorry, I forgot to ask for your review. Doing that now. I'll address comments in another PR.

@kwxm
Contributor

kwxm commented Jan 6, 2022

I don't have much to say, except that the performance changes seem rather mysterious. Maybe we should try some detailed profiling on one of the examples with a big speedup to see if anything jumps out.

@kwxm
Contributor

kwxm commented Jan 6, 2022

Also, looking at the top of Typed.hs I see

{-# LANGUAGE BlockArguments           #-}
{-# LANGUAGE ConstraintKinds          #-}
{-# LANGUAGE DataKinds                #-}
{-# LANGUAGE DefaultSignatures        #-}
{-# LANGUAGE FlexibleInstances        #-}
{-# LANGUAGE GADTs                    #-}
{-# LANGUAGE LambdaCase               #-}
{-# LANGUAGE MultiParamTypeClasses    #-}
{-# LANGUAGE OverloadedStrings        #-}
{-# LANGUAGE PolyKinds                #-}
{-# LANGUAGE QuantifiedConstraints    #-}
{-# LANGUAGE RankNTypes               #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE TypeApplications         #-}
{-# LANGUAGE TypeFamilies             #-}
{-# LANGUAGE TypeOperators            #-}
{-# LANGUAGE UndecidableInstances     #-}
{-# LANGUAGE UndecidableSuperClasses  #-}

{-# LANGUAGE StrictData               #-}

It's a pity we can't tell it which extensions we don't want...

@thealmarty
Contributor

HLS told me that we don't need StandaloneKindSignatures. I'll remove it in my upcoming PR.

@effectfully
Contributor Author

@thealmarty I'm using it in #4312, which I'm planning to get merged next, so don't bother.

MaximilianAlgehed pushed a commit to Quviq/plutus that referenced this pull request Mar 3, 2022
This gave us a speedup of up to 19% for unclear reasons.