s2: Add AMD64 assembly for better mode #315

Merged Feb 25, 2021 (10 commits)

klauspost commented Feb 9, 2021

Blocks:

benchmark                              old ns/op     new ns/op     delta
BenchmarkTwainEncode1e1/better-32      10.7          10.5          -1.87%
BenchmarkTwainEncode1e2/better-32      2947          280           -90.50%
BenchmarkTwainEncode1e3/better-32      6664          2525          -62.11%
BenchmarkTwainEncode1e4/better-32      47401         25461         -46.29%
BenchmarkTwainEncode1e5/better-32      528060        417367        -20.96%
BenchmarkTwainEncode1e6/better-32      2137499       1554364       -27.28%

benchmark                                                  old ns/op     new ns/op     delta
BenchmarkRandomEncodeBetterBlock1MB-32                     39476         38241         -3.13%
BenchmarkEncodeS2Block/0-html/block-better-32              10140         6761          -33.32%
BenchmarkEncodeS2Block/1-urls/block-better-32              141170        90141         -36.15%
BenchmarkEncodeS2Block/2-jpg/block-better-32               1026          848           -17.35%
BenchmarkEncodeS2Block/3-jpg_200b/block-better-32          332           24.3          -92.68%
BenchmarkEncodeS2Block/4-pdf/block-better-32               12266         7164          -41.59%
BenchmarkEncodeS2Block/5-html4/block-better-32             14229         8134          -42.84%
BenchmarkEncodeS2Block/6-txt1/block-better-32              40537         27718         -31.62%
BenchmarkEncodeS2Block/7-txt2/block-better-32              35890         24783         -30.95%
BenchmarkEncodeS2Block/8-txt3/block-better-32              104525        77463         -25.89%
BenchmarkEncodeS2Block/9-txt4/block-better-32              144537        104121        -27.96%
BenchmarkEncodeS2Block/10-pb/block-better-32               9017          5427          -39.81%
BenchmarkEncodeS2Block/11-gaviota/block-better-32          31386         20973         -33.18%
BenchmarkEncodeS2Block/12-txt1_128b/block-better-32        312           16.4          -94.74%
BenchmarkEncodeS2Block/13-txt1_1000b/block-better-32       578           136           -76.47%
BenchmarkEncodeS2Block/14-txt1_10000b/block-better-32      3278          1293          -60.56%
BenchmarkEncodeS2Block/15-txt1_20000b/block-better-32      6469          3820          -40.95%

benchmark                                                  old MB/s      new MB/s      speedup
BenchmarkRandomEncodeBetterBlock1MB-32                     26562.09      27420.04      1.03x
BenchmarkEncodeS2Block/0-html/block-better-32              10098.47      15145.41      1.50x
BenchmarkEncodeS2Block/1-urls/block-better-32              4973.34       7788.75       1.57x
BenchmarkEncodeS2Block/2-jpg/block-better-32               119973.57     145200.76     1.21x
BenchmarkEncodeS2Block/3-jpg_200b/block-better-32          602.41        8241.97       13.68x
BenchmarkEncodeS2Block/4-pdf/block-better-32               8348.31       14293.26      1.71x
BenchmarkEncodeS2Block/5-html4/block-better-32             28786.61      50355.67      1.75x
BenchmarkEncodeS2Block/6-txt1/block-better-32              3751.82       5486.93       1.46x
BenchmarkEncodeS2Block/7-txt2/block-better-32              3487.81       5051.03       1.45x
BenchmarkEncodeS2Block/8-txt3/block-better-32              4082.81       5509.15       1.35x
BenchmarkEncodeS2Block/9-txt4/block-better-32              3333.82       4627.90       1.39x
BenchmarkEncodeS2Block/10-pb/block-better-32               13151.91      21850.98      1.66x
BenchmarkEncodeS2Block/11-gaviota/block-better-32          5872.67       8788.25       1.50x
BenchmarkEncodeS2Block/12-txt1_128b/block-better-32        410.38        7791.86       18.99x
BenchmarkEncodeS2Block/13-txt1_1000b/block-better-32       1729.19       7370.56       4.26x
BenchmarkEncodeS2Block/14-txt1_10000b/block-better-32      3050.66       7736.81       2.54x
BenchmarkEncodeS2Block/15-txt1_20000b/block-better-32      3091.47       5235.17       1.69x
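
For readers checking the tables, the `delta` and `speedup` columns can be reproduced from the raw figures. The sketch below is only an illustration of that arithmetic; the helper names `deltaPercent` and `speedup` are made up for this example and are not part of the s2 package or of any benchmark tool.

```go
package main

import "fmt"

// deltaPercent returns the relative change from old to new in percent,
// matching the "delta" column (negative means the new code is faster).
// Hypothetical helper, for illustration only.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// speedup returns new throughput divided by old throughput,
// matching the "speedup" column. Hypothetical helper.
func speedup(oldMBs, newMBs float64) float64 {
	return newMBs / oldMBs
}

func main() {
	// Input values are copied from the tables above.
	fmt.Printf("TwainEncode1e2/better: %+.2f%%\n", deltaPercent(2947, 280))
	fmt.Printf("12-txt1_128b:          %+.2f%%\n", deltaPercent(312, 16.4))
	fmt.Printf("0-html:                %.2fx\n", speedup(10098.47, 15145.41))
	fmt.Printf("12-txt1_128b:          %.2fx\n", speedup(410.38, 7791.86))
}
```

The largest gains are on the smallest inputs (`3-jpg_200b`, `12-txt1_128b`), where the fixed per-call cost makes up most of the measured time.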

Streams, with/without assembly (first/second line of each pair), 16 cores:

github-june-2days-2019.json:
Compressing... 6273951764 -> 949146808 [15.13%]; 564ms, 10608.7MB/s
Compressing... 6273951764 -> 950079555 [15.14%]; 722ms, 8287.1MB/s

github-ranks-backup.bin:
Compressing... 1862623243 -> 555069246 [29.80%]; 261ms, 6805.8MB/s
Compressing... 1862623243 -> 555617002 [29.83%]; 384ms, 4625.9MB/s

enwik9:
Compressing... 1000000000 -> 426854233 [42.69%]; 229ms, 4164.5MB/s
Compressing... 1000000000 -> 427660256 [42.77%]; 333ms, 2863.9MB/s

nyc-taxi-data-10M.csv:
Compressing... 3325605752 -> 954776589 [28.71%]; 491ms, 6459.4MB/s
Compressing... 3325605752 -> 960330423 [28.88%]; 608ms, 5216.4MB/s

sharnd.out.2gb:
Compressing... 2147483647 -> 2147487753 [100.00%]; 174ms, 11770.0MB/s
Compressing... 2147483647 -> 2147487753 [100.00%]; 172ms, 11907.1MB/s
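
The percentage and throughput on each line follow from the byte counts and the elapsed time. Working backwards from the numbers, the `MB/s` figures appear to use 2^20-byte units; that is an inference from the output above, not something stated in the PR. A minimal sketch of the arithmetic, with hypothetical helper names:

```go
package main

import "fmt"

// ratioPercent returns the compressed size as a percentage of the input size.
// Hypothetical helper, for illustration only.
func ratioPercent(inBytes, outBytes int64) float64 {
	return float64(outBytes) / float64(inBytes) * 100
}

// throughputMiB returns input throughput in units of 2^20 bytes per second.
// Assumption: this is the unit behind the "MB/s" figures above.
func throughputMiB(inBytes, millis int64) float64 {
	seconds := float64(millis) / 1000
	return float64(inBytes) / (1 << 20) / seconds
}

func main() {
	// enwik9, first line of the pair above.
	in, out, ms := int64(1000000000), int64(426854233), int64(229)
	fmt.Printf("%d -> %d [%.2f%%]; %dms, %.1fMB/s\n",
		in, out, ratioPercent(in, out), ms, throughputMiB(in, ms))
}
```

Because the elapsed time is printed in whole milliseconds, a recomputed throughput can differ from the logged one in the last digit.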

klauspost marked this pull request as ready for review February 19, 2021 09:40