Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No precomp #181

Merged
merged 2 commits into from
Jun 27, 2023
Merged

No precomp #181

merged 2 commits into from
Jun 27, 2023

Conversation

chriselrod
Copy link
Collaborator

@chriselrod chriselrod commented Jun 27, 2023

Precompilation kills runtime performance because of Julia bugs.
Master:

julia> @benchmark matmul!($Cf64, $Af64, $Bf64)
BenchmarkTools.Trial: 3727 samples with 1 evaluation.
 Range (min  max):  258.678 μs  279.208 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     265.012 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   265.171 μs ±   1.905 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                     ▁▂▄▄▆█▆▆▇▇▇█▇▇▃▄▃▁▁                         
  ▂▁▁▁▁▂▂▁▂▃▂▃▃▃▄▄▅▅▇████████████████████▆▆▅▄▅▄▄▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂ ▅
  259 μs           Histogram: frequency by time          272 μs <

 Memory estimate: 272 bytes, allocs estimate: 17.

PR:

julia> @benchmark matmul!($Cf64, $Af64, $Bf64)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  23.756 μs   43.892 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     24.462 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.547 μs ± 586.526 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▁▃▅▆██▆▄▃▁                                           
  ▂▂▂▂▂▃▃▅▆██████████▇▅▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  23.8 μs         Histogram: frequency by time         26.7 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

@ranocha have you seen this?

This was the benchmark script I used:

julia> A = rand(-1_000:1_000, 200, 200);

julia> B = rand(-1_000:1_000, 200, 200);

julia> C = similar(A);

julia> @benchmark mul!($C, $A, $B)
BenchmarkTools.Trial: 411 samples with 1 evaluation.
 Range (min  max):  2.283 ms   12.646 ms  ┊ GC (min  max): 0.00%  77.02%
 Time  (median):     2.372 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.430 ms ± 515.259 μs  ┊ GC (mean ± σ):  0.98% ±  3.80%

         ██        ▁                                           
  ▅▂▁▁▁▂▃██▄▄▅▅▅▄▄▇█▅▃▄▆▄▃▂▃▁▂▁▂▁▁▂▁▂▃▁▁▁▃▁▂▂▂▃▃▃▃▂▁▂▂▁▄▅▃▁▃▂ ▃
  2.28 ms         Histogram: frequency by time        2.67 ms <

 Memory estimate: 30.77 KiB, allocs estimate: 4.

julia> @benchmark matmul!($C, $A, $B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  90.987 μs  118.231 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     91.633 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.750 μs ±   2.644 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃▇█▅    ▂▄▃▁ ▃▆▅▁     ▁▁                                     ▂
  ████▇▃▄▆████▇████▁▃▁▆███▇▆▅▅▁▁▃▁▁▁▁▃▁▁▁▁▃▁▁▁▁▁▁▁▅▆▆▅▄▆▅▆▆▇▆▄ █
  91 μs         Histogram: log(frequency) by time       106 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> 2.369e-3 / 91.302e-6
25.946857681102276

julia> Af64 = Float64.(A); Bf64 = Float64.(B); Cf64 = similar(Af64);

julia> @benchmark mul!($Cf64, $Af64, $Bf64)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  14.808 μs  34.542 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     15.224 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.367 μs ±  1.002 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▄█▆▂                                                      
  ▂▃▆████▆▄▃▂▂▂▁▁▂▂▂▂▂▁▂▁▁▁▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▁▂▂▂▂ ▃
  14.8 μs         Histogram: frequency by time        19.7 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark matmul!($Cf64, $Af64, $Bf64)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  23.756 μs   43.892 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     24.462 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.547 μs ± 586.526 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▁▃▅▆██▆▄▃▁                                           
  ▂▂▂▂▂▃▃▅▆██████████▇▅▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  23.8 μs         Histogram: frequency by time         26.7 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> Cf64 == C
true

@chriselrod chriselrod enabled auto-merge (squash) June 27, 2023 17:51
@codecov
Copy link

codecov bot commented Jun 27, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.47 ⚠️

Comparison is base (c63b5b9) 90.40% compared to head (9e0551e) 89.93%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #181      +/-   ##
==========================================
- Coverage   90.40%   89.93%   -0.47%     
==========================================
  Files          14       14              
  Lines        1146     1133      -13     
==========================================
- Hits         1036     1019      -17     
- Misses        110      114       +4     
Impacted Files Coverage Δ
src/Octavian.jl 50.00% <ø> (-50.00%) ⬇️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@chriselrod chriselrod requested a review from ranocha June 27, 2023 18:03
@chriselrod chriselrod merged commit dcc448d into master Jun 27, 2023
@chriselrod chriselrod deleted the noprecomp branch June 27, 2023 18:25
@ranocha
Copy link
Member

ranocha commented Jun 28, 2023

@ranocha have you seen this?

No - but I have to admit that I haven't been running a lot of matmul!-based code recently, focused on other parts of our codebase instead.

Thanks a lot for fixing this! Do you have an idea which Julia bugs are involved here?

@chriselrod
Copy link
Collaborator Author

Thanks a lot for fixing this! Do you have an idea which Julia bugs are involved here?

No, but bugs such as this have been known for a long time.
TriangularSolve.jl had to disable precompilation as well for this reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants