LV keeps crashing during benchmarking #447

Closed

mzy2240 opened this issue Dec 1, 2022 · 5 comments

mzy2240 commented Dec 1, 2022

The following script runs fine on its own; however, when benchmarking it with BenchmarkTools, it keeps crashing:

using BenchmarkTools
using LoopVectorization

A = rand(100,100)
A_inv = inv(A)
b = rand(100,1)
c = rand(1,100)
A1_complete = hcat(vcat(A,c), vcat(b,1))
ref = inv(A1_complete)

# Dot product of a and b, vectorized with @turbo.
function jdotavx(a, b)
    s = zero(eltype(a))
    @turbo for i in eachindex(a, b)
        s += a[i] * b[i]
    end
    return s
end

# Matrix-vector product A*b, accumulated column by column.
function rdot(A, b)
    s = zero(b)
    @turbo for j in axes(A, 2)
        for i in axes(A, 1)
            s[i] += A[i, j] * b[j]
        end
    end
    return s
end

# Row-vector product c*A, accumulated row by row.
function cdot(c, A)
    s = zero(c)
    @turbo for i in axes(A, 1)
        for j in axes(A, 2)
            s[j] += A[i, j] * c[i]
        end
    end
    return s
end

# Build inv([A b; c d]) from A_inv in-place via the Schur complement
# (see the identity after the script).
function fast_inverse(result, A_inv, b, c; d=1)
    n = size(A_inv, 1)
    dt_n = size(b, 2)
    e = rdot(A_inv, b)
    f = cdot(c, A_inv)
    g = inv(d - jdotavx(c,e))
    @turbo @. result[1:n, 1:n] = A_inv + g*e*f
    @turbo @. result[1:n, n+1:n+dt_n] = -g*e
    @turbo @. result[n+1:n+dt_n, 1:n] = -g*f
    result[end, end] = g
end
result = zeros(101,101)
fast_inverse(result, A_inv, b, c)  # it runs without any issue

@benchmark fast_inverse(@view(result[:,:]), A_inv, b, c)  # this one keeps crashing in the notebook
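
For context, fast_inverse implements the standard bordered-matrix inverse via the Schur complement (a sketch of the identity the code realizes, assuming scalar d):

\begin{pmatrix} A & b \\ c & d \end{pmatrix}^{-1}
= \begin{pmatrix} A^{-1} + g\,ef & -g\,e \\ -g\,f & g \end{pmatrix},
\qquad e = A^{-1}b, \quad f = cA^{-1}, \quad g = (d - cA^{-1}b)^{-1}.

Here e is 100×1 and f is 1×100, so ef is their outer product and fills the result[1:n, 1:n] block.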

I also attach a screenshot in case it is useful to you. Let me know if you need anything else. Thanks!
[screenshot of the crash]

mzy2240 commented Dec 1, 2022

Julia Version 1.8.2
Commit 36034abf26 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 64 × Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, cascadelake)
  Threads: 32 on 64 virtual cores
Environment:
  JULIA_NUM_THREADS = 32

chriselrod commented Dec 1, 2022

@turbo broadcasting currently doesn't support dynamically broadcasting a contiguous dimension.

That is, a normally contiguous dimension must either be known at compile time to have size 1, or be used at its full size.

It is possible to make this efficient, but I've never had the time.
I'll probably do that at some point in the rewrite.
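
A minimal sketch of the distinction, with made-up array names and sizes (the unsupported line is left commented out):

using LoopVectorization

result = zeros(101, 101)
x = rand(101)
n = 100

# Fine: the contiguous (first) dimension of `result` is used in full;
# only the non-contiguous column dimension is sliced at runtime.
@turbo @. result[:, 1:n] = 2.0 * x

# Unsupported pattern per the explanation above: the contiguous dimension
# is sliced to a runtime-dependent extent (1:n here, n+1:n+dt_n in the
# issue's code), which @turbo broadcasting can mishandle.
# @turbo @. result[1:n, 1:n] = 2.0 * x[1:n]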

mzy2240 commented Dec 1, 2022

Thank you for the quick reply! Surprisingly, it gives the correct result and doesn't crash as long as it is not being benchmarked. And it can be benchmarked without any issues on my MBP...

chriselrod commented Dec 1, 2022

I'd have to take a closer look.
It doesn't crash for me:

julia> @benchmark fast_inverse(@view($result[:,:]), $A_inv, $b, $c)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   9.902 μs … 39.703 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.838 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.702 μs ±  3.534 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▂▃ ▂    ▅▆▆▇█▄▄▃▁▁                        ▁        ▅   ▁▂ ▂
  ██████▇█▇▇██████████▇▇▄▄▅▇█▇▆▁▃▆▄▅▄▁▄▁▄▇▆▅▄▃█▇▆▆▃▇▄▄▃█▆▃███ █
  9.9 μs       Histogram: log(frequency) by time      26.5 μs <

 Memory estimate: 1.75 KiB, allocs estimate: 2.

julia> versioninfo()
Julia Version 1.9.0-DEV.1805
Commit 050f21edc4 (2022-11-10 22:44 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: 28 × Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake-avx512)
  Threads: 28 on 28 virtual cores

Can you try commenting out the @turbo @. result[n+1:n+dt_n, 1:n] = -g*f line to see if it makes a difference?

mzy2240 commented Dec 1, 2022

It stopped crashing after I changed result[end, end] = g to @turbo @. result[n+1:n+dt_n, n+1:n+dt_n] = g. I don't know why, but it just works.
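
For reference, a sketch of the full function with that change applied (the name fast_inverse_fixed is illustrative; it reuses jdotavx, rdot, and cdot from the original post):

function fast_inverse_fixed(result, A_inv, b, c; d=1)
    n = size(A_inv, 1)
    dt_n = size(b, 2)
    e = rdot(A_inv, b)          # A⁻¹ * b
    f = cdot(c, A_inv)          # c * A⁻¹
    g = inv(d - jdotavx(c, e))  # Schur-complement scalar
    @turbo @. result[1:n, 1:n] = A_inv + g * e * f
    @turbo @. result[1:n, n+1:n+dt_n] = -g * e
    @turbo @. result[n+1:n+dt_n, 1:n] = -g * f
    # Previously `result[end, end] = g`; written as a 1×1 @turbo broadcast,
    # the crash reportedly goes away.
    @turbo @. result[n+1:n+dt_n, n+1:n+dt_n] = g
    return result
end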

mzy2240 closed this as completed Dec 1, 2022