regression in performance of counting loops #5469

Closed
mlubin opened this issue Jan 22, 2014 · 9 comments
Labels: performance (must go faster), regression (regression in behavior compared to a previous version)

@mlubin
Member

mlubin commented Jan 22, 2014

I was playing around with an example for the class I'm preparing and noticed a 30% performance regression in the following code from 0.2 to master:

function normAndMin(x)
    n = 0.0
    m = Inf
    for i in 1:length(x)
        n += x[i]*x[i]
        if x[i] < m
            m = x[i]
        end
    end
    return sqrt(n),m
end

x = rand(100_000_000)
normAndMin(x) # warm-up call, so compilation is not included in the timing
n,m = @time normAndMin(x)

Julia 0.2:
elapsed time: 0.101895311 seconds (6912 bytes allocated)

Julia master:
elapsed time: 0.135729463 seconds (6812 bytes allocated)

@JeffBezanson
Member

This is certainly due to 0860767.

@mlubin
Member Author

mlubin commented Jan 22, 2014

Any hope for improvement?

@JeffBezanson
Member

Yes, this can be fiddled with, and we might overhaul integer ranges partly for this purpose.

@ghost ghost assigned JeffBezanson Jan 22, 2014
JeffBezanson added a commit that referenced this issue Mar 12, 2014
…t,len

this Range1 could be the UnitRange of #5585, with Range1 deprecated

also intended to address #5469 (performance)
@JeffBezanson
Member

I have been digging deeply into this today. The way we lower iteration may be interfering with LLVM's ability to recognize loop idioms. The following code is slow:

    state = 1
    while state != l+1
        i = state
        state += 1
        n += x[i]*x[i]
        if x[i] < m
            m = x[i]
        end
    end

And simply moving state += 1 to the end is faster:

    state = 1
    while state != l+1
        i = state
        n += x[i]*x[i]
        if x[i] < m
            m = x[i]
        end
        state += 1
    end
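
For anyone who wants to reproduce the comparison, here is a minimal sketch that wraps the two fragments above in complete functions. The function names `norm_min_inc_first` and `norm_min_inc_last` are made up for illustration, and the array size is arbitrary; whether a timing difference shows up will depend on the Julia and LLVM versions and on the machine.

```julia
# Hypothetical wrappers around the two loop fragments; names are illustrative only.

function norm_min_inc_first(x)
    n = 0.0
    m = Inf
    l = length(x)
    state = 1
    while state != l + 1
        i = state
        state += 1              # increment BEFORE the loop body
        n += x[i] * x[i]
        if x[i] < m
            m = x[i]
        end
    end
    return sqrt(n), m
end

function norm_min_inc_last(x)
    n = 0.0
    m = Inf
    l = length(x)
    state = 1
    while state != l + 1
        i = state
        n += x[i] * x[i]
        if x[i] < m
            m = x[i]
        end
        state += 1              # increment AFTER the loop body
    end
    return sqrt(n), m
end

x = rand(1_000_000)

# Warm-up calls so that compilation is not included in the timings.
norm_min_inc_first(x)
norm_min_inc_last(x)

@time norm_min_inc_first(x)
@time norm_min_inc_last(x)
```

Both functions accumulate in the same order, so they should return identical results; only the position of the increment differs.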

@simonster
Member

It looks like, with state += 1 at the beginning, the if is a branch, whereas with state += 1 at the end, it is optimized into a select. And indeed with:

    state = 1
    while state != l+1
        i = state
        state += 1
        n += x[i]*x[i]
        m = ifelse(x[i] < m, x[i], m)
    end

I seem to get the same performance as with state += 1 at the end.
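
A sketch of how one might check this, assuming a recent Julia release where `@code_llvm` is provided by the `InteractiveUtils` standard library (older versions exposed this differently). The function name `norm_min_ifelse` is hypothetical. The idea is to look in the printed IR for a `select` instruction versus a conditional `br` inside the loop; the exact output will vary by version.

```julia
using InteractiveUtils  # provides @code_llvm outside the REPL (assumes a recent Julia)

# Hypothetical wrapper around the ifelse fragment above.
function norm_min_ifelse(x)
    n = 0.0
    m = Inf
    l = length(x)
    state = 1
    while state != l + 1
        i = state
        state += 1                      # increment still at the top of the body
        n += x[i] * x[i]
        m = ifelse(x[i] < m, x[i], m)   # branch-free form of the minimum update
    end
    return sqrt(n), m
end

x = rand(1_000)
norm_min_ifelse(x)                      # warm-up call

# Print the LLVM IR generated for this method and argument type.
@code_llvm norm_min_ifelse(x)
```

Note that `ifelse` evaluates both value arguments unconditionally, which is harmless here because both are plain loads of already-computed values.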

@mlubin
Member Author

mlubin commented Mar 13, 2014

Cool!

@JeffBezanson
Member

Please verify. Worryingly, this varies a bit by machine, but it's either the same or faster on both machines I've tried so far.

@mlubin
Member Author

mlubin commented Mar 13, 2014

Confirmed back to 0.2 timings on my machine.

@timholy
Member

timholy commented Mar 13, 2014

Very nice.

JeffBezanson added a commit that referenced this issue Mar 31, 2014
JeffBezanson added a commit that referenced this issue Apr 1, 2014