
performance on clamp #179

Closed
johnnychen94 opened this issue Mar 19, 2020 · 9 comments
Comments

@johnnychen94 (Collaborator) commented Mar 19, 2020

This is expected, but I think it can be optimized.

julia> @btime clamp!($(rand(N0f8, 512, 512)), 0.1, 0.9);
  813.628 μs (0 allocations: 0 bytes)

julia> @btime clamp!($(rand(N0f8, 512, 512)), 0.1N0f8, 0.9N0f8);
  628.583 μs (0 allocations: 0 bytes)

julia> @btime clamp.($(rand(N0f8, 512, 512)), 0.1, 0.9);
  691.843 μs (3 allocations: 2.00 MiB)

julia> @btime clamp.($(rand(N0f8, 512, 512)), 0.1N0f8, 0.9N0f8);
  1.146 ms (3 allocations: 256.19 KiB)

Ref: observed in JuliaImages/ImageContrastAdjustment.jl#28 (comment)

@kimikage (Collaborator) commented Mar 19, 2020

I don't understand what the benchmarks above are meant to show. :confused:

Is this what you mean?

julia> @btime clamp!($(zeros(N0f8, 512, 512)), 0.1, 0.9);
  505.900 μs (0 allocations: 0 bytes)

julia> @btime clamp!($(zeros(N0f8, 512, 512)), 0.1N0f8, 0.9N0f8);
  114.299 μs (0 allocations: 0 bytes)

julia> @btime clamp.($(zeros(N0f8, 512, 512)), 0.1, 0.9);
  687.300 μs (2 allocations: 2.00 MiB)

julia> @btime clamp.($(zeros(N0f8, 512, 512)), 0.1N0f8, 0.9N0f8);
  179.200 μs (2 allocations: 256.14 KiB)

Yes, I also think it can be optimized (specialized) as follows:

Base.clamp(x::X, lo::X, hi::X) where X<:FixedPoint = X(clamp(x.i, lo.i, hi.i), 0)

However, the bottleneck should be identified first. I don't think floating-point conversions are "too" slow. They are inherently expensive.
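To illustrate the idea behind the specialization above: since the ordering of `N0f8` values matches the ordering of their raw `UInt8` representations, clamping the raw integers gives the same answer as clamping the fixed-point values, with no floating-point round trip. A minimal sketch, assuming FixedPointNumbers.jl is loaded:

```julia
using FixedPointNumbers

x  = reinterpret(N0f8, 0xff)   # 1.0N0f8, raw byte 0xff
lo = 0.1N0f8                   # raw byte ≈ 0x1a (rounded)
hi = 0.9N0f8                   # raw byte ≈ 0xe6 (rounded)

# Clamping the raw UInt8 values agrees with clamping the fixed-point
# values themselves, because N0f8 ordering matches UInt8 ordering.
raw = clamp(reinterpret(x), reinterpret(lo), reinterpret(hi))
reinterpret(N0f8, raw) == clamp(x, lo, hi)  # expected to hold
```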

@johnnychen94 (Collaborator, Author) commented Mar 19, 2020

I was thinking of a specialization like this:

Base.clamp(x::X, lo, hi) where X<:FixedPoint = clamp(x, X(lo), X(hi))

I opened this as a tracking issue and haven't yet spent time playing with it; feel free to close it if you think it makes no sense.
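For context on what that definition would change: `Base`'s generic `clamp` promotes across all three argument types, so mixing `N0f8` values with `Float64` bounds changes the return type. A minimal sketch, assuming FixedPointNumbers.jl is loaded:

```julia
using FixedPointNumbers

x = 0.5N0f8

# With Float64 bounds, Base's generic clamp promotes to Float64:
typeof(clamp(x, 0.1, 0.9))          # Float64

# With matching N0f8 bounds, the result stays N0f8:
typeof(clamp(x, 0.1N0f8, 0.9N0f8))  # N0f8

# The proposed method would forward to the same-type method, keeping the
# N0f8 return type at the cost of converting lo and hi once per call.
```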

@kimikage (Collaborator) commented Mar 19, 2020

clamp! can be optimized, but I don't want to do it because there are many "in-place" variants.
clamp must not be changed as you suggested. FixedPoint is not a pixel type (e.g. Gray{N0f8}).

feel free to close it if you think it makes no sense.

This issue clearly makes sense, but I don't know what you want.

@kimikage (Collaborator) commented Mar 20, 2020

TBH, I had believed that the optimization for clamp(x::X, lo::X, hi::X) where X<:FixedPoint was done by the compiler (LLVM backend). I think it is worth introducing. :rocket:

@johnnychen94 (Collaborator, Author) commented Mar 20, 2020

Oh, I wrote this down as a quick note to myself and didn't spend enough time playing with it; I'm sorry I didn't make it clear.

This is expected, but I think it can be optimized.

What I meant to say is: I believe this performance gap is expected, but I think there is merit in proactively converting lo and hi to the FixedPoint type. The general philosophy is not to care about the storage type, since it's hard to intuit the types in practice; when we have to tweak performance by converting/promoting types manually, that's a code smell to me.

I'm in a bit of a hurry catching up with my own deadlines; I'll post an update later if I find anything worth optimizing.

clamp! can be optimized, but I don't want to do it because there are many "in-place" variants.

I don't quite understand the reasoning behind this; can you clarify it a bit? Thanks. My understanding is that in-place functions in Base are worth extending.

@kimikage (Collaborator) commented Mar 20, 2020

clamp! can be optimized, but I don't want to do it because there are many "in-place" variants.

I don't quite understand the reasoning behind this, can you clarify it a bit?

The specialization of many "in-place" methods requires much labor and offers little benefit.

I believe the general philosophy is to not care about storage type

I agree with you. Therefore, I think clamp should be used whenever we can use clamp instead of clamp!. When we use clamp!, we must "definitely" pay attention to the storage type! The performance issues above are the result of "carelessness". :confused:

Also, if we follow the philosophy, clamp MUST NOT convert the "return" type.

@timholy (Member) commented Mar 20, 2020

Certainly the specialization in #179 (comment) makes sense. The one in #179 (comment) is much more dangerous, since currently clamp(x, -Inf, Inf) succeeds for all FixedPoint values but that definition would throw an error since Inf does not have a representation as FixedPoint. We could, however, define clamp_fast and make @fastmath do semantic replacement (this would presumably require a PR to Base). It might take a little work to get it to hoist the conversion for broadcasting, though.

Cases like @fastmath clamp.(A * B, lo, hi) would be trouble, though, if A and B don't have the same element type. This could get complex pretty quickly. The most robust approach is to pass lo and hi of the correct type manually.
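The hazard with bound conversion can be seen directly: the generic promoting clamp handles infinite bounds fine, while converting an out-of-range bound to a fixed-point type throws. A sketch, assuming FixedPointNumbers.jl is loaded:

```julia
using FixedPointNumbers

x = 0.5N0f8

# The generic clamp promotes to Float64, so infinite bounds are harmless;
# the result is a Float64:
clamp(x, -Inf, Inf)

# But a definition that converts the bounds first would throw, because
# Inf has no N0f8 representation:
try
    N0f8(Inf)
catch err
    err isa ArgumentError  # conversion failure, not a clamped value
end
```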

@kimikage (Collaborator)
I merged PR #194. If we intend to improve this further I will keep this issue open; otherwise I will close it.

@johnnychen94 (Collaborator, Author)
Thanks for working on this. #194 looks like a good-enough improvement to me.
