Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: add support for CheckedArithmetic #146

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Conversation

timholy
Copy link
Member

@timholy timholy commented Nov 30, 2019

Here's a draft supporting CheckedArithmetic. Perhaps the key question is resolving the discussion in #143, determining which type should be used as the accumulatortype. For * and /, IMO the only candidate is floattype. We could use Normed{UInt, f} for + and floattype(T) for -, or use Normed{Int, f} for both + and - if we decide to go with #143.

@kimikage
Copy link
Collaborator

kimikage commented Dec 1, 2019

I'm against depending on CheckedArithmetic (at least in the master branch).

No, I was planning to make it a dependency of FixedPointNumbers. There is no way to have a common interface without having a common package.

We could split the @check and @checked out from the accumulatortype and acc, but it's such a small package currently that I don't see a huge advantage in slimming it down further.

Originally posted by @timholy in #41 (comment)

In my opinion, the first thing we should do is implement the checked and unchecked methods. There is a common interface Base.Checked.
Although I think there is room for improvement, we already have promotion interfaces for reduce:

const Treduce = Float64
Base.add_sum(x::FixedPoint, y::FixedPoint) = Treduce(x) + Treduce(y)
Base.reduce_empty(::typeof(Base.add_sum), ::Type{F}) where {F<:FixedPoint} = zero(Treduce)
Base.reduce_first(::typeof(Base.add_sum), x::FixedPoint) = Treduce(x)
Base.mul_prod(x::FixedPoint, y::FixedPoint) = Treduce(x) * Treduce(y)
Base.reduce_empty(::typeof(Base.mul_prod), ::Type{F}) where {F<:FixedPoint} = one(Treduce)
Base.reduce_first(::typeof(Base.mul_prod), x::FixedPoint) = Treduce(x)

@timholy
Copy link
Member Author

timholy commented Dec 1, 2019

In my opinion, the first thing we should do is implement the checked and unchecked methods.

Yes, you're right, I hadn't yet gotten to implementing the checked methods. Definitely an oversight.

I'm against depending on CheckedArithmetic (at least in the master branch).

Can you explain why? Unless we depend on CheckedArithmetic, then the algorithm-writer is going to have to know whether to call FixedPointNumbers.accumulatortype or ColorTypes.accumulatortype or something else. Of course we could effectively move CheckedArithmetic into FixedPointNumbers (i.e., FixedPointNumbers implements accumulatortype), but I am not sure I see the advantage in that, and the disadvantage is that it becomes an Images-only tool. Having it in an external package means it can become a common interface, essentially an extension of Julia itself.

@kimikage
Copy link
Collaborator

kimikage commented Dec 1, 2019

Can you explain why?

First of all, @checked and accumulatortype both start from a single argument, but they have different purposes and characteristics. The latter is reasonable but is not checked at all. Personally, I'm not happy with the accumulatortype which is not safe and does not necessarily contribute to the speedup. However, I don't think it's a bad thing to support it because some people find it useful.
Now, the reason is that CheckedArithmetic is a higher-level package. This is the same reason that FixedPointNumbers should not depend on Colors or Images. If we follow this reason formally, CheckedArithmetic should depend on FixedPointNumbers. Of course, I know there is a problem. So, we should have enough discussion before saying "there is no way".
Moreover, I have another somewhat emotional reason. You are proposing a drastic breaking change because of the convenience of CheckedArithmetic. Of course, I know that you are discussing properly as a developer or educator, and I am grateful for that. However, CheckedArithmetic (accumulatortype) is just one means. This biases the discussion.

@timholy
Copy link
Member Author

timholy commented Dec 1, 2019

Now, the reason is that CheckedArithmetic is a higher-level package.

So basically you're saying accumulatortype would need to be in its own package? We can arrange this. Might have been nice to have made this argument back when I posted #41 (comment), because now the package has been registered and tagged.

@kimikage
Copy link
Collaborator

kimikage commented Dec 1, 2019

So basically you're saying accumulatortype would need to be in its own package?

Generally speaking, it is better to modularize different things individually. However, it is another matter and I do not think it should be done right now. The new package containing accumulatortype will be still a higher-level package.

I didn't understand what accumulatortype was at that time, and I still do not understand. BigInt and BigFloat are "reasonable" as accumulatortype, aren't they? I don't think there's enough discussion about what the accumulatortype should be.

IMO, accumulatortypes should clearly specify their "safe area", and consider the size of arrays. ("specify" does not necessarily mean manifesting in the API.)

@timholy
Copy link
Member Author

timholy commented Dec 1, 2019

BigInt and BigFloat are "reasonable" as accumulatortype, aren't they?

Using the mysum benchmark in #143 (comment),

julia> @btime mysum(UInt(0), $A)
  8.320 ns (0 allocations: 0 bytes)
0x0000000000003566

julia> @btime mysum(BigInt(0), $A)
  6.048 μs (301 allocations: 4.72 KiB)
13670

Given that it's a 1000x slower, I don't think BigInt is a viable accumulatortype for anything other than BigInt element types.

UInt64 allows you to add 7e16 UInt8s without risk of overflow. On my laptop, this would require 5.7e8 seconds, equivalent to 18 years of computation. I think that's "safe" for all practical purposes. Likewise, N56f8 is a safe accumulatortype for N0f8.

For UInt16 (or 16-bit images), UInt64 it would allow one to sum a 560TB image file. Maybe I'll regret this soon, but for now I think that's as big as any practical images get.

As long as we cover the common element types used in image processing, I'd be fine with restricting accumulatortype to such element types, i.e., simply not define accumulatortype for N8f24 element types.

@kimikage
Copy link
Collaborator

kimikage commented Dec 1, 2019

I fully understand that, but that's not the answer I'm looking for. In fact, 18 years is extremely short in my industry.:laughing: In order to avoid such a barren discussion, formal specifications are necessary.

Edit:

If we follow this reason formally, CheckedArithmetic should depend on FixedPointNumbers. Of course, I know there is a problem.

For now, why does not CheckedArihmetic "incubate" the implementations for FixedPointNumbers as well as the basic types in Core or Base?
I think it's better to delegate to individual packages after the APIs are a little more mature. If each package implements accumulatortypes with its own rules, it will damage the usefulness of accumulatortype.

@kimikage
Copy link
Collaborator

kimikage commented Jul 15, 2020

This is mentioned in PR #143, etc., but as far as accumulation is concerned, it is faster and theoretically more accurate not to convert the elements to Float64.

function _sum(a::AbstractArray{N}) where N <: Normed{UInt8}
    if length(a) > 0x01010101
        mapreduce(reinterpret, +, a, init = UInt64(0)) / FixedPointNumbers.rawone(N)
    else
        mapreduce(reinterpret, +, a, init = UInt32(0)) / FixedPointNumbers.rawone(N)
    end
end
julia> x1 = collect(rand(N0f8, 4096, 4096));

julia> x2 = collect(rand(N0f8, 8192, 4096));

julia> @btime sum($x1);
  2.795 ms (0 allocations: 0 bytes)

julia> @btime sum($x2);
  5.688 ms (0 allocations: 0 bytes)

julia> @btime _sum($x1);
  933.800 μs (0 allocations: 0 bytes)

julia> @btime _sum($x2);
  2.624 ms (0 allocations: 0 bytes)

The reason I mentioned this PR after a long time is that, IIUC, we are moving towards making checked_* the default arithmetic. (I personally want wrapping_* to be the default, as Julia's default is, though.)
If it is theoretically clear that it will not overflow, performing overflow checking is obviously useless. The way to avoid the checked_* is to convert the elements to Float64, i.e. Treduce. As the length(a) > 0x01010101 above implies, using Float32 can easily lead to inaccurate results. I think that the error is acceptable for most applications, but I think it goes against the intention of using FixedPointNumbers with CheckedArithmetic.

@kimikage
Copy link
Collaborator

Now that we have the package extension mechanism and Preferences.jl, we need to revisit this.
(GitHub used to not have the ability to set a PR to draft.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants