Int64*Uint64 #9292

Keno · 2014-12-10T04:35:19Z

This doesn't seem like the intended behavior. Are we accidentally trying to do the multiplication in Uint64?

julia> -1*uint64(1)
ERROR: InexactError()
 in * at promotion.jl:159
```?

The text was updated successfully, but these errors were encountered:

stevengj · 2014-12-10T05:04:12Z

promote_type(Int,UInt) == UInt.

I personally dislike this behavior, similar to the C behavior that unsigned*signed == unsigned (which is even worse — C unsigned is downright dangerous because there is no error if you multiply an unsigned integer by a negative one, it just wraps around to a positive result).

StefanKarpinski · 2014-12-10T05:13:32Z

@stevengj, what would you prefer? I'm pretty open to any kind of persuasive argument here and now is a good time to change this kind of thing.

stevengj · 2014-12-10T05:23:40Z

I would prefer Signed * Unsigned = Signed. Argument: in ordinary math, negative * positive = negative. Deviating from this just to get an extra 1 bit of width in the case of positive values seems unpleasant to me.

(In the days of 16-bit integers, 1 bit was a big deal and maybe this was worth it. Nowadays, it seems a poor tradeoff.)

stevengj · 2014-12-10T05:28:33Z

Google "mix unsigned signed C" for lots of examples of how following C's promotion rules leads to grief. As a result, a lot of people recommend that you just don't use unsigned types in C (google "don't use unsigned").

Keno · 2014-12-10T05:31:09Z

At the very least the current behavior is inconsistent:

julia> -1*uint32(1)
-1

ViralBShah · 2014-12-10T05:44:53Z

The argument presented earlier was that when you use unsigned integers, you have a good reason for doing so and you want them to persist.

I personally prefer Steve's suggestion here.

Keno · 2014-12-10T05:49:13Z

I think that argument is fine but it doesn't work well with the checked conversion from integers to unsigned integers.

StefanKarpinski · 2014-12-10T06:06:10Z

What I suggested elsewhere in passing was having Int * UInt = Int and checked for overflow. Thus, if you want to do unchecked arithmetic you'd have to explicitly cast one of the arguments, as in (a % UInt) * b or a * (b % Int). That's not too terribly inconvenient to write. The advantage I can see for choosing to return an Int rather than UInt is that it can be value-correct for smaller more common arguments, including when the signed argument is negative – at the cost of not being able to represent large positive products. I think that's a good tradeoff, personally.

rfourquet · 2014-12-10T06:08:43Z

UInt64 are integers modulo 64 with positive representatives, so a positive Int64 can be directly interpreted as one of them, it's less clear for a negative Int64 (but I would prefer unchecked conversion if this is faster).
I like that promote_type(Int,UInt) == UInt but would want also promote_type(Int,UInt32) == UInt32.

StefanKarpinski · 2014-12-10T06:17:32Z

@stevengj, I googled "mix unsigned signed C" as you said and found a few relevant links:

http://blog.regehr.org/archives/268 : not relevant – we do signed-unsigned comparisons correctly.
http://stackoverflow.com/questions/2084949/arithmetic-operations-on-unsigned-and-signed-integers: ditto.
http://stackoverflow.com/questions/50605/signed-to-unsigned-conversion-in-c-is-it-always-safe : more relevant, but could be mooted by doing the arithmetic checked and similar issues apply to adding an unsigned value to a large signed value.

This article http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/ phrases my argument quite nicely:

It is very easy to underflow the minimum value of an unsigned int (i.e. zero). It is much more difficult to underflow or overflow the minimum or maximum value of a signed int

So in general, I do agree that it is probably better to produced an Int for mixed UInt/Int arithmetic, but I'm not sure that most of the C horror stories are really relevant to the argument.

StefanKarpinski · 2014-12-10T06:19:13Z

I think that performance concerns are a bit of a false problem here – mixed type operations are not very common in high-performance code and if you really need that speed, you can just use a % Int to convert a UInt to Int with wrapping explicitly.

StefanKarpinski · 2014-12-10T06:25:08Z

Which would lead to a slightly philosophical question of why it's ok for Int/Int arithmetic to overflow but not Uint/Int arithmetic.

toivoh · 2014-12-10T08:13:40Z

+1 for Int * UInt to produce an Int.

stevengj · 2014-12-10T13:14:26Z

Where I have personally seen the current behavior as being undesirable is that often a Uint is used to represent the length of an array or a portion thereof, e.g. when an external C library returns a length as size_t. Suppose you want to access the array as A[offset + i * stride] where i = 0:length-1. i is a UInt if length is, so the code won't work if stride is negative. At least in Julia it will throw an exception. (In C, if you have 32-bit i and stride on a 64-bit machine, then the indices will wrap around to positive values and a segfault will ensue.)

(This came up in implementing FFTW, and is why we ended up using ptrdiff_t/intptr_t rather than size_t to represent transform sizes.)

StefanKarpinski · 2014-12-10T15:14:55Z

This seems to me to mostly be an argument for not using UInt to represent array lengths and offsets. As I've noted before, it has the unfortunate property that if you make a mistake somewhere and compute a negative length, there's no way to tell that you've erred. The very large value just has to be taken at face value and used, resulting in a segfault instead of an easy-to-check-for negative value. With 64-bit words, you really don't need all of those bits for array lengths or indices so giving up half of your range is not a big deal.

nalimilan · 2014-12-10T15:33:54Z

Yet it's true, as in @stevengj's example, that external C libraries may use size_t for array lengths, in part for historical reasons.

StefanKarpinski · 2014-12-10T16:34:46Z

So far the main argument that seems compelling for mixed UInt/Int arithmetic producing Int is that it allows value-correctness for more common arguments – non-huge positive and negative values – whereas returning UInt makes value-correctness impossible for all negative arguments.

StefanKarpinski · 2014-12-10T16:34:59Z

That is, however, quite a good argument, imo.

toivoh · 2014-12-10T18:06:28Z

Agreed. I feel there's an analogy with /(::Int, ::Int) producing a Float64 here - avoiding that functions produce unexpected results because you pad them an unexpected type of real. I guess the full analogy would include -(::UInt,::UInt) producing an Int.

eschnett · 2014-12-13T22:53:39Z

When you are using uint, you often have bit masks etc. where any bit may be set. Explicit overflow checks don't make sense since these would slow things down (and prevent vectorization), and converting e.g. 0x8000000U to signed int doesn't make sense either.

When mixed-signedness operations always convert to signed, then bit mask operations look very cumbersome:

((mask - uint(1)) << uint(3)) | uint(3)

since every literal has to be converted to unsigned.

If you are worried about unsigned ints coming from C, and if you want to treat them like signed ints -- then you may as well convert them explicitly to signed ints.

stevengj · 2015-01-06T16:03:55Z

@eschnett, I think it makes sense for bitwise operations (&, <<, |, $) to continue to promote to unsigned. To unset a bit I would normally recommend mask & ~0x1 rather than mask - 0x1, as the former does the correct thing even if the bit is already unset.

eschnett · 2015-01-06T18:25:33Z

@stevengj I was thinking more of bitpos = uint(4); value &= (1<<bitpos)-1, i.e. the ``-1` would create a mask, not modify a value.

eschnett · 2015-01-06T18:31:34Z

If a-b returns Int for UInt arguments, then (for symmetry) so should a+b. It would be quite surprising if a-b (for integers!) leads to a different result than a+(-b), in particular when people re-write expressions or introduce temporaries. Taking this argument to conclusion, a*b should also return Int.

No matter what convention, there's the danger of an overflow. What about simply disallowing all mixed, "dangerous" operations? Adding Int and UInt is probably rare, and people may as well have to explicitly specify what they expect to happen. If we add a special rule for literal constants (they can be used either with Int or UInt), then this should take all ambiguity and most of the inconvenience out of the game.

StefanKarpinski · 2015-01-07T20:52:18Z

No matter what convention, there's the danger of an overflow. What about simply disallowing all mixed, "dangerous" operations? Adding Int and UInt is probably rare, and people may as well have to explicitly specify what they expect to happen.

That's why I was suggesting that the results of all of these be Int but that they check for overflows. Although there is an argument to be made for disabling them altogether.

eschnett · 2015-01-08T00:33:28Z

My proposal also included treating integer literals in a special way, probably similar to the way math constants are currently handled: UInt + Literal = UInt and Int + Literal = Int. Such a rule should cover most cases of Int + UInt, and the conversions from a literal to either Int or UInt can be checked at compile-time.

ararslan · 2016-10-05T17:30:51Z

I'm with @stevengj here. I think if anywhere in your calculation you have an idea of sign, that should be preserved.

stevengj · 2017-06-22T19:33:00Z

(deleted offtopic question from @udion)

I looked into implementing this, and I agree with @JeffBezanson's comment from #15489. Implementing this as a change to the promotion rules turns out to be a big mess, because it screws up bitwise manipulations (e.g. in float.jl) where you want e.g. unsigned << signed to be unsigned. It looks like it is best implemented as methods for *, + etc.

rfourquet · 2017-06-23T09:03:27Z

if anywhere in your calculation you have an idea of sign, that should be preserved.

It's not totally true, as the simplest way to write a litteral is as an Int. And the same argument could be inversed: "if anywhere you have an idea of unsignedness, that should be preserved", my main use case being using unsigned numbers as bit fields, and incrementing an unsigned (x+1), or constructing more complex masks with multiplications, etc.

I see two main use cases:

an unsigned appears in a computation, coming e.g. from a library (as a length for example), when an Int would be more appropriate: in this case it seems preferable to have UInt*Int -> Int (etc.).
an unsigned is used on purpose in a computation: I would say mixed computations with signed integers should result in unsigned integers. If you have a class ModInt (cf. the examples directory), it wouldn't make sense to have Int*ModInt -> Int; and UInt precisely represents modular integers.

I would favor use case 2 (for use case 1, a conversion to Int can be explicitly written after a ccall to adapt to Julia style, like in "gmp.jl" for example, after all writing the ccall forces the user to know that an unsigned integer is being returned).

StefanKarpinski · 2017-06-25T22:59:11Z

Implementing this as a change to the promotion rules turns out to be a big mess, because it screws up bitwise manipulations (e.g. in float.jl) where you want e.g. unsigned << signed to be unsigned. It looks like it is best implemented as methods for *, + etc.

The result of a bitshift should have the same type as the left argument – where does promotion come into it? E.g.:

julia> 0x1 << 1
0x02

If there's somewhere we're not doing this, it should be fixed.

stevengj · 2017-06-26T12:56:13Z

@StefanKarpinski, it turned out that the problem I was initially running into was not due to bitshift. But I still think we probably want UInt results for bitwise operators like |?

StefanKarpinski · 2017-09-19T16:51:17Z

This is the core "interesting" part of the promotion table:

Ts = sort!(Base.uniontypes(Base.BitInteger), by=T->(!(T<:Signed),sizeof(T)));

julia> for (i, T) in enumerate(Ts), (j, S) in enumerate(Ts)
           i < j || continue
           (T <: Signed) ⊻ (S <: Signed) || continue
           P = promote_type(T, S)
           @printf "%-7s + %-7s => %-7s\n" T S P
       end
Int8    + UInt8   => Int64
Int8    + UInt16  => Int64
Int8    + UInt32  => Int64
Int8    + UInt64  => UInt64
Int8    + UInt128 => UInt128
Int16   + UInt8   => Int64
Int16   + UInt16  => Int64
Int16   + UInt32  => Int64
Int16   + UInt64  => UInt64
Int16   + UInt128 => UInt128
Int32   + UInt8   => Int64
Int32   + UInt16  => Int64
Int32   + UInt32  => Int64
Int32   + UInt64  => UInt64
Int32   + UInt128 => UInt128
Int64   + UInt8   => Int64
Int64   + UInt16  => Int64
Int64   + UInt32  => Int64
Int64   + UInt64  => UInt64
Int64   + UInt128 => UInt128
Int128  + UInt8   => Int128
Int128  + UInt16  => Int128
Int128  + UInt32  => Int128
Int128  + UInt64  => Int128
Int128  + UInt128 => UInt128

@JeffBezanson, could you run this on your little 32-bit machine for the 32-bit version of the table?

omus · 2017-09-20T21:07:33Z

@StefanKarpinski here's is what I get on 32-bit:

Int8    + UInt8   => Int32
Int8    + UInt16  => Int32
Int8    + UInt32  => UInt32
Int8    + UInt64  => UInt64
Int8    + UInt128 => UInt128
Int16   + UInt8   => Int32
Int16   + UInt16  => Int32
Int16   + UInt32  => UInt32
Int16   + UInt64  => UInt64
Int16   + UInt128 => UInt128
Int32   + UInt8   => Int32
Int32   + UInt16  => Int32
Int32   + UInt32  => UInt32
Int32   + UInt64  => UInt64
Int32   + UInt128 => UInt128
Int64   + UInt8   => Int64
Int64   + UInt16  => Int64
Int64   + UInt32  => Int64
Int64   + UInt64  => UInt64
Int64   + UInt128 => UInt128
Int128  + UInt8   => Int128
Int128  + UInt16  => Int128
Int128  + UInt32  => Int128
Int128  + UInt64  => Int128
Int128  + UInt128 => UInt128

StefanKarpinski · 2017-09-21T01:33:46Z

Here's the complete picture then:

type A	type B	32-bit	64-bit
`Int8`	`UInt8`	`Int32`	`Int64`
`Int8`	`UInt16`	`Int32`	`Int64`
`Int8`	`UInt32`	`UInt32`	`Int64`
`Int8`	`UInt64`	`UInt64`	`UInt64`
`Int8`	`UInt128`	`UInt128`	`UInt128`
`Int16`	`UInt8`	`Int32`	`Int64`
`Int16`	`UInt16`	`Int32`	`Int64`
`Int16`	`UInt32`	`UInt32`	`Int64`
`Int16`	`UInt64`	`UInt64`	`UInt64`
`Int16`	`UInt128`	`UInt128`	`UInt128`
`Int32`	`UInt8`	`Int32`	`Int64`
`Int32`	`UInt16`	`Int32`	`Int64`
`Int32`	`UInt32`	`UInt32`	`Int64`
`Int32`	`UInt64`	`UInt64`	`UInt64`
`Int32`	`UInt128`	`UInt128`	`UInt128`
`Int64`	`UInt8`	`Int64`	`Int64`
`Int64`	`UInt16`	`Int64`	`Int64`
`Int64`	`UInt32`	`Int64`	`Int64`
`Int64`	`UInt64`	`UInt64`	`UInt64`
`Int64`	`UInt128`	`UInt128`	`UInt128`
`Int128`	`UInt8`	`Int128`	`Int128`
`Int128`	`UInt16`	`Int128`	`Int128`
`Int128`	`UInt32`	`Int128`	`Int128`
`Int128`	`UInt64`	`Int128`	`Int128`
`Int128`	`UInt128`	`UInt128`	`UInt128`

StefanKarpinski · 2017-09-21T01:37:43Z

One observation is that this is a place where the dependence on Int makes the behavior very hard to reason about across 32-bit and 64-bit systems. I would be appealing to make these rules independent of platform word size so that there's just a single table for this kind of thing. Certainly the cases where whether the result is signed or unsigned depends on the system word size seem pathological and we should at the very least avoid those.

JeffBezanson · 2017-09-21T01:49:22Z

Agreed that the cases where signedness depends on system word size are most ripe for revision.

StefanKarpinski · 2017-09-21T17:08:11Z

The only reasonable platform-independent rules that I can come up with are:

Require explicit casting for mixed signedness operations.
Promote to a signed type that is big enough to hold both inputs.
Promote to a signed type that is as big as the larger of the two inputs.
Promote to an unsigned type that is as big as the larger of the two inputs.

StefanKarpinski · 2017-09-21T17:16:36Z

One observation is that a common pattern to deal with mixed-signedness code is to put local type annotations on the variables and then let implicit conversion ensure that the types are right, so we should consider that pattern when choosing a behavior and ideally pick a design that allows the storage "locations" to be typed and from there let the compiler work things out and generate efficient, correct code.

StefanKarpinski · 2017-09-21T17:52:06Z

From triage, we've decided to give a try to option 4.

StefanKarpinski · 2017-09-21T17:57:01Z

Another option:

Pick the type of the bigger argument, preferring the unsigned one if they're the same size.

- for different-size arguments, the larger type wins - otherwise the unsigned type wins

JeffBezanson · 2017-09-22T21:38:44Z

The remainder of this is part of #15489.

StefanKarpinski · 2017-09-22T22:39:30Z

See also #15489 and PR #23827 fixing it.

StefanKarpinski · 2021-10-14T21:16:27Z

Since there was a complaint today that not enough detail about the motivation for this decision was recorded, I thought I'd write a bit about it. There are annoying cases no matter which rule you pick, so we picked a rule that was simple to understand and works most of the time. We had previously tried to design rules so that the result value was always possible to represent in the promoted type, but that led to very complex rules that were quite hard for people to remember or reason about. So we picked a rule that was simple to understand and works most of the time:

The result type is the larger of the two argument types
If both argument types are the same size and differ in signedness, then the unsigned type wins

The thinking for the last rule is that since Int is the default integer type, if you went to the trouble of introducing UInt presumably you want to continue with that. There are cases where that’s not what you want, but there are also cases where that is what you want.

…ly in signedness. This choice was explained by Karpinski in JuliaLang#9292 (comment) but wasn't fully clear in the documentation.

Clarify that the unsigned type is promoted to, if the types differ only in signedness. This choice was explained by Karpinski in #9292 (comment) but wasn't fully clear in the documentation.

Clarify that the unsigned type is promoted to, if the types differ only in signedness. This choice was explained by Karpinski in JuliaLang#9292 (comment) but wasn't fully clear in the documentation.

stevengj added the needs decision A decision on this change is needed label Dec 10, 2014

toivoh closed this as completed Dec 10, 2014

toivoh reopened this Dec 10, 2014

StefanKarpinski added this to the 0.6.0 milestone Sep 13, 2016

JuliaLang deleted a comment from nalimilan Jun 22, 2017

stevengj mentioned this issue Jun 22, 2017

WIP: preserve signs for +/-/*/... between IntXX and UIntYY #22482

Closed

JeffBezanson self-assigned this Sep 21, 2017

JeffBezanson added a commit that referenced this issue Sep 21, 2017

simplify rules for mixed-signedness integer arithmetic (#9292)

736bfeb

- for different-size arguments, the larger type wins - otherwise the unsigned type wins

JeffBezanson mentioned this issue Sep 21, 2017

simplify rules for mixed-signedness integer arithmetic (#9292) #23811

Merged

JeffBezanson added a commit that referenced this issue Sep 22, 2017

simplify rules for mixed-signedness integer arithmetic (#9292) (#23811)

5af2021

- for different-size arguments, the larger type wins - otherwise the unsigned type wins

JeffBezanson closed this as completed Sep 22, 2017

StefanKarpinski mentioned this issue Nov 2, 2017

promote_type(Int16, Float16) and promote_type(Int32, Float32) produce lossy types #24435

Closed

johnnychen94 mentioned this issue Aug 1, 2019

mixing Fixed and Normed JuliaMath/FixedPointNumbers.jl#126

Closed

TakshDhabalia mentioned this issue Mar 11, 2023

Change in Documentation for Conversion-and-Promotion.md #48972

Closed

robertfeldt mentioned this issue Mar 11, 2023

Clarify that the unsigned type is promoted to, if the types differ only in signedness #48973

Merged

Int64*Uint64 #9292

Int64*Uint64 #9292

Comments

Keno commented Dec 10, 2014

stevengj commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

stevengj commented Dec 10, 2014

stevengj commented Dec 10, 2014

Keno commented Dec 10, 2014

ViralBShah commented Dec 10, 2014

Keno commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

rfourquet commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

toivoh commented Dec 10, 2014

stevengj commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

nalimilan commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

StefanKarpinski commented Dec 10, 2014

toivoh commented Dec 10, 2014

eschnett commented Dec 13, 2014

stevengj commented Jan 6, 2015

eschnett commented Jan 6, 2015

eschnett commented Jan 6, 2015

StefanKarpinski commented Jan 7, 2015

eschnett commented Jan 8, 2015

ararslan commented Oct 5, 2016

stevengj commented Jun 22, 2017

rfourquet commented Jun 23, 2017

StefanKarpinski commented Jun 25, 2017

stevengj commented Jun 26, 2017

StefanKarpinski commented Sep 19, 2017

omus commented Sep 20, 2017

StefanKarpinski commented Sep 21, 2017

StefanKarpinski commented Sep 21, 2017 • edited Loading

JeffBezanson commented Sep 21, 2017

StefanKarpinski commented Sep 21, 2017

StefanKarpinski commented Sep 21, 2017

StefanKarpinski commented Sep 21, 2017

StefanKarpinski commented Sep 21, 2017

JeffBezanson commented Sep 22, 2017

StefanKarpinski commented Sep 22, 2017

StefanKarpinski commented Oct 14, 2021 • edited Loading

StefanKarpinski commented Sep 21, 2017 •

edited

Loading

StefanKarpinski commented Oct 14, 2021 •

edited

Loading