Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compact systematic display of values #51

Merged
merged 7 commits into from
Jan 28, 2017
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented Sep 15, 2016

Note: this proposal has been updated, please skip ahead to #51 (comment)

This introduces value printing of the following form:

julia> UFixed{UInt16, 11}(0.4)
0.4001U₁₁¹⁶

julia> Fixed{Int32,5}(0.4)
0.41F₅³²

Moreover, for UFixed8, UFixed10, ..., the same sequence of characters is interpreted as a constant that allows values to be reconstructed:

julia> x = UFixed10(0.4)
0.3998U₁₀¹⁶

julia> x == 0.3998U₁₀¹⁶
true

This has pros and cons:

  • Printing is systematic and comprehensive; any FixedPoint value has a self-explanatory compact representation
  • The con is that we might need "generated constants," i.e., we lack a way to have these parse automatically for arbitrary choices of T and f. Or some way of modifying julia's parser.

@timholy
Copy link
Member Author

timholy commented Sep 15, 2016

Maybe instead of F we should use Q: https://en.wikipedia.org/wiki/Fixed-point_arithmetic#Notation

@timholy
Copy link
Member Author

timholy commented Sep 15, 2016

It occurs to me that if we generated all typealiases (and yes, now I think they can be typealiases rather than consts), it would "only" be approx 256 exports. That's a lot, but perhaps not so many so as to make it untenable.

@vchuravy
Copy link
Collaborator

Can they be type aliases and be in the suffix form? I was thinking of having a major that would generate the const in the users namespace, but I like the idea of just generating most of them ;) (What about Uint128 based ones ;))

@timholy timholy force-pushed the teh/compact_printing branch from 80e602a to be32eff Compare September 16, 2016 02:22
@timholy timholy changed the title RFC/WIP: compact systematic display of values RFC: compact systematic display of values Sep 16, 2016
@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

OK, an update here (I think this implementation is "complete"). The proposal is to print Fixed values like this:

julia> Fixed{Int16,7}(1.23)
1.227Q⁸₇

Here we're using the Q notation, where 8 is the number of magnitude bits and 7 the number of fractional bits. Note that 8+7=15, leaving one bit for the sign. I am not printing this as Q8.7, because we want the output representation to parse back to the actual value, and the . would prevent that from happening. So I introduced the super/subscripts to indicate the two numbers; I put the magnitude first and as a superscript, since it reflects the "bigger" (higher) part and also comes first.

Likewise, we'd print UFixed values like this:

julia> UFixed{UInt16,7}(1.23)
1.228U⁹₇

9 is the number of magnitude bits, and again 7 is the number of fractional bits; the UFixed numbers don't have a sign bit, so they'll always add to a twos-complement number. We can't use Q⁹₇ here because the normalization is different (2^f-1 rather than 2^f, see the README). The U is therefore something I just made up, but it seems reasonable.

In this implementation, these are typealiases for the actual type, and we export all possible typealiases up to 64 bits. I also (somewhat unconventionally) define * for a Real value and any of the Fixed/UFixed types, and use that as a constructor. This ensures that copy/paste of output value(s) reconstructs the same value(s).

Another consequence is that length(names(FixedPointNumbers)) = 267, which is a considerable number of exports.

On balance I like this proposal; the alternative is to print values as UFixed{UInt16,7}(1.228), which is quite a mouthful. One likely concern is how well the super/subscripts will render on different terminals.

@vchuravy
Copy link
Collaborator

In general I really like this, but the super and subscripts can be hard to read. I program in a font-size of 14, but even then it is not clearly visible. But +1 for more compact printing, but this will also require a prominent explanation in the readme.

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

Any good ideas for how to handle this without using super/subscripts? At the cost of an extra character, we could indicate the split with an additional unicode character. Perhaps U9⊚7 (http://www.fileformat.info/info/unicode/char/229a/index.htm) or U9⊘7 (http://www.fileformat.info/info/unicode/char/2298/index.htm). The choice seems pretty arbitrary to me, except of course for making sure we don't misuse something that already means something else.

@bjarthur
Copy link
Collaborator

you didn't entirely make up the U: "In addition, the letter U can be prefixed to the Q to indicate an unsigned value, such as UQ1.15, indicating values from 0.0 to +1.99997." (from the wiki article you linked).

@bjarthur
Copy link
Collaborator

what are the consequences of exporting so many names?

@bjarthur
Copy link
Collaborator

there is a convention to use B for byte (8 bits), W for word (16), D for double (32) and Q for quad (64). so one alternative to super/sub-scripts would be to say use B3 for UFixed{UInt8,3} and D7 for UFixed{UInt32,7}. for the signed counterparts one could add an S prefix: SQ11 is Fixed{Int64,11}. just thinking out loud.

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

you didn't entirely make up the U: "In addition, the letter U can be prefixed to the Q to indicate an unsigned value, such as UQ1.15, indicating values from 0.0 to +1.99997." (from the wiki article you linked).

True, but the normalization is still different. What they mean by UQ1.15 would be the equivalent of Fixed{UInt16,15} rather than UFixed{UInt16,15}. (We don't support the former yet, since Fixed requires Signed, but there's no reason we couldn't drop that requirement, except for the fear of adding confusion.) So UQ1.15 has a maximum value of

julia> 0xffff/0x8000
1.999969482421875

whereas

julia> typemax(UFixed{UInt16,15})
2.00003U¹₁₅

is equal to 0xffff/0x7fff.

In case this confuses you, the easy way to remember why we have this distinction is like this:

julia> 0xff/256  # This is typemax(UQ0.8)
0.99609375

and so convert(UQ0.8, 1.0) would throw an InexactError (it's the next "integer" up, so it doesn't round down to a valid number). For floating point images, "saturated" = 1.0, so for seamless interconversion between ("valid") floating-point and fixed-point we require that 1.0 lies within the domain of the fixed-point number type.

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

The B, W, D, Q suggestion is quite interesting. I was trying to pick something close to "standard" notation, but it's clear that standards don't cover all of our needs, so we are going to have to strike out on our own somewhat.

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

what are the consequences of exporting so many names?

I don't think it's terrible. It just seems like a lot for a small package. By comparison, length(names(Base)) = 1392 right now.

@bjarthur
Copy link
Collaborator

if you decide to go with BWDQ, i'd suggest instead to prepend with U for unsigned instead of S for signed, to mirror UInt and Int.

@vchuravy
Copy link
Collaborator

I would reserve UQ for the moment when we have Fixed{<:Unsigned}. I don't particularly care for W, D, Q since word means something different depending on the underlying architecture.
An alternative could be more to deviate from the default Q notation and use Q16₁₅. Where Q16 is Fixed{Int16} so this would be closer to how base works.

@timholy we could do a soft-rename in this case. Let's start using N for UFixed in anticipation for the rename to Norm/Normed (and I promise that I will get around to that soon.)

@bjarthur
Copy link
Collaborator

another alternative requiring little explanation would be UInt16F3==UFixed{UInt16,3}
and Int16F7 == Fixed{Int16,7}

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

@vchuravy, In that case, N definitely sounds like a good idea. I doubt there would ever be a "normalized signed" variant, because the existing Fixed{T<:Signed} types are already normalized in the negative direction (just not in the positive direction). So I don't think we need to have U in the name. (But chime in if you disagree.)

@bjarthur, I agree that being more explicit has some merits, but it's still pretty long compared to some of the choices we've been discussing. N9x7 (where x is to be determined) is only 4 characters, compared to 8 for UInt16F7.

One thing your example does convince me of, though, is that maybe it wouldn't be so terrible to use a roman character for x. I've been considering unicode mostly to avoid conflicting with common variable names, but really, how many programs use variables with names like what we're considering?

So perhaps N9f7 and Q8f7? The f is reminiscent of the 3.2f0 notation for constructing Float32s. One potential downside is that in f0, the 0 seemingly doesn't have a meaning (or at least, I'm not aware of one), whereas for us it would be very meaningful (and rarely, could indeed be 0). So there's some potential for confusion.

@vchuravy
Copy link
Collaborator

vchuravy commented Sep 16, 2016

No, I agree there won't be a signed variant of Normed. (Normalised???), so N should be sufficient.

I am having a hard time deciding between:

  • N16f8 == UFixed{UInt16, 8}
  • Q8f4 == Fixed{Int8, 4}
  • UQ16f10 == Fixed{UInt16,10}

and:

  • N8f8 == UFixed{UInt16, 8}
  • Q3f4 == Fixed{Int8, 4}
  • UQ6f10 == Fixed{UInt16, 10}

I find the first one easier to read, but the second one is more succinct (but requires more mental effort to parse)

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

I also initially went with the equivalent of Q8f4, and still find that somewhat more intuitive. But one fairly big problem is that's inconsistent with the conventional notation: https://en.wikipedia.org/wiki/Fixed-point_arithmetic#Notation. (I know they don't use f, but perhaps that's a small detail.)

Also, I think you have the opposite meaning of the number after the f that I've been using. I'm literally copying the value of f (the type parameter) in there. I think that's consistent with https://en.wikipedia.org/wiki/Fixed-point_arithmetic#Notation? "m = number of magnitude or integer bits, f = number of fraction bits" So I think their f is the same as our f. Their m = b - f - s.

@vchuravy
Copy link
Collaborator

ah I just made a mistake for UQ5f10. Yeah it is a conflict between the Julia convention of giving the full nbits of a type. and the fixed-point arithmetic notation.

@timholy
Copy link
Member Author

timholy commented Sep 16, 2016

I took the liberty of editing it to UQ6f10---there's no sign bit, so they need to add to 16. Assuming you concur, I think your second block describes my preferred names. What are your thoughts, @bjarthur?

@bjarthur
Copy link
Collaborator

i'm fine with the second block. it's short, and similar to two standards (the wiki link and floating point).

re. such a large namespace. perhaps we export just the even numbers of fractional bits?

@timholy
Copy link
Member Author

timholy commented Sep 17, 2016

OK, updated. For now I'm exporting all types, rather than just the even ones.

Two final questions:

  • Should UFixed8 and UFixed16 be special-cased as U8 and U16? (For printing as well as entering the type.) We define those typealiases in Color, and an awful lot of images are of type Gray{U8} and RGB{U8}. CC @mronian.
  • Should we delay merging this to master until @vchuravy changes the name to Normed? And possibly synchronize with the release of the new Images, since both of these will be disruptive changes to the FPN/Colors/Images ecosystem? Another argument to delay is the possibility that the final chosen name might undergo further evolution, e.g., [RFC/WIP] unify the concepts of Fixed and UFixed #32 (comment).

@timholy timholy force-pushed the teh/compact_printing branch from be32eff to 3e7e0c1 Compare September 17, 2016 07:07
print(io, typechar(X))
f = nbitsfrac(X)
m = sizeof(X)*8-f-signbits(X)
print(io, m, 'f', f)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be more performant to combine these two prints into one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be, I'll do that.

Fixed16,
UFixed8,
U8,
UFixed10,
UFixed12,
UFixed14,
UFixed16,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we deprecate UFixed16 and friends?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think so. Fixed16 too?

# literal constructor constants
U16,
# Q and U typealiases are exported in separate source files
# literal constructor constants
uf8,
uf10,
uf12,
uf14,
uf16,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deprecate uf16 and friends too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but those exist for a different reason---they are "repr constructors"---that's not covered by anything else:

julia> 0xc2uf8
0.761N0f8

julia> 0.761N0f8
0.761N0f8

julia> 0xc2N0f8
ERROR: InexactError()
 in *(::UInt8, ::Type{FixedPointNumbers.UFixed{UInt8,8}}) at /home/tim/.julia/v0.5/FixedPointNumbers/src/FixedPointNumbers.jl:63

Copy link
Collaborator

@bjarthur bjarthur Sep 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think the casual user of julia would find this intuitive. is there nothing to be done that would make that work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we really don't like these, perhaps we should get rid of them.

But 0xc2N0f8 or N0f8(0xc2) can never work as I think you're intending without causing absolute chaos. See JuliaGraphics/ColorTypes.jl#49 (comment). We need to firmly keep in mind the distinction between value and representation of that value, and generally one should just ignore the representation and think about the number in terms of its value. 0xc2 = 194 is far beyond the reach of a number representable with an N0f8, so if the constructor doesn't throw an error you've got a big problem.

I think the key is to (1) reliably throw errors when the user tries something that's not allowed (we didn't do that before #48 because of arithmetic overflow), and (2) provide better error messages that actually coach users through how to think about this/fix problems, along the lines of #49 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, maybe I'll deprecate 0xc2uf8 in favor of reinterpret(N0f8, 0xc2). Reasonable? Forcing people to use reinterpret explicitly might help clarify everyone's thinking.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh! i was getting confused between value and representation just as you describe.

deprecating uf8 in favor of reinterpret sounds great.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've confused myself so many times I've lost track. But I think we're moving towards a solution that will be easier to keep straight!

typealias UFixed8 UFixed{UInt8,8}
typealias UFixed10 UFixed{UInt16,10}
typealias UFixed12 UFixed{UInt16,12}
typealias UFixed14 UFixed{UInt16,14}
typealias UFixed16 UFixed{UInt16,16}
typealias U16 UFixed{UInt16,16}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UFixed{UInt16,16} is already aliased as N0f16. i think that's short enough, and having another nomenclature would just add confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I largely agree with this. But should we apply the same logic to U8? In the Colors/Images world, we use RGB{U8} all the time. That would become RGB{N0f8}, which is not bad, but a little more to remember. (Probably I'm just not used to it, though.)

I also think it's valid to decide to keep U8 but ditch U16. Basically, I am persuadable to any outcome on this point 😄.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i'd suggest ditch U8 too. it's bad enough that signed fixed point ints are scaled differently than unsigned ones. we don't need multiple ways to specify the latter to make it even more confusing. N0f8 is already a huge improvement over the status quo.

Copy link

@mronian mronian Sep 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The U8 notation is pretty intuitive for someone coming from an image processing background since most packages use the general datatypes defined in their languages as the base Image type unlike Images.jl where we have FixedPointNumbers to abstract that out and give us one less thing to worry about 😄

If we are going to remove U8 and friends, then we need to support it with good documentation in the Images.jl packages (explaining how FixedPointNumbers works, why it helps and the notation). This does seem like a big change though so if you could also have a big warning message when you deprecate U8 (something that explains how to use the Nxfy notation too) it will really help users. Apart from this, I like the new implementation and even if there is a bit of a learning curve to it, I guess its worth it in the long run 😄

scaledual

reinterpret(x::FixedPoint) = x.i

# construction using the (approximate) intended value, i.e., 0.8U⁰₈
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't forget to change this comment to match whatever format we finally decide on

@timholy timholy force-pushed the teh/compact_printing branch 2 times, most recently from 9a908e7 to 1aa2ba9 Compare October 11, 2016 16:10
@timholy
Copy link
Member Author

timholy commented Oct 11, 2016

OK, I've rebased this and addressed review comments. I've deprecated everything in sight, but there's a chance we'll want to rethink some of these (particularly the U8 deprecation and the 0xa0uf8 deprecations). I'll try to collect feedback from Images users.

Also note: I added a shell script in the (new) contrib folder to assist with fixing deprecations.

@codecov-io
Copy link

codecov-io commented Oct 11, 2016

Current coverage is 82.58% (diff: 47.82%)

Merging #51 into master will decrease coverage by 5.83%

@@             master        #51   diff @@
==========================================
  Files             3          4     +1   
  Lines           164        178    +14   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            145        147     +2   
- Misses           19         31    +12   
  Partials          0          0          

Powered by Codecov. Last update ccb4267...223772d

@bjarthur
Copy link
Collaborator

bjarthur commented Nov 6, 2016

i believe ColorTypes.jl will have to be changed once this PR is merged to typealias U8 to N0f8

@timholy
Copy link
Member Author

timholy commented Nov 6, 2016

If you use imagebranch and switch to fixed-renaming, it's already done.

@timholy timholy force-pushed the teh/compact_printing branch from 223772d to fdf5506 Compare January 25, 2017 19:57
@timholy
Copy link
Member Author

timholy commented Jan 26, 2017

Heads-up @vchuravy, this will happen over the weekend. (ref #51 (comment))

This rename reflects the nature of `UFixed` it is not just a unsigned
fixpoint number type (for that you can use `Fixed{T<:Unsigned}`), but
rather a number type that is normalised to a specific `one`.
@timholy timholy changed the title RFC: compact systematic display of values Compact systematic display of values Jan 28, 2017
@timholy timholy merged commit 398e5dc into master Jan 28, 2017
@timholy timholy deleted the teh/compact_printing branch January 28, 2017 11:09
@timholy
Copy link
Member Author

timholy commented Jan 28, 2017

Bam. Let the chaos begin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants