Trailing insignificant digits not printed? #40

krlmlr · 2017-09-04T16:14:15Z

colformat::colformat(c(1000.34, 0.34567))
#>    <dbl>
#> 1000    
#>    0.346

@hadley: Is this intended?

hadley · 2017-09-05T13:55:03Z

Yes

krlmlr · 2017-09-05T13:59:54Z

Maybe we could print them if there's enough space?

hadley · 2017-09-05T14:12:37Z

It was a deliberate choice. Maybe it's worth rethinking, as it does seem a bit arbitrary to not display digits when space is available, and sigfigs are highlighted using colour so it's still scannable.

dpeterson71 · 2018-01-23T17:40:12Z

I believe the numbers should definitely be displayed in full if there's space, or at least otherwise notify the user that they have been modified. Wasn't one of the founding principles of plyr (and thus the genesis of the tidyverse in general) to not surprise the user (i.e. provide output consistent with input)? If I enter 1000.34 in my data entry, I certainly don't expect to see "1000".

hadley · 2018-01-23T17:47:46Z

@dpeterson71 what do you expect sqrt(2) ^ 2 to print?

dpeterson71 · 2018-01-23T18:01:49Z

In this case, sqrt(2)^2 should be just 2, as in base R. I would expect sqrt(2) to provide the precision I've requested via base R's digits option. That's the crux of the problem, though, isn't it? The computer doesn't know a priori whether I have entered specific digits (or read them from a manually generated file) or computed something that could potentially be an irrational number.

If the computer is going to modify or change the data I have given it, it should at least have the courtesy to notify me that it has done so rather than blindly dropping information.

hadley · 2018-01-23T22:22:10Z

My point is that no floating point number is exact - I don't think it's unreasonable for tibble to not print .34 when it's only a small part of the value.

(BTW I don't like the principle of avoiding surprise; because different things surprise different people based on what they know)
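For readers following along, the sqrt(2) ^ 2 question can be checked in base R. This is a minimal sketch, not part of the original comment; it assumes the usual IEEE 754 double arithmetic that R uses on common platforms, and the variable names are made up for illustration.

```r
# sqrt(2)^2 is mathematically 2, but the stored double is slightly off.
x <- sqrt(2)^2

print(x)                 # default printing rounds to 7 significant digits
sprintf("%.17g", x)      # asking for 17 significant digits reveals the difference

# Exact comparison fails; comparison with a tolerance succeeds.
exactly_two <- (x == 2)
nearly_two  <- isTRUE(all.equal(x, 2))
```

Any printing routine therefore has to pick some number of digits to show; the disagreement in this thread is about which digits to drop and how to signal it.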

huftis · 2018-01-24T08:28:17Z

I don’t know if my opinion is worth much, but FWIW, I too find the current behaviour very misleading. It makes it look like there are no non-zero decimals (up to the precision/width used). I’m OK with hiding trailing zeros (up to the precision used), but hiding trailing non-zeros is confusing.

The current behaviour is:

pillar::pillar(c(1000.34, 1000, 0.34567))
#>    <dbl>
#> 1000    
#> 1000    
#>    0.346

I would be happy with this being rendered as either

#>    <dbl>
#> 1000.34    
#> 1000    
#>    0.346

or

#>    <dbl>
#> 1000.340    
#> 1000.000    
#>    0.346

But perhaps dropping the decimals could be restricted to integers (defined as numbers x where x == round(x)), e.g.:

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.340    
#> 1000.000    
#> 1000    
#>    0.346

or (preferably?)

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.34    
#> 1000.
#> 1000    
#>    0.346

That is, omitting the . indicates real integers. Or, in other words, having a decimal point is the formatting function telling the user ‘there is something after the decimal point – even though I might not display it (due to lack of space/precision)’.
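The marker rule proposed here can be sketched in a few lines of base R. This is only an illustration of the idea, not pillar's implementation: `mark_fraction()` is a hypothetical helper, it shows just the integer part, and it ignores significant digits, alignment, and negative numbers.

```r
# Hypothetical helper: print the integer part, and append "." only when
# the value has a fractional part that is not being displayed.
mark_fraction <- function(x) {
  whole <- formatC(trunc(x), format = "f", digits = 0)  # integer part as text
  ifelse(x %% 1 != 0, paste0(whole, "."), whole)
}

res <- mark_fraction(c(1000.000078, 1000, 12.5))
print(res)
```

With this rule, 1000 stored as a double prints without a dot, while 1000.000078 gets one, so the dot carries information about the value rather than about the column type.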

hadley · 2018-01-24T14:29:07Z

I like the idea of using a trailing . to indicate that there's more there

dpeterson71 · 2018-01-24T16:25:47Z

@huftis has two very good suggestions, in my opinion. My personal preference would be the first example, where the entries for doubles are justified:

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.340    
#> 1000.000    
#> 1000    
#>    0.346

The last option at least solves part of the issue where the careful observer might notice that the data has been modified by the subtle visual cue of a decimal point with missing digits. However, even though I could eventually learn to deal with that format, it is still harder to read and interpret with the uneven formatting and ragged edges. Our research group would never be allowed to present data that way in a public forum where readability and policy decisions matter.

dpeterson71 · 2018-01-24T16:49:38Z

One last thought. Cleveland's seminal work on visualizing data led to many improvements in graphing parameters and paradigms. The excellent lattice and ggplot2 packages make use of many of his concepts. Similarly, Brewer's extensive work in cartography and color theory guides optimal use of color in visualizations. I wonder if there exists some cognitive research on effective presentation of tabular data? If not, perhaps there's something for data that's analogous to the Chicago or Oxford Manuals of Style that could guide default format choices?

randomgambit · 2018-02-25T02:42:53Z

in my opinion this is extremely dangerous. I mean, I could honestly lose my job if I think I have 100 in my dataframe whereas I have 100.2

Formatting and color are fun, but this is way beyond that.

krlmlr · 2018-03-01T12:34:25Z

How do you like the current output with the decimal dot always printed?

randomgambit · 2018-03-01T15:46:42Z

This is what I get when I don't specify any option:

> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
   <dbl>
1000.   
1000.   
1000.   
   0.346

and now if I run

> options(pillar.sigfig=10)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
         <dbl>
1.000340000e+3
1.000000078e+3
1.000000000e+3
3.456700000e-1

Damn... I just want to see my full number 1000.000078.
Let's try again

> options(pillar.sigfig=5)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
     <dbl>
1000.3    
1000.0    
1000.0    
   0.34567

which is still rounding my numbers :(

How can I disable this rounding + forced scientific formatting altogether? Again, rounding numbers like this is misleading and dangerous (if enabled by default). Perhaps some users may like that, but I am pretty sure most people won't.

Please let me know
Thanks!

randomgambit · 2018-03-01T16:19:33Z

actually setting pillar.sigfig = 7 seems to be a good compromise here. 👍

krlmlr · 2018-03-01T18:07:37Z

I'm glad that pillar.sigfig = 7 works for you:

data.frame(x = 1000.000078)
#>      x
#> 1 1000
sprintf("%.23f", 1000.000078)
#> [1] "1000.00007800000003044260666"

Created on 2018-03-01 by the reprex package (v0.2.0).
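The data.frame output in the reprex above follows from base R's default of 7 significant digits. As a small sketch (the variable names are illustrative, and it assumes the `digits` option has not been changed from its default), `format()` shows the same effect directly:

```r
x <- 1000.000078

# 7 significant digits, the default used when printing: the fraction vanishes.
short <- format(x, digits = 7)

# 10 significant digits are enough to show every digit that was typed.
long <- format(x, digits = 10)

print(c(short, long))
```

So base R also hides these decimals by default; it just does so with a 7-digit budget rather than pillar's default of 3.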

randomgambit · 2018-03-01T18:16:38Z

interesting. I think it would be worthwhile to educate the user about floating-point approximations here. Like you could share a link to http://floating-point-gui.de/basic/ on the main tibble page as a reminder/warning.

charliejhadley · 2018-03-02T12:43:10Z

@randomgambit I think it's wholly unfair to have folks need an understanding of floating point approximations in the beautification of tibble output. There's only one mention of floating points in the entirety of http://r4ds.had.co.nz/ and that's as woolly as possible.

randomgambit · 2018-03-02T13:51:28Z

@martinjhnhadley come on, seriously? Anybody can understand that. The point would be to say - look - you can control the sigfig parameter and do all sorts of funny stuff with color/shading. However, keep in mind that there is a physical limit on how accurate a number can be in the computer's memory. The reprex from @krlmlr is a nice example/reminder.

huftis · 2018-03-19T11:38:47Z

I was the one who proposed the ‘trailing decimal point’ feature, but FWIW, I’m not happy with the way it has been implemented. The idea was to use the dot to indicate that ‘there is more here, but we’re not displaying all of it (because of lack of space)’. But the way it’s implemented is to add a trailing decimal dot for all double numbers, regardless of whether they are integers (i.e. x %% 1 == 0, or x == round(x)).

So now only values of type integer are shown without a dot. I don’t think that’s useful, and it clutters the display of tibbles. To see whether a column is integer or double, it’s enough to look at the column header, so the extra dot doesn’t add any information. And, at least in my experience, it’s very common that integer values are stored in numeric (double) columns.

I still think the original idea made sense. It’s useful to see if a number is an integer (not necessarily of type integer) or if it has been truncated for display purposes. Having a trailing . shown only for truncated numbers (x %% 1 != 0) would give this information, and would make it easy to spot hard-to-find floating-point issues, e.g. code that assumes that (.1 + .2) * 10 produces the number 3. It doesn’t (it produces a number slightly larger than 3), but R hides this from you by default.
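The (.1 + .2) * 10 case is easy to reproduce in base R. The following is a minimal sketch with illustrative variable names, assuming standard IEEE 754 doubles; it shows that the x %% 1 != 0 test picks up the tiny excess even though default printing does not.

```r
y <- (0.1 + 0.2) * 10

print(y)                # shown as 3 at the default 7 significant digits
sprintf("%.17g", y)     # more digits expose the small excess over 3

is_three     <- (y == 3)       # exact comparison with 3
has_fraction <- (y %% 1 != 0)  # the test proposed for showing the dot
```

Under the proposed rule this value would be printed with a trailing dot, which is exactly the hint that the stored number is not the integer 3.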

randomgambit · 2018-03-20T00:48:16Z

I really like the idea of the dot meaning there is more - but we don't see it. However, in practice, I will likely set enough significant digits so that I would always see a few digits in the decimal space. So that option would not impact me as much as the other ones.

krlmlr · 2018-04-09T00:48:25Z

Closing in favor of #105. The dot will be shown only if x %% 1 != 0.

github-actions · 2020-12-08T00:53:13Z

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

BruceZhaoR mentioned this issue Dec 27, 2017

Suggest tibble has an option of the sigfig tidyverse/tibble#344

Closed

This was referenced Jan 12, 2018

Indicate whether a value was truncated for displaying or not tidyverse/tibble#362

Closed

Display issue #93

Closed

charliejhadley mentioned this issue Jan 31, 2018

Feature Request: pillar.round = FALSE #97

Closed

This was referenced Feb 7, 2018

Don't print decimals if no value in vector has decimals. #62

Closed

summarize() give wrong results on particular vectors tidyverse/dplyr#3336

Closed

randomgambit mentioned this issue Feb 26, 2018

Thousand separator with color? #78

Closed

huftis mentioned this issue Mar 19, 2018

Is the decimal point for double values with no numbers after it necessary? #105

Closed

krlmlr closed this as completed Apr 9, 2018

mrchypark mentioned this issue Aug 21, 2018

글감 mrchypark/mrchypark.github.io#15

Closed

github-actions bot locked and limited conversation to collaborators Dec 8, 2020
