Add max string length option #104

jankatins · 2016-06-20T01:27:53Z

If set to a number, strings longer than this number will be shortened with `...`.

E.g. `options(tibble.print_string_max = 10)` will only print up to 10 characters
of each string.
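
To illustrate the behaviour described above, here is a minimal sketch. It is not the code from this PR: `shorten_strings()` is a hypothetical helper, and only the option name `tibble.print_string_max` comes from the description. It assumes the limit counts characters kept before the `...` marker is appended.

```r
# Hypothetical helper (not the PR's implementation): keep at most `max_chars`
# characters of each string and mark shortened values with "...".
shorten_strings <- function(x, max_chars = getOption("tibble.print_string_max", Inf)) {
  # Leave non-character input alone, and do nothing when no limit is set.
  if (!is.character(x) || !is.finite(max_chars)) {
    return(x)
  }
  too_long <- !is.na(x) & nchar(x, type = "chars") > max_chars
  if (any(too_long)) {
    x[too_long] <- paste0(substr(x[too_long], 1, max_chars), "...")
  }
  x
}

shorten_strings(c("short", "uncomfortably long string"), max_chars = 10)
# [1] "short"         "uncomforta..."
```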

codecov-io · 2016-06-20T01:41:57Z

Current coverage is 99.83%

Merging #104 into master will increase coverage by <.01%

@@             master       #104   diff @@
==========================================
  Files            14         14          
  Lines           590        595     +5   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            589        594     +5   
  Misses            1          1          
  Partials          0          0

Powered by Codecov. Last updated by 7c00abf...bf131a6

jankatins · 2016-06-20T01:44:57Z

Unfortunately, <data.frame [4 x 2]> is already 20 chars long... So the string formatting should probably be done before the obj_sum function is applied?

krlmlr · 2016-06-20T06:58:53Z

I'm not generally opposed to limiting string length, but I think #89 was about adaptive shortening that is applied as needed if the output is too wide.

jankatins · 2016-06-20T14:54:19Z

@krlmlr so the algo would be: if printed_width > available_width, set string columns to max(nchar(names), min_string_length) and apply the "shortener" (= don't redistribute the leftover space). This could potentially lead to a bit of leftover space, but that's already happening now...

I will do the changes, if that's what you want :-)

krlmlr · 2016-06-20T16:11:13Z

It would help to see an example, see the expect_output() calls in the tests.

jankatins · 2016-06-20T21:56:26Z

This is an example with the current implementation and options(tibble.print_string_max = 20)...

Source: local data frame [216 x 23]
Groups: tags [7]

# A tibble: 216 x 23
   longitude.x latitude.x                    date  company eyeColor   age                  guid.x index product
         <chr>      <chr>                   <chr>    <chr>    <chr> <int>                   <chr> <int>   <int>
1    27.339158 -85.314797 2000-03-17T23:45:01....  FLYBOYZ    brown    20 d3982b08-2a5e-4abc-a...     0       2
2    27.339158 -85.314797 2000-03-17T23:45:01....  FLYBOYZ    brown    20 d3982b08-2a5e-4abc-a...     0       2
3    27.339158 -85.314797 2000-03-17T23:45:01....  FLYBOYZ    brown    20 d3982b08-2a5e-4abc-a...     0       2
4   -72.514214  46.225443 2002-03-11T04:09:28....  BIOSPAN    brown    38 9afd44ba-5b0c-4302-8...     1       2
5   -72.514214  46.225443 2002-03-11T04:09:28....  BIOSPAN    brown    38 9afd44ba-5b0c-4302-8...     1       2
6   -72.514214  46.225443 2002-03-11T04:09:28....  BIOSPAN    brown    38 9afd44ba-5b0c-4302-8...     1       2
7   -31.964069  38.553859 1980-02-19T05:23:31....     ZOID    green    23 f1f4aa36-15e8-4208-a...     2       0
8   -31.964069  38.553859 1980-02-19T05:23:31....     ZOID    green    23 f1f4aa36-15e8-4208-a...     2       0
9   -31.964069  38.553859 1980-02-19T05:23:31....     ZOID    green    23 f1f4aa36-15e8-4208-a...     2       0
10  150.363342 -85.232234 1996-05-18T20:27:20.... PHOTOBIN     blue    37 b23bbf56-f6ae-4bb7-b...     3       1

krlmlr · 2016-06-22T00:13:08Z

Nice. Could you enhance the output tests so that the "known output" becomes part of this PR? Just add a test to test-trunc-mat.r; the test file will be generated if it's missing.

See #100 for problems with "wide" characters in certain scripts. For instance, I'm getting:

> sprintf("%.*s...", 4, "合同录入日期")
[1] "合\xe5..."

We need a better method to shorten a string to a given "visible" width.
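
The broken output above shows a multi-byte character being cut in the middle. As a small illustration (a sketch, not part of the PR), the three ways `nchar()` can count the same string differ, and `substr()` indexes characters rather than bytes. The string is written with `\u` escapes so that it is stored as UTF-8; it is the six-character example from above.

```r
# The six-character example string, written with escapes so it is UTF-8.
x <- "\u5408\u540c\u5f55\u5165\u65e5\u671f"

n_bytes <- nchar(x, type = "bytes")  # 18: three bytes per character in UTF-8
n_chars <- nchar(x, type = "chars")  # 6
n_width <- nchar(x, type = "width")  # display columns; depends on the platform

# substr() counts characters, so it never splits a multi-byte character.
first_four <- substr(x, 1, 4)
nchar(first_four, type = "chars")    # 4
```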

@hadley: Should string columns be limited to width 20 by default?

hadley · 2016-06-22T23:20:13Z

Shortening by default feels a bit too aggressive to me - I'd prefer to do it only if there's not enough space. Ideally a one-column df with a long string would be truncated to screen width.

krlmlr · 2016-07-29T14:01:56Z

@JanSchulz: Would you like to contribute more to this PR? See #104 (comment).

jankatins · 2016-07-29T15:14:50Z

@krlmlr I should have time to work on this from Wednesday next week... Sorry if that's too late :-/

Regarding the unicode stuff: is nchar(x, type = "width") the solution? #100 mentions that this fails on Windows?

So, to recap the requirements:

- don't shorten if everything fits the space
- one column should be shortened to exactly the screen width
- min width of a column is max(nchar(names), min_string_length)
- min_string_length is a setting

Algo:

1. if columns do not fit, set str columns to min width
2. calculate space and the maximum number of "fit in" (= visible) columns
3. calculate "leftover" space and redistribute to the visible string columns
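
The recap above can be sketched as a small width allocator. This is only an illustration under stated assumptions, not code from the PR: `allocate_widths()` and all of its arguments are hypothetical, columns are assumed to be separated by `sep` spaces, `natural` is the width each column would need unshortened (header included), and leftover space is handed out greedily from left to right because the recap does not say how to split it.

```r
# Hypothetical sketch of the recap above; returns the widths of the visible columns.
allocate_widths <- function(natural, name_width, is_string, available,
                            min_string_length = 10, sep = 1) {
  total <- function(w) sum(w) + sep * (length(w) - 1)

  # Requirement: don't shorten if everything fits.
  if (total(natural) <= available) {
    return(natural)
  }

  # Step 1: set string columns to their minimum width.
  min_width <- pmax(name_width, min_string_length)
  width <- ifelse(is_string, pmin(natural, min_width), natural)

  # Step 2: count how many leading columns fit (always show at least one).
  used <- cumsum(width + sep) - sep
  n_fit <- max(sum(used <= available), 1)
  visible <- seq_len(n_fit)
  width <- width[visible]

  # Step 3: redistribute leftover space to visible string columns
  # (assumption: greedy, left to right, never beyond the natural width).
  leftover <- max(available - total(width), 0)
  for (i in visible[is_string[visible]]) {
    extra <- min(leftover, natural[i] - width[i])
    width[i] <- width[i] + extra
    leftover <- leftover - extra
  }
  width
}

allocate_widths(
  natural = c(10, 10, 40, 30), name_width = c(5, 5, 4, 7),
  is_string = c(FALSE, FALSE, TRUE, TRUE), available = 60
)
# [1] 10 10 27 10
```

With a single long string column, e.g. `allocate_widths(200, 4, TRUE, 80)`, the sketch returns 80, matching the "exactly the screen width" requirement.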

krlmlr · 2016-07-30T21:47:50Z

Thanks. No hurry.

nchar(type = "width") works in RStudio and RGui, at least for the examples shown in #100. It works the same in the R terminal on Windows, but the output is printed unhelpfully as "<U+....>", which renders useless the width calculation. At some point we may need to invent our own nchar() that takes this into account, so it's probably a good idea to encapsulate the width calculation.
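
One way to encapsulate the width calculation, as suggested above, is a single wrapper that every caller goes through. The sketch below is hypothetical (`str_width()` is not an existing tibble function) and assumes a missing value is printed as `NA`, i.e. two columns wide.

```r
# Hypothetical wrapper: the only place that decides how wide a string is,
# so the rule can later be swapped for a custom nchar() replacement.
str_width <- function(x) {
  x <- as.character(x)
  w <- nchar(x, type = "width")
  w[is.na(x)] <- 2L  # assumption: a missing value is printed as "NA"
  w
}

str_width(c("abc", NA, ""))
# [1] 3 2 0
```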

The "shortening to screen width" part looks interesting to me in the context of #100. If everything else fails, you could split the strings to code points and calculate the width for each; I'd generally exclude code points with zero width if the next codepoint doesn't fit:

> c("成交日期", "合同录入日期") %>% strsplit("") %>% lapply(nchar, type = "width")
[[1]]
[1] 2 2 2 2

[[2]]
[1] 2 2 2 2 2 2

Otherwise, the requirements look good to me.
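
The per-character approach above could be turned into a truncation helper roughly like this. It is a sketch only: `truncate_to_width()` is a hypothetical name, it handles one non-missing string at a time, and it relies on whatever widths `nchar(type = "width")` reports on the current platform. Zero-width characters directly after a kept character are kept; the special handling of zero-width code points at the cut point mentioned above is not implemented.

```r
# Hypothetical helper: longest prefix of a single string whose display width
# does not exceed `width`.
truncate_to_width <- function(x, width) {
  chars <- strsplit(x, "")[[1]]
  if (length(chars) == 0) {
    return(x)
  }
  w <- nchar(chars, type = "width")
  # Widths are non-negative, so the running total never decreases and the
  # characters that fit always form a prefix.
  n_keep <- sum(cumsum(w) <= width)
  paste(chars[seq_len(n_keep)], collapse = "")
}

truncate_to_width("abcdef", 4)
# [1] "abcd"

# With double-width characters, fewer characters fit into the same space.
truncate_to_width("\u5408\u540c\u5f55\u5165\u65e5\u671f", 5)
```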

krlmlr · 2016-07-30T21:50:28Z

Two more points:

- You could use a Unicode ellipsis "\u2026", it renders as a dot in R terminal on Windows and as a single-char wide ellipsis elsewhere.
- Are you going to look into shortening of column names, too?
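
To show what the Unicode ellipsis buys, here is a small sketch in which the marker counts against the column's character budget. `shorten_within()` is a hypothetical helper, it counts characters rather than display width, and it is only meant for strings longer than `n`.

```r
# Hypothetical helper: shorten `x` so that text plus marker is `n` characters.
shorten_within <- function(x, n, ellipsis = "\u2026") {
  keep <- n - nchar(ellipsis, type = "chars")
  paste0(substr(x, 1, keep), ellipsis)
}

# Three characters of the budget go to the ASCII marker:
shorten_within("uncomfortably long string", 10, ellipsis = "...")
# [1] "uncomfo..."

# The single-character ellipsis leaves nine characters for the text.
shorten_within("uncomfortably long string", 10)
```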

jankatins · 2016-08-01T15:39:35Z

Regarding Windows <U+....> printing: see r-lib/evaluate#66. This problem seems to happen quite deep in R. We had quite a lot of fun in IRkernel (or rather in repr) with that, because we use sink (or rather, evaluate does) to get the output of a computation: https://github.com/IRkernel/repr/blob/master/R/repr_matrix_df.r#L16-L27

krlmlr · 2016-08-19T18:18:24Z

Shelving this for now, I think strings with limited width will be easier with #144.

randomgambit · 2016-11-15T20:22:44Z

hello everyone, thanks for your great work!

Just wondering if there are any plans to implement this? I think it's pretty useful. For instance, in Pandas one could simply do:

In [43]: df = pd.DataFrame(np.array([['foo', 'bar', 'bim', 'uncomfortably long string'],
   ....:                             ['horse', 'cow', 'banana', 'apple']]))
   ....: 

In [44]: pd.set_option('max_colwidth',40)

In [45]: df
Out[45]: 
       0    1       2                          3
0    foo  bar     bim  uncomfortably long string
1  horse  cow  banana                      apple

In [46]: pd.set_option('max_colwidth', 6)

In [47]: df
Out[47]: 
       0    1      2      3
0    foo  bar    bim  un...
1  horse  cow  ba...  apple

In [48]: pd.reset_option('max_colwidth')

which is really helpful when one prints a tibble that contains both text and numeric values.

randomgambit · 2017-05-31T13:11:40Z

hello there! any updates on this? I tried options(tibble.print_string_max = 20) but it does not seem to work. Thanks!

krlmlr · 2017-05-31T13:31:52Z

Development of better column formatting has moved to https://github.com/hadley/colformat. Would you like to contribute there?

Comments to a closed issue are not very effective, because it is easy to ignore them.

randomgambit · 2017-05-31T13:44:21Z

thanks @krlmlr! I'll have a look at it

Add max string length option

bf131a6

If set to a number, strings longer than this number will be shortened with `...`. E.g. `options(tibble.print_string_max = 10)` will only print up to 10 characters of each string.

jankatins mentioned this pull request Jun 20, 2016

Greedy printing? #89

Closed

krlmlr added the ready label Jun 20, 2016

krlmlr added in progress and removed ready labels Jun 22, 2016

krlmlr mentioned this pull request Aug 19, 2016

Re-encode character columns and column names to UTF-8 #87

Closed

krlmlr closed this Aug 19, 2016

krlmlr removed the in progress label Aug 19, 2016

jankatins mentioned this pull request Aug 22, 2016

New vector classes designed for better printing? #144

Closed

github-actions bot locked as resolved and limited conversation to collaborators Dec 9, 2020
