Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

print.tbl_df() fixup #51

Merged
merged 37 commits into from
Jun 13, 2016
Merged

print.tbl_df() fixup #51

merged 37 commits into from
Jun 13, 2016

Conversation

krlmlr
Copy link
Member

@krlmlr krlmlr commented Mar 18, 2016

Now:

> memdb_frame(a=1)
Source:   query [?? x 1]
Database: sqlite 3.11.1 [:memory:]

      a
  <dbl>
1     1
> iris %>% tbl_df
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
.. (140 more rows, 150 total)

Part of #48.

Fixes #19. Fixes #21.

Also supports n_extra = 0 now.

@codecov-io
Copy link

codecov-io commented Mar 18, 2016

Current coverage is 100%

Merging #51 into master will not change coverage

@@           master   #51   diff @@
===================================
  Files          13    13          
  Lines         517   540    +23   
  Methods         0     0          
  Messages        0     0          
  Branches        0     0          
===================================
+ Hits          517   540    +23   
  Misses          0     0          
  Partials        0     0          

Powered by Codecov. Last updated by 64175a8...24f6a76

@krlmlr krlmlr force-pushed the feature/19-remove-ellipsis branch from 306d2fe to a2dd6b9 Compare March 19, 2016 20:33
@krlmlr krlmlr changed the title trunc_mat() omits dots if length known print.tbl_df() fixup Mar 19, 2016
@krlmlr
Copy link
Member Author

krlmlr commented Mar 19, 2016

I think this is the easiest solution: Don't print the "Source: " line for data frames at all. If necessary, dplyr can override print.tbl_df(). Also solves #21.

dim_desc() is now unused here and gone, will stay in dplyr.

@hadley: How do you like the new output?

@krlmlr
Copy link
Member Author

krlmlr commented Mar 21, 2016

@hadley: Do we decide this for the current CRAN release? Un-exporting dim_desc() for now anyway (#55).

@hadley
Copy link
Member

hadley commented Mar 21, 2016

I'd rather wait on this

@krlmlr
Copy link
Member Author

krlmlr commented Mar 21, 2016

Fine. Agree to release tibble? After that, I'll work on the dplyr PR to restore compatibility.

@hadley
Copy link
Member

hadley commented Mar 21, 2016

Yeah, I think we're good for release.

@krlmlr
Copy link
Member Author

krlmlr commented Mar 30, 2016

We should eventually agree on the output here, but there's time.

@hadley
Copy link
Member

hadley commented Apr 6, 2016

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
... and 140 more rows (150 total)
... and 20 more variables (a <int>, b <dbl>, c <chr>, ...)

@hadley
Copy link
Member

hadley commented Apr 6, 2016

Or maybe

... with 150 rows total

@krlmlr
Copy link
Member Author

krlmlr commented Apr 9, 2016

@hadley: I wonder if it's worthwhile to limit the output about extra columns to one line. n_extra would go then, making #68 obsolete. CC @lionel-.

@krlmlr
Copy link
Member Author

krlmlr commented Apr 9, 2016

... or limit the effective number of lines permitted for the extra columns?

@lionel-
Copy link
Member

lionel- commented Apr 9, 2016

(reposting in the right PR)

Indeed I don't think I ever used the information displayed in "Variables not shown". This could probably be just "And 150 more <...>". When I need to check more columns that can be displayed I use names().

@krlmlr
Copy link
Member Author

krlmlr commented Apr 9, 2016

If we print that there are more column, one row is "spent", and can as well be filled with information that is useful sometimes, e.g., if only few columns are missing. But I agree we shouldn't bother spending more than one row on that.

@hadley
Copy link
Member

hadley commented Apr 9, 2016

I think it's important to list the extra variables and one line is too little

@lionel-
Copy link
Member

lionel- commented Apr 9, 2016

How about a parameter giving the number of lines the extra cols are allowed to take? This would make the output more predictible.

@krlmlr
Copy link
Member Author

krlmlr commented Apr 9, 2016

What would be the default number of lines? How about 3?

@lionel-
Copy link
Member

lionel- commented Apr 9, 2016

3 or 5 would be nice I think.

@krlmlr krlmlr force-pushed the feature/19-remove-ellipsis branch 2 times, most recently from 9080359 to 7ede87f Compare May 7, 2016 08:41
@krlmlr krlmlr force-pushed the feature/19-remove-ellipsis branch from 20f971b to 9ca96d0 Compare May 7, 2016 09:17
#> 9 2013 1 1 557 600 -3 838
#> 10 2013 1 1 558 600 -2 753
#> ... with 336,766 more rows
#> ... and 12 more variables (sched_arr_time <int>, arr_delay <dbl>, carrier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just and 12 more variables: (i.e. drop the parens)

#> ... and 12 more variables (sched_arr_time <int>, arr_delay <dbl>, carrier
#> <chr>, flight <int>, tailnum <chr>, origin <chr>, dest <chr>, air_time
#> <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <time>)
#> ... with 336,766 more rows, and 12 more variables: sched_arr_time <int>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to update README before, this is the only change: Now extra rows and extra columns are shown in the same line. Not sure what's better.

but substitute by regular space afterwards

@hadley: I hope this is "portable enough" for our purposes -- the CRAN tests will show.
@krlmlr krlmlr force-pushed the feature/19-remove-ellipsis branch from 4b6e508 to 46bd4c8 Compare May 7, 2016 16:51
@hadley
Copy link
Member

hadley commented May 16, 2016

Thinking about this some more - I do really like having an initial one line summary that explains what the object is before going into the details. What about echoing the column summary and doing:

as_data_frame(iris)
#> <tibble [150 x 5]>
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>        <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
#> ...

I think the key is to display the summary in a sufficiently visually distinct way that it doesn't need an empty line between it and the data. What do you think?

@krlmlr
Copy link
Member Author

krlmlr commented May 17, 2016

I think this could work, especially if we have colored output.

@krlmlr
Copy link
Member Author

krlmlr commented May 17, 2016

@hadley: PTAL. Now using obj_sum() to create and print coarse summary.

  • Show summary also in knit_print()?
  • Should obj_sum() for grouped data frames and SQL sources include grouping variables? What happens if the output is too wide?
  • Three-digit marks gone -- use in obj_sum()?

@hadley: Why does S3 lookup not work for helper functions in R CMD check (only in devtools::test() )? Would it help if the helpers were executed in an environment which is also on the search path?
@krlmlr krlmlr mentioned this pull request Jun 7, 2016
@krlmlr krlmlr merged commit dbd103d into master Jun 13, 2016
@krlmlr krlmlr deleted the feature/19-remove-ellipsis branch June 13, 2016 12:27
krlmlr pushed a commit that referenced this pull request Jun 13, 2016
- Reworked output: More concise summary, removed empty line, showing number of hidden rows and columns (#51).
- Link to the package documentation from the `tibble` help page (#82).
- Don't rely on `knitr` internals for testing (#78).
krlmlr pushed a commit that referenced this pull request Jul 4, 2016
Follow-up release.

- `tibble()` is no longer an alias for `frame_data()` (#82).
- Remove `tbl_df()` (#57).
- `$` returns `NULL` if column not found, without partial matching. A warning is given (#109).
- `[[` returns `NULL` if column not found (#109).

- Reworked output: More concise summary (begins with hash `#` and contains more text (#95)), removed empty line, showing number of hidden rows and columns (#51). The trailing metadata also begins with hash `#` (#101). Presence of row names is indicated by a star in printed output (#72).
- Format `NA` values in character columns as `<NA>`, like `print.data.frame()` does (#69).
- The number of printed extra cols is now an option (#68, @lionel-).
- Computation of column width properly handles wide (e.g., Chinese) characters, tests still fail on Windows (#100).
- `glimpse()` shows nesting structure for lists and uses angle brackets for type (#98).
- Tibbles with `POSIXlt` columns can be printed now, the text `<POSIXlt>` is shown as placeholder to encourage usage of `POSIXct` (#86).
- `type_sum()` shows only topmost class for S3 objects.

- Strict checking of integer and logical column indexes. For integers, passing a non-integer index or an out-of-bounds index raises an error. For logicals, only vectors of length 1 or `ncol` are supported. Passing a matrix or an array now raises an error in any case (#83).
- Warn if setting non-`NULL` row names (#75).
- Consistently surround variable names with single quotes in error messages.
- Use "Unknown column 'x'" as error message if column not found, like base R (#94).
- `stop()` and `warning()` are now always called with `call. = FALSE`.

- The `.Dim` attribute is silently stripped from columns that are 1d matrices (#84).
- Converting a tibble without row names to a regular data frame does not add explicit row names.
- `as_tibble.data.frame()` preserves attributes, and uses `as_tibble.list()` to calling overriden methods which may lead to endless recursion.

- New `has_name() (#102).
- Prefer `tibble()` and `as_tibble()` over `data_frame()` and `as_data_frame()` in code and documentation (#82).
- New `is.tibble()` and `is_tibble()` (#79).
- New `enframe()` that converts vectors to two-column tibbles (#31, #74).
- `obj_sum()` and `type_sum()` show `"tibble"` instead of `"tbl_df"` for tibbles (#82).
- `as_tibble.data.frame()` gains `validate` argument (as in `as_tibble.list()`), if `TRUE` the input is validated.
- Implement `as_tibble.default()` (#71, tidyverse/dplyr#1752).
- `has_rownames()` supports arguments that are not data frames.

- Two-dimensional indexing with `[[` works (#58, #63).
- Subsetting with empty index (e.g., `x[]`) also removes row names.

- Document behavior of `as_tibble.tbl_df()` for subclasses (#60).
- Document and test that subsetting removes row names.

- Don't rely on `knitr` internals for testing (#78).
- Fix compatibility with `knitr` 1.13 (#76).
- Enhance `knit_print()` tests.
- Provide default implementation for `tbl_sum.tbl_sql()` and `tbl_sum.tbl_grouped_df()` to allow `dplyr` release before a `tibble` release.
- Explicit tests for `format_v()` (#98).
- Test output for `NULL` value of `tbl_sum()`.
- Test subsetting in all variants (#62).
- Add missing test from dplyr.
- Use new `expect_output_file()` from `testthat`.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
4 participants