Provide high performance limited functionality equivalent of tibble #350

hadley · 2018-01-04T20:50:13Z

i.e. with no evaluation or auto-naming - maybe something like tibbly <- function(...) as_tibble(list(...)) or even new_tibble(list(...)).

Or maybe we should encourage developers to use this idiom some other way?
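
A minimal sketch of the helper floated above, assuming the caller already guarantees named, equal-length, plain-vector columns. The name tibbly comes from the suggestion itself and is not a function in tibble; new_tibble() is tibble's low-level constructor, and taking the row count from the first column is an assumption made here for illustration.

```r
library(tibble)

# Hypothetical helper, named after the suggestion above; not part of tibble.
tibbly <- function(...) {
  cols <- list(...)
  # Assumes every column is a plain vector of the same length.
  # Nothing is evaluated lazily, auto-named, recycled, or checked.
  n <- if (length(cols) == 0L) 0L else length(cols[[1L]])
  new_tibble(cols, nrow = n)
}

tb <- tibbly(x = 1:3, y = c("a", "b", "c"))
```

Because nothing is validated, passing columns of different lengths would produce a malformed object rather than an error, which is the trade-off being discussed.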


krlmlr · 2018-01-04T21:40:44Z

What is the use case, and what performance hit are we seeing with the current implementation? We can move parts to C for better performance. On that note: should we move lst() to rlang?

hadley · 2018-01-04T21:57:03Z

Use case is that you have a known "safe" data structure, and don't want to waste any time checking (e.g. #353)

krlmlr · 2018-01-15T10:28:59Z

Yeah, there is a difference of an order of magnitude, but we're talking about microseconds vs. a millisecond. I wonder if this is really worth the risk and the effort:

library(tibble)
library(rlang)
df <- unclass(nycflights13::flights)

microbenchmark::microbenchmark(
  tibble(!!! df),
  invoke(tibble, df),
  tibble:::new_tibble(df),
  tibble:::new_tibble(df, nrow = 336776L)
)
#> Unit: microseconds
#>                                     expr      min        lq       mean
#>                        tibble(!(!(!df))) 1304.236 1348.9510 1809.99604
#>                       invoke(tibble, df) 1834.505 1950.2320 2208.62348
#>                  tibble:::new_tibble(df)   51.673   58.9760   73.82381
#>  tibble:::new_tibble(df, nrow = 336776L)   51.407   56.8365   64.79954
#>     median       uq       max neval cld
#>  1404.4650 1546.322 25915.788   100   b
#>  2042.5190 2142.852  5741.313   100   b
#>    63.5315   68.260   876.542   100  a 
#>    61.9530   65.871   127.549   100  a

Created on 2018-01-15 by the reprex package (v0.1.1.9000).

krlmlr · 2018-01-15T10:43:54Z

as_tibble(validate = FALSE) seems to be just fast enough (about a 2x slowdown relative to new_tibble() with a wide data frame, mostly due to the length check, which I'd rather keep). I'll add a reminder to update the documentation to mention this:

library(tibble)
library(rlang)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
df <-
  nycflights13::flights %>% 
  count(origin, dest, month) %>%
  unite("relation", origin, dest, sep = "->") %>%
  spread(relation, n) %>% 
  unclass()

length(df)
#> [1] 225

microbenchmark::microbenchmark(
  tibble(!!! df),
  invoke(tibble, df),
  as_tibble(df),
  as_tibble(df, validate = FALSE),
  tibble:::new_tibble(df),
  tibble:::new_tibble(df, nrow = 12L)
)
#> Unit: microseconds
#>                                 expr       min         lq       mean
#>                    tibble(!(!(!df))) 30055.013 32404.4080 34550.9134
#>                   invoke(tibble, df) 34103.947 37124.7775 39586.9928
#>                        as_tibble(df)  1981.951  2141.3525  2528.6651
#>      as_tibble(df, validate = FALSE)   305.573   342.8565   409.1099
#>              tibble:::new_tibble(df)   172.964   195.4785   216.4160
#>  tibble:::new_tibble(df, nrow = 12L)   171.111   189.9820   220.2521
#>      median         uq       max neval  cld
#>  34036.5380 35426.0850 84756.116   100   c 
#>  39005.9890 40569.4315 69548.031   100    d
#>   2240.3870  2455.4070  5812.929   100  b  
#>    369.4780   397.9385  2820.020   100 a   
#>    207.7320   224.9585   440.641   100 a   
#>    204.1255   221.5070   429.062   100 a

Created on 2018-01-15 by the reprex package (v0.1.1.9000).
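
For readers on newer tibble releases, a hedged sketch of the same idea: as far as I know, the validate argument was later deprecated (from tibble 2.0.0) in favour of .name_repair, with "minimal" playing the role of validate = FALSE. The data below is made up for illustration.

```r
library(tibble)

cols <- list(a = 1:3, b = c("x", "y", "z"))

# "minimal" skips name checking and repair; column lengths are still checked.
fast <- as_tibble(cols, .name_repair = "minimal")
```

This keeps the length check mentioned above while avoiding the cost of name validation.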

krlmlr · 2018-01-15T10:47:20Z

Also, new_tibble() is already exported.
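
Since the constructor is exported, it can be called without the ::: prefix used in the benchmarks. The sketch below assumes a list that is already known to be well-formed; validate_tibble() is, to my knowledge, a later addition to tibble (2.0.0 onwards) and is shown only as an optional safety net, for example in tests.

```r
library(tibble)

cols <- list(id = 1:3, value = c(0.5, 1.5, 2.5))

# Fast path: no name repair, no recycling, no column checks.
fast <- new_tibble(cols, nrow = 3L)

# Optional safety net: errors if the object is malformed.
validate_tibble(fast)
```

Passing nrow explicitly avoids relying on the row count being inferred, which has differed between tibble versions.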

github-actions · 2020-12-12T00:40:20Z

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

krlmlr closed this as completed Jan 15, 2018

krlmlr mentioned this issue Jan 15, 2018: Add documentation how to implement printing for a subclass #364 (closed)

github-actions bot locked and limited conversation to collaborators Dec 12, 2020
