Clarify how to extend tibbles #275

hadley · 2017-07-05T14:00:34Z

By following the principles defined at http://adv-r.hadley.nz/s3.html#inheritance

Provide a constructor that can also be used to create subclassed objects

new_tibble <- function(data, ..., subclass = NULL) {
  stopifnot(is.data.frame(data))
  
  structure(
    data,
    ...,
    class = c(subclass, "tbl_df", "tbl", "data.frame")
  )
}

Provide a reconstruct method

reconstruct.tibble <- function(new, old) {
  new_tibble(new)
}

Call S3::reconstruct() in all methods that return a tibble

This supersedes #155, #211, #218.
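The three steps above can be sketched in base R. This is only an illustration under stated assumptions: `reconstruct()` is defined locally as a stand-in for the generic being discussed, the class vector `c("tbl_df", "tbl", "data.frame")` is the one quoted later in this thread, and `new_measured()` and `keep_rows()` are hypothetical names invented for the example.

```r
# Local stand-in for the reconstruct() generic (assumption: dispatch on `old`)
reconstruct <- function(new, old) UseMethod("reconstruct", old)

# Step 1: a constructor that can also create subclassed objects
new_tibble <- function(data, ..., subclass = NULL) {
  stopifnot(is.data.frame(data))
  structure(data, ..., class = c(subclass, "tbl_df", "tbl", "data.frame"))
}

# Step 2: reconstruct methods for the base class and a hypothetical subclass
reconstruct.tbl_df <- function(new, old) new_tibble(new)

new_measured <- function(data, unit) {
  new_tibble(data, unit = unit, subclass = "measured")
}
reconstruct.measured <- function(new, old) {
  # Copy the subclass attribute over from the original object
  new_measured(new, unit = attr(old, "unit"))
}

# Step 3: a verb-like function works on a plain data frame,
# then calls reconstruct() to restore the class of its input
keep_rows <- function(x, i) {
  out <- as.data.frame(x)[i, , drop = FALSE]
  reconstruct(out, x)
}

m <- new_measured(data.frame(a = 1:3), unit = "cm")
res <- keep_rows(m, 1:2)
class(res)[1]      # "measured"
attr(res, "unit")  # "cm"
```

The verb itself never mentions the subclass; all class-specific knowledge lives in the `reconstruct()` method.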


hadley · 2017-07-05T14:01:18Z

Also connected to making print methods more extensible.

hadley · 2017-10-04T20:44:53Z

We should verify that these changes are helpful by looking into how they affect googledrive.

DavisVaughan · 2017-11-06T22:30:39Z

For what it's worth, I rewrote most of tibbletime to use sloop::reconstruct(). It seems to work for most cases, but I also needed a maybe_reconstruct() function that just returns a tibble and not a tbl_time object (the class I'm building on top of tibble) if certain conditions aren't met after the dplyr function is run.

For example, I rely on having a date/datetime index column in the tibble. If that column is lost, there is no reason to be a tbl_time object. So when calling transmute(), if the resulting tibble doesn't have that index column, then I shouldn't call reconstruct() but should instead just return that tibble.
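A minimal sketch of that condition, folded into the reconstruct method itself rather than a separate maybe_reconstruct(). It is not tibbletime's actual code: the `reconstruct()` generic is a local stand-in, and the `index` attribute holding the column name is an assumption made for illustration.

```r
# Local stand-in for the reconstruct() generic (assumption: dispatch on `old`)
reconstruct <- function(new, old) UseMethod("reconstruct", old)

# Hypothetical minimal constructor: remembers the name of the index column
new_tbl_time <- function(x, index) {
  stopifnot(is.data.frame(x), index %in% names(x))
  structure(x, index = index,
            class = c("tbl_time", "tbl_df", "tbl", "data.frame"))
}

reconstruct.tbl_time <- function(new, old) {
  index <- attr(old, "index")
  if (index %in% names(new)) {
    # Index column survived: restore the subclass
    new_tbl_time(new, index)
  } else {
    # Index column lost: return a plain tibble-classed data frame
    attr(new, "index") <- NULL
    class(new) <- c("tbl_df", "tbl", "data.frame")
    new
  }
}

old <- new_tbl_time(
  data.frame(date = as.Date("2017-01-01") + 0:1, x = 1:2),
  index = "date"
)
kept <- reconstruct(data.frame(date = old$date, x = old$x), old)
lost <- reconstruct(data.frame(x = old$x), old)
inherits(kept, "tbl_time")  # TRUE
inherits(lost, "tbl_time")  # FALSE
```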

Also, I'm sure you're aware but we would need a way to extend grouped_df too. Both Thomas in tidygraph and I use grouped_tbl_graph and grouped_tbl_time style objects so a formal way to extend them would be neat.

krlmlr · 2017-11-07T08:36:14Z

Thanks. I wonder if you could implement your reconstruct() method accordingly, so that your class is added only if conditions are met? Can you point me to your implementation?

Verbs that operate on grouped data also should call reconstruct() at the end.

I'll postpone writing this vignette until after CRAN release of tibble + pillar, because otherwise we'll need to wait for sloop as well.

DavisVaughan · 2017-11-07T14:02:17Z

I think you're right that I could put the checks inside reconstruct() rather than include a separate function. They are cheap and quick checks so it wouldn't be a problem. You can find the reconstruct() function here. The appropriate new_*() functions are here.

My main question on grouped data is a bit convoluted, but bear with me. No rush on needing an answer.

Assume I have created a tbl_time object from a tbl_df object. Inheritance:
c("tbl_time", "tbl_df", "tbl", "data.frame")

Now, I want to additionally group the data by a column. I assume the inheritance should look like:
c("grouped_tbl_time", "tbl_time", "grouped_df", "tbl_df", "tbl", "data.frame")

The way I do this, attempting to follow the new Advanced R material, is as follows:

Call group_by.tbl_time(). For this to work, I return the data to just a tbl_df() and call the group_by() for that. Then I construct my grouped_tbl_time on top of the result. I don't think reconstruct() is appropriate here, because the original .data is not a grouped_df.

group_by.tbl_time <- function(.data, ..., add = FALSE) {
  # Normal group then pass to grouped_tbl_time helper
  quos <- rlang::quos(...)
  .data_grouped <- dplyr::group_by(as_tibble(.data), !!! quos, add = add)
  grouped_tbl_time(.data_grouped, !! get_index_quo(.data))
}

The above code calls the grouped_tbl_time() helper, which finds the pieces necessary to create the tbl_time object, and calls new_grouped_tbl_time(args).
new_grouped_tbl_time() then calls new_tbl_time(args, ..., subclass = "grouped_tbl_time")
This is where I get confused. At this stage, new_tbl_time() has to decide between calling new_grouped_df() or new_tibble(). The way I do this is to check if we are working with a grouped_df() at the moment or not. (I have my own versions of new_grouped_df() and new_tibble() at the bottom of the page here.)

Is that right? I don't see any other way to get the inheritance correct, unless maybe I have new_grouped_tbl_time() instead call new_tbl_time(args, ..., subclass = c("grouped_tbl_time", "grouped_df")). The inheritance would then be:
c("grouped_tbl_time", "grouped_df", "tbl_time", "tbl_df", "tbl", "data.frame")
but maybe that makes more sense?
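One practical difference between the two orderings is which method S3 dispatch finds first when both "tbl_time" and "grouped_df" define one. A small sketch, using a hypothetical `describe()` generic and empty lists in place of real data:

```r
# Hypothetical generic with a method for each of the two competing classes
describe <- function(x) UseMethod("describe")
describe.grouped_df <- function(x) "grouped_df method"
describe.tbl_time   <- function(x) "tbl_time method"

# Ordering 1: tbl_time comes before grouped_df
a <- structure(list(), class = c("grouped_tbl_time", "tbl_time", "grouped_df",
                                 "tbl_df", "tbl", "data.frame"))
# Ordering 2: grouped_df comes before tbl_time
b <- structure(list(), class = c("grouped_tbl_time", "grouped_df", "tbl_time",
                                 "tbl_df", "tbl", "data.frame"))

describe(a)  # "tbl_time method"
describe(b)  # "grouped_df method"
```

Dispatch walks the class vector from left to right, so the ordering decides whose method wins and what NextMethod() reaches afterwards.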

krlmlr · 2017-11-12T21:53:05Z

I wonder if you can get away with simply removing the group_by.tbl_time() method and implementing reconstruct.tbl_time() to add the "grouped_tbl_time" class if the result has the "grouped_df" class.

#' @export
reconstruct.tbl_time <- function(new, old) {
  if ("grouped_df" %in% class(new)) class(new) <- c("grouped_tbl_time", class(new))
  ...
  new
}

On a side note, it seems that you can simply pass along ... to dplyr::group_by() in your (maybe unneeded) implementation of group_by.tbl_time().

DavisVaughan · 2017-11-12T23:39:45Z

I think your suggestion removes a lot of unnecessary work on my end. I really appreciate you taking a look at it.

I saw your latest PR, removing ... in new_tibble(). Aren't the ellipses exactly how someone like me might extend a tibble's attributes? The essence of new_tbl_time() is (before your new update):

new_tbl_time <- function(x, index_quo, index_time_zone, ..., subclass = NULL) {
    tibble::new_tibble(
      x,
      index_quo       = index_quo,
      index_time_zone = index_time_zone,
      subclass        = c(subclass, "tbl_time")
    )
}

Where index_quo and index_time_zone are passed through the ... to become attributes.

I guess the alternative is for me to assign the attributes to x in new_tbl_time(), then pass x to new_tibble()?
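That alternative could look roughly like the sketch below. `new_tibble_nodots()` is a hypothetical stand-in for a constructor without `...`, not tibble's actual implementation, and whether the real new_tibble() keeps attributes that are already set on x is not established in this thread. A plain symbol is used for index_quo to avoid a dependency on rlang.

```r
# Hypothetical stand-in for a constructor without `...`: it only sets the class
new_tibble_nodots <- function(x, subclass = NULL) {
  stopifnot(is.data.frame(x))
  class(x) <- c(subclass, "tbl_df", "tbl", "data.frame")
  x
}

new_tbl_time <- function(x, index_quo, index_time_zone, subclass = NULL) {
  # Assign the subclass attributes directly instead of forwarding them via `...`
  attr(x, "index_quo") <- index_quo
  attr(x, "index_time_zone") <- index_time_zone
  new_tibble_nodots(x, subclass = c(subclass, "tbl_time"))
}

tt <- new_tbl_time(
  data.frame(date = as.Date("2017-01-01"), x = 1),
  index_quo = quote(date),
  index_time_zone = "UTC"
)
class(tt)[1]                 # "tbl_time"
attr(tt, "index_time_zone")  # "UTC"
```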

Support from Advanced R (not trying to take this as gospel by bringing it up so often, it's just what I've been going off and it seems to make sense to me).

"To allow subclasses, the parent constructor needs to have ... and subclass arguments" -here

It provides the below example, where the new z attribute is added through the ....

new_my_class <- function(x, y, ..., subclass = NULL) {
  stopifnot(is.numeric(x))
  stopifnot(is.logical(y))
  
  structure(
    x,
    y = y,
    ...,
    class = c(subclass, "my_class")
  )
}

new_subclass <- function(x, y, z) {
  stopifnot(is.character(z))
  # z must be passed by name: structure() requires named attributes
  new_my_class(x, y, z = z, subclass = "subclass")
}

krlmlr · 2017-11-12T23:44:34Z

Fine with adv-r as the reference ;-) . I looked in a previous subsection, https://adv-r.hadley.nz/s3.html#constructors, and I haven't found the ellipsis there. It's easy to add back, though.

DavisVaughan · 2017-11-12T23:57:30Z

I'm actually surprised he doesn't do it there. In theory new_Date() should (I think?) be extensible so people could build on top of it. new_Date() and new_s3_dbl() are both in sloop, and it would be easy to do:

new_Date <- function(x, ..., subclass = NULL) {
  sloop::new_s3_dbl(x, ..., class = c(subclass, "Date"))
}

rather than just:

new_Date <- function(x) {
  sloop::new_s3_dbl(x, class = "Date")
}

My guess is that he is just introducing the topic for the first time there in 13.2.1 of Advanced R, and it might be overwhelming to show everything (inheritance) all at once. Which is why the ... and subclass are introduced later in 13.5 Inheritance. Just a thought ¯\(ツ)/¯

krlmlr · 2017-11-13T00:03:41Z

I guess a forward reference would help impatient and superficial readers like myself, but that's a tough balance.

DavisVaughan · 2017-11-13T00:05:11Z

If we are being really picky, the tibble specific attributes go before the ... (see new_my_class() above). Just thought I'd point it out!

function (x, ..., nrow = NULL, subclass = NULL) 

# VS

function (x, nrow = NULL, ..., subclass = NULL)

Agreed. Effective teaching is hard work.

krlmlr · 2017-11-13T00:11:25Z

This was deliberate, users shouldn't be calling new_tibble(x, 3) anyway, and it helps changing the interface later if necessary.
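The effect of the two signatures can be seen with two toy functions (hypothetical names, returning `nrow` only to show what was matched). Arguments that come after `...` can only be matched by name, so a positional value is absorbed by the dots instead.

```r
# nrow before the dots: can be matched by position
f_before <- function(x, nrow = NULL, ..., subclass = NULL) nrow
# nrow after the dots: can only be matched by name
f_after  <- function(x, ..., nrow = NULL, subclass = NULL) nrow

f_before(data.frame(), 3)        # 3
f_after(data.frame(), 3)         # NULL, the 3 is absorbed by `...`
f_after(data.frame(), nrow = 3)  # 3
```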

jennybc · 2017-11-15T20:15:27Z

@krlmlr As someone who has subclassed tibble using DIY methods, what's the status here? Would it now be welcome for people to test out the more official ways of doing this? Or not quite yet?

krlmlr · 2017-11-15T22:02:16Z

We now have new_tibble() in the upcoming version (releasing today), but we still need sloop::reconstruct() for proper extensibility.

mbojan · 2017-12-04T14:29:31Z

I look forward to trying these extensions out. I have to say that the current behavior of dropping attributes (and subclasses for that matter) seems inconsistent across R. For example, subset() drops attributes but maintains the class attribute (including subclasses, if any). dplyr verbs drop both. If I recall discussions around the S3 and S4 systems correctly, the position seemed to be that the default behavior should be: when a superclass method is applied to a subclass, the result should still be of the subclass. That's how e.g. data frame indexing works. I think if we move away from this, it will be necessary to write a lot of vacuous code, like subclass-dedicated methods for all generics that already work with the superclass...
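The class-retention part of this can be checked in base R with a small sketch. The class name "my_df" is hypothetical, and only the class vector is examined here; what happens to other attributes can be inspected separately with attributes().

```r
# A data frame with a hypothetical subclass and no methods of its own
df <- structure(
  data.frame(a = 1:3, b = letters[1:3]),
  class = c("my_df", "data.frame")
)

# Both calls fall through to the data.frame methods
class(df[1:2, ])          # "my_df" "data.frame"
class(subset(df, a > 1))  # "my_df" "data.frame"
```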

krlmlr · 2018-01-18T12:28:24Z

Also need a statement about (non-)support of S4 classes.

hadley · 2018-10-06T12:45:56Z

I think this should wait until the next release so we can fully internalise the lessons from vctrs.

krlmlr · 2020-03-02T15:52:13Z

We should move the existing "Extending" vignette to {vctrs}, and adapt to the new infrastructure.

krlmlr · 2021-04-18T03:59:48Z

Topics:

- Tweak overall printing: link to pillar vignette
- Tweak printing of vector column: link to vctrs vignette
- Inherit from tbl or tbl_df?
- Main part of the text: Follow ?dplyr_reconstruct
  - Example classes: sticky column, ...
  - Best practices for inheritance, what works out of the box, what needs adaptation?

krlmlr · 2021-10-25T02:11:35Z

Are we waiting for #890 here?

DavisVaughan mentioned this issue Nov 3, 2017

Is new_tibble() working as it should? #330

Closed

krlmlr added the documentation label Jan 15, 2018

This was referenced Jan 15, 2018

Add documentation how to implement printing for a subclass #364

Closed

S4 object inheriting from data.frame: filter() gives a warning tidyverse/dplyr#2193

Closed

krlmlr added this to the 1.5.0 milestone Oct 5, 2018

krlmlr removed this from the 1.5.0 milestone Oct 6, 2018

philip-khor mentioned this issue Jul 8, 2019

Make pmdplyr tibbles a subclass of tibble using new_tibble() NickCH-K/pmdplyr#2

Merged

krlmlr added this to the 3.1.1 milestone Feb 23, 2021

krlmlr modified the milestones: 3.1.1, 3.1.2 Apr 18, 2021

DavisVaughan mentioned this issue May 17, 2021

Extending tibble: A case for tibble_reconstruct() #890

Open

krlmlr modified the milestones: 3.1.2, 3.1.3 Jun 24, 2021

krlmlr modified the milestones: 3.1.3, 3.1.4 Jul 17, 2021

krlmlr removed this from the 3.1.4 milestone Aug 1, 2021

krlmlr added this to the 3.1.7 milestone Oct 25, 2021

ilikegitlab mentioned this issue Jul 17, 2022

pivot doesn't preserve attributes tidyverse/tidyr#1379

Closed

krlmlr added a commit that referenced this issue Feb 22, 2023

Draft for new extending vignette, closes #275

1d2c587

krlmlr mentioned this issue Feb 22, 2023

Documentation updates #1512

Merged

krlmlr closed this as completed in #1512 Feb 23, 2023

brookslogan mentioned this issue Apr 13, 2023

Review and consider applying ?dplyr_extending for epi_df cmu-delphi/epiprocess#223

Open

github-actions bot locked and limited conversation to collaborators Feb 24, 2024
