Would it be a good idea for dplyr verbs to propagate NULL data frames? #1638

Closed
skranz opened this issue Jan 23, 2016 · 4 comments

skranz commented Jan 23, 2016

I wonder whether dplyr verbs could behave more like base R and propagate NULL data frames instead of throwing an error. Here is a simple example:

  # Assume you sometimes have no data
  # and df is NULL
  df = NULL

  # Base R does not throw errors quickly
  # but conveniently propagates NULL
  df[df$a==1,]
  ## NULL

  # dplyr verbs are not as tolerant...
  filter(df, a==1)
  ## Error in UseMethod("filter_") : 
  ## no applicable method for 'filter_' applied to an object of class "NULL"

Returning NULL instead of an error could probably be easily implemented by adding functions like:

filter_.NULL = function(...) return(NULL)

While this is not a big issue, I would find NULL propagation quite convenient, since one can get rid of some case distinctions and dplyr would probably behave more like base R.

Conceptually it also seems sound to me to return NULL:
If I filter something from NULL, it seems intuitive that the result is NULL.
If I arrange NULL the result should also be NULL.
If I mutate NULL the result should also be NULL.
... and so on...
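
To illustrate the mechanism behind the proposed `filter_.NULL` method, here is a minimal base R sketch of S3 dispatch on NULL. The generic `keep_rows()` and its methods are hypothetical stand-ins for a dplyr verb, not dplyr's actual code, and the condition is passed as a plain function to avoid non-standard evaluation.

```r
# Hypothetical generic standing in for a dplyr verb (an assumption, not dplyr's API).
keep_rows <- function(.data, condition) UseMethod("keep_rows")

# Data frame method: keep the rows for which condition(.data) is TRUE.
keep_rows.data.frame <- function(.data, condition) {
  .data[condition(.data), , drop = FALSE]
}

# NULL method: propagate NULL instead of raising "no applicable method".
keep_rows.NULL <- function(.data, condition) NULL

df <- data.frame(a = c(1, 2, 1), b = c("x", "y", "z"))
keep_rows(df, function(d) d$a == 1)    # the two rows where a == 1
keep_rows(NULL, function(d) d$a == 1)  # NULL
```

Without the `keep_rows.NULL` method, the last call would fail with the same kind of "no applicable method" error shown above.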

krlmlr (Member) commented Jan 25, 2016

In the tibble package, tbl_df(NULL) returns an empty data frame (https://github.com/krlmlr/tibble/pull/17). So, inserting tbl_df early in your pipeline should help for your use case. It will be imported by default when #1595 is merged.

hadley (Member) commented Mar 1, 2016

I think it's a bad idea to silently coerce NULL (which is not a data frame) into a data frame.

You'd be better off using an empty data frame, i.e. data_frame()
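
A minimal sketch of the empty-data-frame approach, written in base R so that it runs without dplyr. The helper `or_empty()` and the column names `a` and `b` are assumptions made for illustration; in a dplyr pipeline, `data_frame()` would play the role of `data.frame()` here.

```r
# Hypothetical helper: replace a missing (NULL) input with a typed,
# zero-row data frame so that later steps need no special case.
or_empty <- function(df) {
  if (is.null(df)) {
    data.frame(a = numeric(0), b = character(0), stringsAsFactors = FALSE)
  } else {
    df
  }
}

df <- or_empty(NULL)

# Filtering an empty data frame yields an empty data frame, not an error.
res <- df[df$a == 1, , drop = FALSE]
nrow(res)   # 0
names(res)  # "a" "b"
```

Giving the empty data frame the expected columns matters: a later step that refers to column `a` can only work if that column exists, even with zero rows.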

hadley closed this as completed Mar 1, 2016

krlmlr (Member) commented Mar 1, 2016

What about explicit coercion using tbl_df() or as_data_frame()?

hadley (Member) commented Mar 1, 2016

It's probably reasonable for as_data_frame() to have a NULL method.

lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018