Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

as_tibble_row() doesn't work with date vectors #797

Closed
FrancoisR95 opened this issue Jul 4, 2020 · 12 comments · Fixed by #905
Closed

as_tibble_row() doesn't work with date vectors #797

FrancoisR95 opened this issue Jul 4, 2020 · 12 comments · Fixed by #905
Milestone

Comments

@FrancoisR95
Copy link

In tibble 3.0.1, as_tibble_row(x) returns an error when x is a vector of dates.

The expected behavior would be to return a tibble row containing dates, exactly like as_tibble_col() or tibble() are doing.

1. Example of the issue with as_tibble_row:

      x <- c(col1 = as.Date("2012-04-04"), col2 = as.Date("2012-05-04"))
      as_tibble_row(x)

This returns the following error:

Error: `x` must be a bare vector in `as_tibble_row()`, not Date.

2. On the contrary, the same code is working fine with as_tibble_col:

      x <- c(col1 = as.Date("2012-04-04"), col2 = as.Date("2012-05-04"))
      as_tibble_col(x)

Result:

# A tibble: 2 x 1
  value     
  <date>    
1 2012-04-04
2 2012-05-04

The same is also obtained with tibble(x) instead of as_tibble_col(x).

3. Also a code for creating a 2-column tibble of dates is handled perfectly by tibble():

tibble(col1 = as.Date("2012-02-12"), col2 = as.Date(as.Date("2012-04-01"):as.Date("2012-04-05"), origin="1970-01-01"))
tibble(x)

Result:

# A tibble: 5 x 2
  col1       col2      
  <date>     <date>    
1 2012-02-12 2012-04-01
2 2012-02-12 2012-04-02
3 2012-02-12 2012-04-03
4 2012-02-12 2012-04-04
5 2012-02-12 2012-04-05

Conclusion
The behavior of as_tibble_row() with date vectors seems to be a bug, or at least is very inconsistent and unexpected. It's also really a pity when the code of the as_tibble_row(x) function (new_tibble(as.list(x), nrow = 1)) would work just fine if there wasn't, in the function, an unfortunate test making this unnecessary error... This test was introduced recently (6f572f5#diff-cb6cd20025e0469d14d1ce6271cbf679) in order to try to fix #739 but the fix is much worse than the problem (which in my opinion was not even a real problem, basically it was said that an error message, which was "All elements must be size one, use list() to wrap.", was not clear enough but I think it was rather clear, and better and clearer at least than now getting this unexpected error "x must be a bare vector in as_tibble_row(), not Date" for something which should work).

@lionel-
Copy link
Member

lionel- commented Jul 4, 2020

It's also really a pity when the code of the as_tibble_row(x) function (new_tibble(as.list(x), nrow = 1)) would work just fine if there wasn't, in the function, an unfortunate test making this unnecessary error...

new_tibble() is a bare bones function. The implementation you're suggesting creates corrupt tibbles.

To handle arbitrary vectors, vec_chop2() would be useful but it isn't implemented yet.

library(vctrs)
library(rlang)

vec_chop2 <- function(x) {
  stopifnot(
    vec_is(x),
    # Unimplemented:
    !is.data.frame(x),
    is_null(dim(x))
  )

  names <- names(x)
  if (is_null(names)) {
    vec_chop2_unnamed(x)
  } else {
    vec_chop2_named(x, names)
  }
}
vec_chop2_unnamed <- function(x) {
  n <- vec_size(x)
  out <- vector("list", n)

  for (i in seq_len(n)) {
    out[[i]] <- x[[i]]
  }

  out
}
vec_chop2_named <- function(x, names) {
  n <- vec_size(x)
  out <- vector("list", n)
  out_names <- character(n)

  for (i in seq_len(n)) {
    out[[i]] <- x[[i]]
    out_names[[i]] <- names[[i]]
  }

  names(out) <- out_names
  out
}

vec_chop2() has more consistent handling than as.list(), for instance with factors:

fct <- set_names(factor(c("a", "b")), c("foo", "bar"))

as.list(fct)
#> [[1]]
#> foo
#>   a
#> Levels: a b
#>
#> [[2]]
#> bar
#>   b
#> Levels: a b

vec_chop2(fct)
#> $foo
#> [1] a
#> Levels: a b
#>
#> $bar
#> [1] b
#> Levels: a b

date <- c(as.Date("2020-01-01"), as.Date("2020-01-02"))
date <- set_names(date, c("foo", "bar"))

vec_chop2(date)
#> $foo
#> [1] "2020-01-01"
#>
#> $bar
#> [1] "2020-01-02"
as_tibble_row2 <- function(x) {
  if (!vec_is_list(x)) {
    x <- vec_chop2(x)
  }

  tibble::as_tibble_row(x)
}

as_tibble_row2(fct)
#> # A tibble: 1 x 2
#>   foo   bar
#>   <fct> <fct>
#> 1 a     b

as_tibble_row2(date)
#> # A tibble: 1 x 2
#>   foo        bar
#>   <date>     <date>
#> 1 2020-01-01 2020-01-02

@krlmlr
Copy link
Member

krlmlr commented Jul 4, 2020

as_tibble_row() doesn't document that it requires bare vectors.

We could do new_tibble(map(seq_along(x)), ~ vec_slice(x, .), nrow = 1) . However, just like #740, this requires vec_names(). Any chance we can make this available in vctrs?

@krlmlr krlmlr added this to the 3.1.0 milestone Jul 4, 2020
@FrancoisR95
Copy link
Author

FrancoisR95 commented Jul 4, 2020

Many thanks @lionel- and @krlmlr for your quick replies and explanations.
I still don't understand, however, what is the problem of new_tibble(as.list(x), nrow = 1) with date vectors (while I understand the problem with factors).

In your example of date vector, as.list(x) seems to have exactly the same result as vec_chop2(x).

date <- c(as.Date("2020-01-01"), as.Date("2020-01-02"))
date <- set_names(date, c("foo", "bar"))

vec_chop2(date)
#> $foo
#> [1] "2020-01-01"
#>
#> $bar
#> [1] "2020-01-02"

as.list(date)
#> $foo
#> [1] "2020-01-01"
#>
#> $bar
#> [1] "2020-01-02"

Why not, for the as_tibble_row() function, write a test like:

if (!is_bare_vector(x)) {
  if (class(x) == "Date") {
    if (!is_vector(x)) {
      cnd_signal(error_as_tibble_row_date_vector(x))
    }
  } else {  
    cnd_signal(error_as_tibble_row_bare(x))
  }
}

instead of:

if (!is_bare_vector(x)) {
  cnd_signal(error_as_tibble_row_bare(x))
}

@lionel-
Copy link
Member

lionel- commented Jul 6, 2020

@FrancoisR95 We're looking into a more general solution that would work with any vector.

@krlmlr

We could do new_tibble(map(seq_along(x)), ~ vec_slice(x, .), nrow = 1) . However, just like #740, this requires vec_names(). Any chance we can make this available in vctrs?

Yes I think we could now expose vec_names() and vec_set_names(). How do you plan to deal with names exactly?

@krlmlr
Copy link
Member

krlmlr commented Jul 6, 2020

The names of the vector become column names here?

@lionel-
Copy link
Member

lionel- commented Jul 6, 2020

Yup so we need to remove the inner names and use them as outer names. Also data frames should be special-cased here so I'm not sure where vec-names would come into play. That's why I thought vec-chop2 as outlined above would be appropriate for non-list and non-df inputs.

@krlmlr
Copy link
Member

krlmlr commented Jul 6, 2020

I don't follow. Yes, I can use inner names as outer names with vec_names() (forgot to actually submit that PR). Why would we special-case data frames? Wouldn't vec_slice() take care of that?

If we can make vec_chop2() available in vctrs -- fine. Maybe we can prototype here?

@lionel-
Copy link
Member

lionel- commented Jul 6, 2020

We also need to remove the inner names otherwise they linger in the columns.

Regarding data frames, I assumed as_tibble_row(mtcars[1, ]) should be a no-op and fail if the input is not size 1? It seems strange to transform each row to a column, and would only work for data frames with row names which is strange as well.

@krlmlr
Copy link
Member

krlmlr commented Jul 20, 2020

I have a draft in 22bfa2a, not sure if I got it right. Currently, as_tibble_row() unpacks bare lists -- not sure how to keep that behavior if we allow arbitrary vectors without checking is_bare_list(). Would vec_chop2() help here?

@lionel-
Copy link
Member

lionel- commented Jul 20, 2020

Would vec_chop2() help here?

chop2 would help giving the same implementation to atomic vectors and lists. But you could also make S3 lists follow the same code path as for bare lists if you prefer? S3 lists should be unpacked the same way as bare lists right?

@krlmlr
Copy link
Member

krlmlr commented Sep 27, 2020

vec_names2() and friends available in vctrs 0.3.2.

@krlmlr krlmlr modified the milestones: 3.1.0, 3.1.1 Feb 23, 2021
@krlmlr krlmlr modified the milestones: 3.1.1, 3.1.2 Apr 16, 2021
@krlmlr krlmlr modified the milestones: 3.1.2, 3.1.3 Jul 17, 2021
krlmlr added a commit that referenced this issue Jul 21, 2021
- `as_tibble_row()` supports arbitrary vectors (#797).
@github-actions
Copy link
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants