FR: Make a data frame from a (possibly named) vector or list #31
Comments
What verb would you use for this operation? How about:
Need to coerce to list if
I think this could be fixed by implementing
EDIT: The above also doesn't work if
EDIT²: I think a new verb would help here; I'm not sure if this belongs here or in tidyr.

I will try to propose a name.
The more I think about it, maybe it makes sense to think about this as a treatment applied to a variable during the construction of a data frame. A way to say "add this variable AND promote its names to a proper variable". And also give some nice way of getting row numbers into the data frame? I have found I realize

library(tibble)
x <- list(alpha = 'horrible', beta = 'list', gamma = 'column')

What if something like this:

df <- data_frame(id(x, "greek"))
## or
df <- data_frame(upname(x, "greek"))

produced this result:

data_frame(greek = names(x), x = x)
#> Source: local data frame [3 x 2]
#>
#>   greek        x
#>   (chr)   (list)
#> 1 alpha <chr[1]>
#> 2  beta <chr[1]>
#> 3 gamma <chr[1]>

I also wish it were easier to get plain row numbers. I wish this:

df <- data_frame(i = row_number(), upname(x, "greek"))

produced something like this:

data_frame(i = seq_along(x),
           greek = names(x),
           x = x)
#> Source: local data frame [3 x 3]
#>
#>       i greek        x
#>   (int) (chr)   (list)
#> 1     1 alpha <chr[1]>
#> 2     2  beta <chr[1]>
#> 3     3 gamma <chr[1]>
In addition to
Row numbers: Have you seen #11? Your example would be then
How would you like:
This allows at least the creation of a two-column data frame from a named object, which can then be massaged further with the other dplyr verbs and combined with other data frames using cbind().

@hadley: Would this perhaps be suitable for purrr:

unzip_names <- function(x) set_names(list(x, names(x)), c("name", "value"))
zip_names <- function(x) set_names(x[[1]], x[[2]])
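The split-and-rejoin idea behind the two helpers above can be tried in base R. This is only a sketch: the `_sketch` function names are hypothetical, `setNames()` from base R stands in for purrr's `set_names()`, and storing the names under `name` and the values under `value` is an assumption about the intended labelling.

```r
# Base-R sketch of the two helpers discussed above (hypothetical names).
unzip_names_sketch <- function(x) {
  # Assumption: names go under "name", bare values under "value"
  setNames(list(names(x), unname(x)), c("name", "value"))
}

zip_names_sketch <- function(p) {
  # Reattach the stored names to the stored values
  setNames(p$value, p$name)
}

x <- c(alpha = 1, beta = 2, gamma = 3)
pair <- unzip_names_sketch(x)
roundtrip <- zip_names_sketch(pair)

# The round trip should give back the original named vector
identical(roundtrip, x)
```

Because the pair is an ordinary named list of equal-length components, it can be handed to a data frame constructor as-is, which is the use the comment above has in mind.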
Sorry I can't really tell what #11 does just from reading the discussion. But I take your word for it that it would add the integers 1 through

library(tibble)
x <- list(alpha = 'horrible', beta = 'list', gamma = 'column')
dub <- function(x) as_data_frame(setNames(list(names(x), x), c("name", "value")))
dub(x)
#> Source: local data frame [3 x 2]
#>
#>    name    value
#>   (chr)   (list)
#> 1 alpha <chr[1]>
#> 2  beta <chr[1]>
#> 3 gamma <chr[1]>

The variables themselves and the object look great. Would
UPDATE: I think
What if this was just the
@jennybc: purrr: Right, revised definition below.
rownames_to_column() will add to the front. Actually, we already have add_rownames() in dplyr, but it does "too much" and will be deprecated in favor of the new functions.
Defaults: I think we should support them, even if the renaming could be handled with a simple
@hadley: Is there a dispatch for

@krlmlr No, there's no "vector" virtual class, so implementation would be a bit tedious. But we don't need to have methods for
It seems like we're adding new functionality that previously was an error, so it doesn't seem too dangerous to me.

@jennybc: For now you could try kimisc::list_to_df() -- I totally forgot about this guy. I still think this should be part of tibble.
Apologies if this is not helpful, but could

library("dplyr")
library("purrr")
x <- list(alpha = 'horrible', beta = 'list', gamma = 'column')
x %>% map_df(~ data_frame(thing = .x), .id = "name")

Could a new verb be put in place of the function within
Note that this is already possible with #71:
- New `enframe()` that converts vectors to two-column tibbles (#31, #74).
- Fix compatibility with `knitr` 1.13 (#76).
- Implement `as_data_frame.default()` (#71, tidyverse/dplyr#1752).
Follow-up release.
- `tibble()` is no longer an alias for `frame_data()` (#82).
- Remove `tbl_df()` (#57).
- `$` returns `NULL` if column not found, without partial matching. A warning is given (#109).
- `[[` returns `NULL` if column not found (#109).
- Reworked output: More concise summary (begins with hash `#` and contains more text (#95)), removed empty line, showing number of hidden rows and columns (#51). The trailing metadata also begins with hash `#` (#101). Presence of row names is indicated by a star in printed output (#72).
- Format `NA` values in character columns as `<NA>`, like `print.data.frame()` does (#69).
- The number of printed extra cols is now an option (#68, @lionel-).
- Computation of column width properly handles wide (e.g., Chinese) characters, tests still fail on Windows (#100).
- `glimpse()` shows nesting structure for lists and uses angle brackets for type (#98).
- Tibbles with `POSIXlt` columns can be printed now, the text `<POSIXlt>` is shown as placeholder to encourage usage of `POSIXct` (#86).
- `type_sum()` shows only topmost class for S3 objects.
- Strict checking of integer and logical column indexes. For integers, passing a non-integer index or an out-of-bounds index raises an error. For logicals, only vectors of length 1 or `ncol` are supported. Passing a matrix or an array now raises an error in any case (#83).
- Warn if setting non-`NULL` row names (#75).
- Consistently surround variable names with single quotes in error messages.
- Use "Unknown column 'x'" as error message if column not found, like base R (#94).
- `stop()` and `warning()` are now always called with `call. = FALSE`.
- The `.Dim` attribute is silently stripped from columns that are 1d matrices (#84).
- Converting a tibble without row names to a regular data frame does not add explicit row names.
- `as_tibble.data.frame()` preserves attributes, and uses `as_tibble.list()` to avoid calling overridden methods which may lead to endless recursion.
- New `has_name()` (#102).
- Prefer `tibble()` and `as_tibble()` over `data_frame()` and `as_data_frame()` in code and documentation (#82).
- New `is.tibble()` and `is_tibble()` (#79).
- New `enframe()` that converts vectors to two-column tibbles (#31, #74).
- `obj_sum()` and `type_sum()` show `"tibble"` instead of `"tbl_df"` for tibbles (#82).
- `as_tibble.data.frame()` gains `validate` argument (as in `as_tibble.list()`), if `TRUE` the input is validated.
- Implement `as_tibble.default()` (#71, tidyverse/dplyr#1752).
- `has_rownames()` supports arguments that are not data frames.
- Two-dimensional indexing with `[[` works (#58, #63).
- Subsetting with empty index (e.g., `x[]`) also removes row names.
- Document behavior of `as_tibble.tbl_df()` for subclasses (#60).
- Document and test that subsetting removes row names.
- Don't rely on `knitr` internals for testing (#78).
- Fix compatibility with `knitr` 1.13 (#76).
- Enhance `knit_print()` tests.
- Provide default implementation for `tbl_sum.tbl_sql()` and `tbl_sum.tbl_grouped_df()` to allow `dplyr` release before a `tibble` release.
- Explicit tests for `format_v()` (#98).
- Test output for `NULL` value of `tbl_sum()`.
- Test subsetting in all variants (#62).
- Add missing test from dplyr.
- Use new `expect_output_file()` from `testthat`.
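For readers arriving from the feature request, here is a short sketch of how the `enframe()` entry listed above is typically used. It assumes a reasonably recent tibble is installed; the input vector and the column names `greek` and `x` are made up for illustration.

```r
library(tibble)

# A named vector, invented for this example
x <- c(alpha = 1, beta = 2, gamma = 3)

# Default column names are "name" and "value"
df <- enframe(x)

# Both column names can be chosen explicitly
df2 <- enframe(x, name = "greek", value = "x")

df
df2
```

The result is an ordinary two-column tibble, so it can be passed straight on to dplyr or tidyr verbs.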
Here's something I do fairly often, mostly with a list, but sometimes with a vector: Initialize a data frame with that list or vector as a variable and, at the same time, promote its names to a proper variable. Or, perhaps, add a variable of row numbers. Why is it so important to add the names or row numbers? Because later you'll want to process with tidyr, i.e. with unnest() and/or spread().

I could point to some real uses if I need to really sell this. But hopefully this will just make sense. Or someone will tell me it's already easy to do? It is already easy, but perhaps worth making a function for.
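To make the motivation concrete, here is a base-R sketch of why the names need to become a proper variable before the data is flattened. The ragged list is invented for illustration, and the `rep()`/`unlist()` combination is only a stand-in for what `unnest()` would do, not tidyr's implementation.

```r
# A ragged named list, made up for this example
x <- list(alpha = 1:2, beta = 3L, gamma = 4:6)

# Repeat each name once per element so every value keeps its origin
long <- data.frame(
  name  = rep(names(x), lengths(x)),
  value = unlist(x, use.names = FALSE),
  stringsAsFactors = FALSE
)

long
```

Without the `name` column, the flattened values could no longer be traced back to the list element they came from, which is the problem the request is trying to avoid.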