Proposal: easier column extraction and data cleaning #146

I was thinking that something like the `@df` macro in StatPlots would benefit many different packages: it lets the output of a query be fed directly to a function as normal arrays, and it can do so without even explicitly collecting the query (see [here](https://github.com/JuliaPlots/StatPlots.jl/pull/96#issuecomment-330629958)). I was wondering whether something similar could live in Query as well. I'm thinking of a macro along these lines:

```julia
@replace_complete_cols df f(_..a, _..b, _..c .+ 1)
```

which would replace each `_..x` expression with the corresponding column converted to a regular Array, excluding rows where any of the columns being used has missing data. Two more tools would help implement this functionality and would go well with it:

- a stand-alone `@dropna` macro that would keep only the rows with no missing values
- as mentioned [here](https://github.com/JuliaPlots/StatPlots.jl/issues/88#issuecomment-328953203), the possibility of having a tuple in a `@select` statement, which could then be collected as a tuple of Arrays. Without that, selecting an arbitrary number of columns is a bit cumbersome: I haven't found a way of selecting multiple columns with a NamedTuple iterator, because there doesn't seem to be a type-stable way of generating a NamedTuple without manually typing each element, whereas a list comprehension works just fine for tuples.
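
To make the intended semantics concrete, here is a rough sketch written as a plain function rather than a macro. It is only an illustration under a few assumptions: the table is a NamedTuple of vectors, absent values are marked with `missing`, and `complete_cols` and `f` are made-up names, not anything that exists in Query.jl or StatPlots.

```julia
# Hypothetical helper (not part of Query.jl): pull the requested columns out of
# a column table, keep only the rows that are complete in all of them, and
# return them as a tuple of regular Arrays.
function complete_cols(table::NamedTuple, name::Symbol, rest::Symbol...)
    names = (name, rest...)
    cols = map(n -> getproperty(table, n), names)   # tuple of selected columns
    # A row is kept only if none of the selected columns is missing there.
    keep = [all(c -> !ismissing(c[i]), cols) for i in eachindex(first(cols))]
    # After filtering no `missing` is left, so `skipmissing` only narrows the
    # element type (e.g. Union{Missing, Int} becomes Int).
    return map(c -> collect(skipmissing(c[keep])), cols)
end

# Placeholder for whatever function the user wants to call on the columns.
f(a, b, c) = sum(a) + sum(b) + sum(c)

df = (a = [1, missing, 3, 4], b = [1.0, 2.0, missing, 4.0], c = [10, 20, 30, 40])

# Roughly what `@replace_complete_cols df f(_..a, _..b, _..c .+ 1)` would expand to:
a, b, c = complete_cols(df, :a, :b, :c)
result = f(a, b, c .+ 1)
```

Only rows 1 and 4 are complete here, so each returned Array has two elements. A column that is not mentioned would not affect which rows are dropped, which is the difference from a blanket `@dropna` over the whole table.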

Do you think this kind of macro belongs in Query.jl, or should it live somewhere else?
Also, what syntax do you think would be best? What I put here is pretty much a placeholder.
