Mark arrays with no levels as ordered when assigning ordered value #223
Adding this special case ensures that `copy!(similar(x), x)` gives an ordered array when `x` is ordered. This is consistent with what `vcat` already does when one of the inputs has no levels. This is needed in particular by the DataFrames `vcat` method, which cannot use our `vcat`.
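For illustration, the behavior described above can be sketched as follows. This is a minimal example, assuming a CategoricalArrays.jl version that includes this change; the variable names `x` and `y` are only illustrative.

```julia
using CategoricalArrays

# An ordered categorical array with levels a < b < c.
x = categorical(["a", "b", "c"], ordered=true)

# Allocate an uninitialized array of the same type and shape,
# then copy the contents of `x` into it.
y = similar(x)
copy!(y, x)

# With the special case from this PR, `y` is expected to be ordered like `x`.
isordered(y)
```

The point of the special case is that the destination has no levels of its own yet, so adopting the orderedness of the source cannot conflict with any existing order.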
Looks good. A small, indirectly related question, is why
Good question. Currently But yeah, what we could do instead of this PR is have
As usual you understand the consequences better here 😄. Feel free to do whatever you find best. I will just leave some thoughts below. In R neither We in CategoricalArrays.jl allow Given this I think that What would you say for adding two
Actually, one issue with the approach I mentioned in my previous comment is that
Well we could do that, but I'm not sure anybody would use it. :-) The main problem this PR intends to fix is that
OK - so I guess we should stick with what you implemented in this PR?
Well, that sounds like the least problematic solution. Indeed, I've realized that if we put the orderedness in the type,
OK - then, again, I think what you proposed is the way to go (as usual, working out the best design for CategoricalArrays.jl feels like writing a research paper 😄).
OK, let's go with that. It's indeed hard to believe how tricky proper handling of categorical data is. Maybe that's why R has completely given up:
> x = ordered(c("a", "b", "c"))
> x
[1] a b c
Levels: a < b < c
> y = ordered(c("a", "b", "c"))
> levels(y) <- rev(levels(y))
> y
[1] c b a
Levels: c < b < a
> c(x, y) # (In)famous
[1] 1 2 3 1 2 3
> str(rbind(data.frame(a=x), data.frame(a=y))) # More pernicious
'data.frame': 6 obs. of 1 variable:
$ a: Ord.factor w/ 3 levels "a"<"b"<"c": 1 2 3 3 2 1
Fixes JuliaData/DataFrames.jl#2002.