[SPARK-20849][DOC][SPARKR] Document R DecisionTree #18067
zhengruifeng wants to merge 8 commits into apache:master
I'd say try to use a data set without . in its column names if you can.
It would probably be confusing if examples cause warnings when users run them.
actually, I think there's a confusion - I don't mean that you should stop using . in the formula
I mean the reason we have warning=FALSE here is that createDataFrame(longley) causes a warning, because longley has columns with . in their names. We should avoid that if we can.
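To make the point about column names concrete, here is a minimal base R sketch that lists which longley columns contain a dot. It only inspects the names and does not call SparkR; the helper name `dotted_names` is made up for this illustration. My understanding is that createDataFrame rewrites such names and warns about each one, but that behaviour is an assumption here, not something this snippet demonstrates.

```r
# Base R only: `longley` ships with the built-in datasets package.
# `dotted_names` is a hypothetical helper, not part of SparkR.
dotted_names <- function(df) {
  # fixed = TRUE matches a literal "." instead of treating it as a regex wildcard
  names(df)[grepl(".", names(df), fixed = TRUE)]
}

longley_dotted <- dotted_names(longley)
print(longley_dotted)
```

Any data set for which this helper returns an empty character vector should be a candidate for an example that does not need warning=FALSE.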
Test build #77226 has finished for PR 18067 at commit
Test build #77230 has started for PR 18067 at commit
Jenkins, retests this please
Jenkins, retest this please
Test build #77239 has finished for PR 18067 at commit
@felixcheung Updated. By the way, I updated other formulas in
felixcheung left a comment
ok, then could you check if we could remove `, warning=FALSE`?
Test build #77325 has finished for PR 18067 at commit
Test build #77324 has finished for PR 18067 at commit
as commented before, please check. I'm pretty sure createDataFrame(longley) will cause a warning
longley
     GNP.deflator     GNP Unemployed Armed.Forces Population Year Employed
1947         83.0 234.289      235.6        159.0    107.608 1947   60.323
1948         88.5 259.426      232.5        145.6    108.632 1948   61.122
so our options are:
- don't use longley (my earlier suggestion)
- use longley but keep warning=FALSE
option 2: do you mean using `{r, warning=FALSE}` like other examples?
I think both are OK.
which do you prefer?
yes - but as mentioned, I'd prefer a data set that doesn't have a dot in its column names, if you can think of one, like as.data.frame(Titanic)
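For reference, a rough sketch of what a Titanic-based example could look like. The first part is base R and only confirms that the column names are free of dots. The function `fit_titanic_tree` is a hypothetical wrapper invented for this illustration; it assumes SparkR is installed and a Spark session can be started, and the maxDepth value is an arbitrary choice. It is not necessarily the example that ended up in the user guide.

```r
# Base R: flatten the built-in Titanic contingency table into a data.frame.
titanic_df <- as.data.frame(Titanic)
print(names(titanic_df))

# Hypothetical wrapper (not part of SparkR). It is only defined here, not called,
# so this block runs without Spark; calling it assumes a working SparkR install.
fit_titanic_tree <- function(local_df = titanic_df) {
  SparkR::sparkR.session()
  df <- SparkR::createDataFrame(local_df)
  # The `.` on the right-hand side of the formula means "all other columns";
  # it is unrelated to dots in column names.
  model <- SparkR::spark.decisionTree(df, Survived ~ ., type = "classification",
                                      maxDepth = 2)
  SparkR::summary(model)
}
```

Since Survived is a two-level factor, the sketch uses type = "classification"; with a numeric label, type = "regression" would be the natural choice.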
Test build #77349 has finished for PR 18067 at commit
Test build #77351 has finished for PR 18067 at commit
why change to classification for trees?
@felixcheung just because dataset
merged to master
What changes were proposed in this pull request?
1, add an example for sparkr decisionTree
2, document it in user guide
How was this patch tested?
local submit