Commit 6c66ab8

huangweizhe123 authored and srowen committed
[SPARK-24688][EXAMPLES] Modify the comments about LabeledPoint
## What changes were proposed in this pull request?

An RDD is created using LabeledPoint, but the accompanying comment reads #LabeledPoint(feature, label). Although in the method ChiSquareTest.test the second parameter is the feature and the third parameter is the label, it is better to write the label before the feature here, because an RDD created using LabeledPoint actually yields (label, feature) pairs. The comment is now changed to LabeledPoint(label, feature). The comments in the Scala and Java examples have the same typos.

## How was this patch tested?

tested

https://issues.apache.org/jira/browse/SPARK-24688

Author: Weizhe Huang 492816239qq.com

Please review http://spark.apache.org/contributing.html before opening a pull request.

Closes #21665 from uzmijnlm/my_change.

Authored-by: Huangweizhe <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
1 parent 3e4f166 commit 6c66ab8

3 files changed: 4 additions and 4 deletions

examples/src/main/java/org/apache/spark/examples/mllib/JavaHypothesisTestingExample.java

Lines changed: 1 addition & 1 deletion
@@ -67,7 +67,7 @@ public static void main(String[] args) {
       )
     );
 
-    // The contingency table is constructed from the raw (feature, label) pairs and used to conduct
+    // The contingency table is constructed from the raw (label, feature) pairs and used to conduct
     // the independence test. Returns an array containing the ChiSquaredTestResult for every feature
     // against the label.
     ChiSqTestResult[] featureTestResults = Statistics.chiSqTest(obs.rdd());

examples/src/main/python/mllib/hypothesis_testing_example.py

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@
         [LabeledPoint(1.0, [1.0, 0.0, 3.0]),
          LabeledPoint(1.0, [1.0, 2.0, 0.0]),
          LabeledPoint(1.0, [-1.0, 0.0, -0.5])]
-    ) # LabeledPoint(feature, label)
+    ) # LabeledPoint(label, feature)
 
     # The contingency table is constructed from an RDD of LabeledPoint and used to conduct
     # the independence test. Returns an array containing the ChiSquaredTestResult for every feature

examples/src/main/scala/org/apache/spark/examples/mllib/HypothesisTestingExample.scala

Lines changed: 2 additions & 2 deletions
@@ -61,9 +61,9 @@ object HypothesisTestingExample {
           LabeledPoint(-1.0, Vectors.dense(-1.0, 0.0, -0.5)
           )
         )
-      ) // (feature, label) pairs.
+      ) // (label, feature) pairs.
 
-    // The contingency table is constructed from the raw (feature, label) pairs and used to conduct
+    // The contingency table is constructed from the raw (label, feature) pairs and used to conduct
     // the independence test. Returns an array containing the ChiSquaredTestResult for every feature
     // against the label.
     val featureTestResults: Array[ChiSqTestResult] = Statistics.chiSqTest(obs)
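
To illustrate why the (label, feature) ordering matters, here is a minimal pure-Python sketch of the idea behind the corrected comment: a contingency table is built per feature from (label, features) pairs, and a Pearson chi-squared statistic is computed from it. This is not Spark's implementation. The tuples stand in for LabeledPoint, the function names `contingency_table` and `chi_squared_statistic` are hypothetical, and the sample values are borrowed from the Scala example.

```python
from collections import Counter

# Hypothetical stand-in for LabeledPoint: plain (label, features) tuples,
# with the label first, as the corrected comments describe.
observations = [
    (1.0, [1.0, 0.0, 3.0]),
    (1.0, [1.0, 2.0, 0.0]),
    (-1.0, [-1.0, 0.0, -0.5]),
]


def contingency_table(points, feature_index):
    """Count (feature value, label) co-occurrences for one feature column."""
    return Counter(
        (features[feature_index], label) for label, features in points
    )


def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    total = sum(table.values())
    row_totals = Counter()  # counts per feature value
    col_totals = Counter()  # counts per label
    for (value, label), count in table.items():
        row_totals[value] += count
        col_totals[label] += count

    statistic = 0.0
    for value in row_totals:
        for label in col_totals:
            # Expected count under independence of feature and label.
            expected = row_totals[value] * col_totals[label] / total
            observed = table.get((value, label), 0)
            statistic += (observed - expected) ** 2 / expected
    return statistic


# One statistic per feature column, each tested against the label.
statistics = [
    chi_squared_statistic(contingency_table(observations, i))
    for i in range(3)
]
```

Swapping the tuple order in `observations` would make the unpacking `for label, features in points` fail or silently treat a feature vector as the label, which is the confusion the original comments invited.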
