@@ -1109,21 +1109,20 @@ scaledData = scalerModel.transform(dataFrame)
 `MinMaxScaler` computes summary statistics on a data set and produces a `MinMaxScalerModel`. The model can then transform each feature individually such that it is in the given range.
 
 The rescaled value for a feature E is calculated as,
-
+`\begin{equation}
     Rescaled(e_i) = \frac{e_i - E_{min}}{E_{max} - E_{min}} * (max - min) + min
-
-For the case E_{max} == E_{min}, Rescaled(e_i) = 0.5 * (max + min)
+\end{equation}`
+For the case `E_{max} == E_{min}`, `Rescaled(e_i) = 0.5 * (max + min)`
 
 Note that since zero values will probably be transformed to non-zero values, output of the transformer will be DenseVector even for sparse input.
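The rescaling formula above can be illustrated with a minimal, framework-free sketch. This is plain Python, not the Spark API; the function name and the `lo`/`hi` parameters are illustrative only:

```python
def min_max_rescale(values, lo=0.0, hi=1.0):
    """Rescale a list of numbers into [lo, hi] per the MinMaxScaler formula."""
    e_min, e_max = min(values), max(values)
    if e_max == e_min:
        # Degenerate case noted in the text: every value maps to the midpoint.
        return [0.5 * (hi + lo) for _ in values]
    return [(v - e_min) / (e_max - e_min) * (hi - lo) + lo for v in values]

print(min_max_rescale([1.0, 2.0, 3.0]))  # [0.0, 0.5, 1.0]
```

In Spark, `MinMaxScalerModel` applies this per feature (per vector column), using the per-feature min and max computed during `fit`.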
 
-More details can be found in the API docs for
-[MinMaxScaler](api/scala/index.html#org.apache.spark.ml.feature.MinMaxScaler) and
-[MinMaxScalerModel](api/scala/index.html#org.apache.spark.ml.feature.MinMaxScalerModel).
-
 The following example demonstrates how to load a dataset in libsvm format and then rescale each feature to [0, 1].
 
 <div class="codetabs">
 <div data-lang="scala">
+More details can be found in the API docs for
+[MinMaxScaler](api/scala/index.html#org.apache.spark.ml.feature.MinMaxScaler) and
+[MinMaxScalerModel](api/scala/index.html#org.apache.spark.ml.feature.MinMaxScalerModel).
 {% highlight scala %}
 import org.apache.spark.ml.feature.MinMaxScaler
 import org.apache.spark.mllib.util.MLUtils
@@ -1134,15 +1133,18 @@ val scaler = new MinMaxScaler()
   .setInputCol("features")
   .setOutputCol("scaledFeatures")
 
-// Compute summary statistics by fitting the StandardScaler
+// Compute summary statistics and generate a MinMaxScalerModel
 val scalerModel = scaler.fit(dataFrame)
 
-// Normalize each feature to have unit standard deviation.
+// Rescale each feature to range [min, max].
 val scaledData = scalerModel.transform(dataFrame)
 {% endhighlight %}
 </div>
 
 <div data-lang="java">
+More details can be found in the API docs for
+[MinMaxScaler](api/java/index.html#org.apache.spark.ml.feature.MinMaxScaler) and
+[MinMaxScalerModel](api/java/index.html#org.apache.spark.ml.feature.MinMaxScalerModel).
 {% highlight java %}
 import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.ml.feature.MinMaxScaler;
@@ -1158,10 +1160,10 @@ MinMaxScaler scaler = new MinMaxScaler()
   .setInputCol("features")
   .setOutputCol("scaledFeatures");
 
-// Compute summary statistics by fitting the StandardScaler
+// Compute summary statistics and generate a MinMaxScalerModel
 MinMaxScalerModel scalerModel = scaler.fit(dataFrame);
 
-// Normalize each feature to have unit standard deviation.
+// Rescale each feature to range [min, max].
 DataFrame scaledData = scalerModel.transform(dataFrame);
 {% endhighlight %}
 </div>