Commit 4977baf

[DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs (elastic#46966)
* [DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs.
* [DOCS] Removes extra lines from examples.
* Update docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc
  Co-Authored-By: Lisa Cawley <[email protected]>
* Update docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc
  Co-Authored-By: Lisa Cawley <[email protected]>
* [DOCS] Explains examples.
1 parent 44d4cf7 commit 4977baf

2 files changed: +103 -1 lines changed

docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc

Lines changed: 76 additions & 0 deletions
@@ -172,3 +172,79 @@ only.
 <3> The ground truth value for the actual house price. This is required in order
 to evaluate results.
 <4> The predicted value for house price calculated by the {reganalysis}.
+
+
+The following example calculates the training error:
+
+[source,console]
+--------------------------------------------------
+POST _ml/data_frame/_evaluate
+{
+  "index": "student_performance_mathematics_reg",
+  "query": {
+    "term": {
+      "ml.is_training": {
+        "value": true <1>
+      }
+    }
+  },
+  "evaluation": {
+    "regression": {
+      "actual_field": "G3", <2>
+      "predicted_field": "ml.G3_prediction", <3>
+      "metrics": {
+        "r_squared": {},
+        "mean_squared_error": {}
+      }
+    }
+  }
+}
+--------------------------------------------------
+// TEST[skip:TBD]
+
+<1> In this example, a test/train split (`training_percent`) was defined for the
+{reganalysis}. This query limits evaluation to be performed on the train split
+only. It means that a training error will be calculated.
+<2> The field that contains the ground truth value for the actual student
+performance. This is required in order to evaluate results.
+<3> The field that contains the predicted value for student performance
+calculated by the {reganalysis}.
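Reviewer note: to make the two requested metrics concrete, here is a minimal Python sketch of the textbook definitions of mean squared error and R squared. It assumes the API reports these standard quantities; the diff itself does not spell out the formulas, and the function names and sample grades below are hypothetical and made up for illustration.

```python
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """1 minus (residual sum of squares / total sum of squares)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical G3 grades and predictions, invented for this illustration only.
actual_g3 = [10.0, 12.0, 14.0, 16.0]
predicted_g3 = [11.0, 12.0, 13.0, 16.0]

mse = mean_squared_error(actual_g3, predicted_g3)
r2 = r_squared(actual_g3, predicted_g3)
print(f"mean_squared_error={mse:.3f} r_squared={r2:.3f}")
```

A lower mean squared error and an R squared closer to 1 both indicate predictions that track the ground truth field more closely.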
+
+
+The next example calculates the testing error. The only difference compared with
+the previous example is that `ml.is_training` is set to `false` this time, so
+the query excludes the train split from the evaluation.
+
+[source,console]
+--------------------------------------------------
+POST _ml/data_frame/_evaluate
+{
+  "index": "student_performance_mathematics_reg",
+  "query": {
+    "term": {
+      "ml.is_training": {
+        "value": false <1>
+      }
+    }
+  },
+  "evaluation": {
+    "regression": {
+      "actual_field": "G3", <2>
+      "predicted_field": "ml.G3_prediction", <3>
+      "metrics": {
+        "r_squared": {},
+        "mean_squared_error": {}
+      }
+    }
+  }
+}
+--------------------------------------------------
+// TEST[skip:TBD]
+
+<1> In this example, a test/train split (`training_percent`) was defined for the
+{reganalysis}. This query limits evaluation to be performed on the test split
+only. It means that a testing error will be calculated.
+<2> The field that contains the ground truth value for the actual student
+performance. This is required in order to evaluate results.
+<3> The field that contains the predicted value for student performance
+calculated by the {reganalysis}.
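Reviewer note: the two evaluate requests in this file differ only in the `ml.is_training` value. A small Python sketch of a request-body builder makes that explicit. The helper name `build_regression_evaluation` is hypothetical and not part of any Elasticsearch client; the sketch only assembles the JSON body shown in the diff and leaves out sending it to `POST _ml/data_frame/_evaluate`.

```python
import json
from typing import Any, Dict


def build_regression_evaluation(
    index: str,
    actual_field: str,
    predicted_field: str,
    is_training: bool,
) -> Dict[str, Any]:
    """Assemble an evaluate request body restricted to one side of the split."""
    return {
        "index": index,
        # True selects the train split, False selects the test split.
        "query": {"term": {"ml.is_training": {"value": is_training}}},
        "evaluation": {
            "regression": {
                "actual_field": actual_field,
                "predicted_field": predicted_field,
                "metrics": {"r_squared": {}, "mean_squared_error": {}},
            }
        },
    }


# Field and index names are taken from the examples in the diff.
training_body = build_regression_evaluation(
    "student_performance_mathematics_reg", "G3", "ml.G3_prediction", True
)
testing_body = build_regression_evaluation(
    "student_performance_mathematics_reg", "G3", "ml.G3_prediction", False
)

print(json.dumps(testing_body, indent=2))
```

Comparing the metrics returned for the two bodies gives the training error and the testing error for the same model.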

docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc

Lines changed: 27 additions & 1 deletion
@@ -177,7 +177,7 @@ The API returns the following result:
 
 
 [[ml-put-dfanalytics-example-r]]
-===== {regression-cap} example
+===== {regression-cap} examples
 
 The following example creates the `house_price_regression_analysis`
 {dfanalytics-job}, the analysis type is `regression`:
@@ -235,3 +235,29 @@ The API returns the following result:
 // TESTRESPONSE[s/1567168659127/$body.$_path/]
 // TESTRESPONSE[s/"version": "8.0.0"/"version": $body.version/]
 
+
+The following example creates a job and specifies a training percent:
+
+[source,console]
+--------------------------------------------------
+PUT _ml/data_frame/analytics/student_performance_mathematics_0.3
+{
+  "source": {
+    "index": "student_performance_mathematics"
+  },
+  "dest": {
+    "index":"student_performance_mathematics_reg"
+  },
+  "analysis":
+  {
+    "regression": {
+      "dependent_variable": "G3",
+      "training_percent": 70 <1>
+    }
+  }
+}
+--------------------------------------------------
+// TEST[skip:TBD]
+
+<1> The `training_percent` defines the percentage of the data set that will be used
+for training the model.
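Reviewer note: as a rough illustration of what a 70 percent training split means, the Python sketch below flags a random 70 percent of document IDs as training data, analogous to the `ml.is_training` field queried in the evaluate examples. This is an assumption-laden sketch: the diff does not describe how Elasticsearch actually samples documents, and the function name, the exact-count sampling, and the fixed seed are all hypothetical choices made here.

```python
import random
from typing import Dict, Hashable, Iterable


def assign_training_flags(
    doc_ids: Iterable[Hashable],
    training_percent: float,
    seed: int = 0,
) -> Dict[Hashable, bool]:
    """Mark roughly training_percent of the documents as training data."""
    if not 0 < training_percent <= 100:
        raise ValueError("training_percent must be in the range (0, 100]")
    ids = list(doc_ids)
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    n_train = round(len(ids) * training_percent / 100)
    training_ids = set(rng.sample(ids, n_train))
    # True plays the role of ml.is_training = true, False of the test split.
    return {doc_id: doc_id in training_ids for doc_id in ids}


flags = assign_training_flags(range(100), training_percent=70)
n_training = sum(flags.values())
print(f"training docs: {n_training}, test docs: {len(flags) - n_training}")
```

With 100 documents and `training_percent` set to 70, the sketch leaves 30 documents for computing a testing error.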
