This was found on a recent 7.7 build. I created a classification analysis that became stuck in the writing_results phase.
[2020-03-17T12:58:57,528][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [reba.lan] [openml-kr-vs-kp-classifier-0] [data_frame_analyzer/8755] [CBoostedTreeImpl.cc@241] Training finished after 18 iterations. Time per iteration in ms mean: 1287.84 std. dev: 2697.19
[2020-03-17T12:58:57,626][INFO ][o.e.x.m.d.p.AnalyticsResultProcessor] [reba.lan] [openml-kr-vs-kp-classifier-0] Started writing results
[2020-03-17T12:58:57,882][INFO ][o.e.c.m.MetaDataMappingService] [reba.lan] [openml-kr-vs-kp-classified-0/h_cT5mm3QSWfTv6b3ZBlTQ] update_mapping [_doc]
[2020-03-17T12:58:58,149][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] [openml-kr-vs-kp-classifier-0] failed to index after [1] attempts. Will attempt again in [50ms].
[2020-03-17T12:58:58,361][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] [openml-kr-vs-kp-classifier-0] failed to index after [2] attempts. Will attempt again in [75ms].
[2020-03-17T12:58:58,570][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] [openml-kr-vs-kp-classifier-0] failed to index after [3] attempts. Will attempt again in [276ms].
...lots more retries here...
[2020-03-17T13:16:19,080][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] [openml-kr-vs-kp-classifier-0] failed to index after [15] attempts. Will attempt again in [846433ms]
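To see how the retry delay grows across attempts, the warnings above can be summarised with a short script. This is a minimal sketch for triage only: the regular expression is written against the message format quoted in this issue, and `parse_retries` and `SAMPLE_LINES` are hypothetical names, not part of Elasticsearch.

```python
import re

# Matches the ResultsPersisterService warnings quoted in this issue.
# The pattern is an assumption based on those lines, not an official format.
RETRY_RE = re.compile(
    r"\[(?P<job>[^\]]+)\] failed to index after \[(?P<attempts>\d+)\] attempts\. "
    r"Will attempt again in \[(?P<delay_ms>\d+)ms\]"
)

# A few lines copied from the log excerpt above (one non-matching INFO line included).
SAMPLE_LINES = [
    "[2020-03-17T12:58:57,626][INFO ][o.e.x.m.d.p.AnalyticsResultProcessor] [reba.lan] "
    "[openml-kr-vs-kp-classifier-0] Started writing results",
    "[2020-03-17T12:58:58,149][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] "
    "[openml-kr-vs-kp-classifier-0] failed to index after [1] attempts. "
    "Will attempt again in [50ms].",
    "[2020-03-17T12:58:58,361][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] "
    "[openml-kr-vs-kp-classifier-0] failed to index after [2] attempts. "
    "Will attempt again in [75ms].",
    "[2020-03-17T12:58:58,570][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] "
    "[openml-kr-vs-kp-classifier-0] failed to index after [3] attempts. "
    "Will attempt again in [276ms].",
]


def parse_retries(lines):
    """Return (job_id, attempts, delay_ms) for every retry warning in lines."""
    retries = []
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            retries.append(
                (
                    match.group("job"),
                    int(match.group("attempts")),
                    int(match.group("delay_ms")),
                )
            )
    return retries


if __name__ == "__main__":
    for job, attempts, delay_ms in parse_retries(SAMPLE_LINES):
        print(f"{job}: attempt {attempts}, next retry in {delay_ms} ms")
```

Running it over the full node log would show each attempt alongside the delay scheduled after it.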
Stopping the job via the UI removed it from the jobs list, but the job remains in the stopping state. The retries continue even after the job was stopped:
[2020-03-17T13:44:24,643][INFO ][o.e.x.m.a.TransportStopDataFrameAnalyticsAction] [reba.lan] [openml-kr-vs-kp-classifier-0] Stopping task with force [true]
[2020-03-17T13:44:24,668][INFO ][o.e.x.m.a.TransportStopDataFrameAnalyticsAction] [reba.lan] [openml-kr-vs-kp-classifier-0] Stopping task with force [true]
[2020-03-17T13:44:24,669][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [reba.lan] [controller/6507] [CDetachedProcessSpawner.cc@177] Child process with PID 8755 was terminated by signal 15
[2020-03-17T13:44:24,670][ERROR][o.e.x.m.p.l.CppLogMessageHandler] [reba.lan] [controller/6507] [CDetachedProcessSpawner.cc@99] Will not attempt to kill process 8755: not a child process
[2020-03-17T13:44:24,670][ERROR][o.e.x.m.p.l.CppLogMessageHandler] [reba.lan] [controller/6507] [CCommandProcessor.cc@96] Failed to kill process with PID 8755
[2020-03-17T13:45:13,300][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] [openml-kr-vs-kp-classifier-0] failed to index after [17] attempts. Will attempt again in [884782ms]
.
.
.
[2020-03-17T14:14:48,971][WARN ][o.e.x.m.u.p.ResultsPersisterService] [reba.lan] [openml-kr-vs-kp-classifier-0] failed to index after [19] attempts. Will attempt again in [850734ms]
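The logs above show retries still being scheduled after the stop request. They do not establish why, but one way this can happen is a retry loop that never consults the task's state while it sleeps between attempts. The sketch below illustrates the general technique of a backoff loop that checks a cancellation flag and waits interruptibly. It is an illustration only, not the `ResultsPersisterService` implementation; `retry_with_backoff`, its parameters, and the delay bounds are all hypothetical.

```python
import random
import threading


def retry_with_backoff(action, cancelled, min_delay_ms=50, max_delay_ms=900_000,
                       max_attempts=20):
    """Call action() until it succeeds, the caller cancels, or attempts run out.

    cancelled is a threading.Event that a "stop" request would set.
    Returns True on success, False if cancelled or exhausted.
    """
    delay_ms = min_delay_ms
    for _ in range(max_attempts):
        # Check for a stop request before every attempt.
        if cancelled.is_set():
            return False
        try:
            action()
            return True
        except Exception:
            # Sketch only: real code would retry only on retryable failures.
            pass
        # Wait on the event rather than sleeping, so a stop request
        # interrupts even a very long backoff immediately.
        wait_ms = random.uniform(delay_ms / 2, delay_ms)
        if cancelled.wait(timeout=wait_ms / 1000.0):
            return False
        # Exponential growth, capped at max_delay_ms.
        delay_ms = min(delay_ms * 2, max_delay_ms)
    return False
```

With this shape, setting the event from the stop handler ends the loop without waiting out the current delay, which matters when the delay has grown to many minutes as in the log lines above.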
job config:
{
  "id" : "openml-kr-vs-kp-classifier-0",
  "source" : {
    "index" : [
      "openml-kr-vs-kp"
    ],
    "query" : {
      "match_all" : { }
    }
  },
  "dest" : {
    "index" : "openml-kr-vs-kp-classified-0",
    "results_field" : "ml"
  },
  "analysis" : {
    "classification" : {
      "dependent_variable" : "class",
      "class_assignment_objective" : "maximize_accuracy",
      "num_top_classes" : 2,
      "prediction_field_name" : "class_prediction",
      "training_percent" : 90.0,
      "randomize_seed" : 7077816937788972687
    }
  },
  "model_memory_limit" : "512mb",
  "create_time" : 1584464253331,
  "version" : "7.7.0",
  "allow_lazy_start" : false
}