[ML] Data frame source index seems to be fully re-read when continuous transform moves nodes #43662

In a rolling upgrade or node failure scenario, a continuous transform seems to re-read the entire source index when it is moved to a new node. I think the last checkpoint's timestamp information may be lost when the transform starts executing again on its new node.
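
To illustrate the suspicion, here is a minimal sketch of how a lost checkpoint timestamp would turn an incremental read into a full one. This is not the Elasticsearch implementation: `Checkpoint`, `docs_to_read`, and the fallback to a lower bound of `0` are all hypothetical names and assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Checkpoint:
    # Hypothetical stand-in for the persisted checkpoint state.
    timestamp_millis: int


def docs_to_read(doc_timestamps: List[int],
                 last_checkpoint: Optional[Checkpoint],
                 now_millis: int) -> List[int]:
    """Return the timestamps of documents a continuous run would read.

    Assumption: the run reads documents with
    lower_bound <= timestamp < now_millis, where lower_bound comes from
    the last checkpoint. If the checkpoint is missing, the lower bound
    falls back to 0 and every document is selected again.
    """
    lower_bound = last_checkpoint.timestamp_millis if last_checkpoint is not None else 0
    return [t for t in doc_timestamps if lower_bound <= t < now_millis]


docs = [100, 200, 300, 400]

# Checkpoint survived the move: only the document after it is read.
incremental = docs_to_read(docs, Checkpoint(timestamp_millis=350), now_millis=500)

# Checkpoint lost on the new node: the whole index is read again.
full_reread = docs_to_read(docs, None, now_millis=500)
```

If something like this is happening, the documents-read counter would grow by the full index size after every node change, which matches what I observed.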


I saw this behavior by comparing the transform stats before and after the continuous transform moved between nodes. The documents-read count increased by the number of docs present in the source index each time the transform changed nodes.
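
The check I did by hand can be sketched as below. The function name and the sample numbers are made up for illustration; the snapshots are assumed to be the documents-read counter taken from the transform stats before and after each node move.

```python
from typing import List


def reread_multiples(docs_read_snapshots: List[int], index_doc_count: int) -> List[float]:
    """For each consecutive pair of stats snapshots, express the growth of
    the documents-read counter as a multiple of the index size.

    A value of 1.0 means the counter grew by exactly one full pass over
    the index between the two snapshots.
    """
    if index_doc_count <= 0:
        raise ValueError("index_doc_count must be positive")
    deltas = [after - before
              for before, after in zip(docs_read_snapshots, docs_read_snapshots[1:])]
    return [delta / index_doc_count for delta in deltas]


# Illustrative numbers only: an index of 1000 docs and a counter sampled
# before the first move, after the first move, and after the second move.
multiples = reread_multiples([1000, 2000, 3000], index_doc_count=1000)
```

A multiple of 1.0 after every move is the pattern described above. Note that this only confirms the counter jump; it does not tell whether the documents were really read again or only counted again.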

It could be either:
* The stats are being gathered incorrectly (the docs are not actually being read again).
* The entire index is being read again when the task restarts.

`DataFrameTransformPersistentTasksExecutor`, and in particular how it reads in checkpoint information, is suspect.
