-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[ML-DataFrame] Add a frequency option to transform config, default 1m #44120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML-DataFrame] Add a frequency option to transform config, default 1m #44120
Conversation
Previously a data frame transform would check whether the source index was changed every 10 seconds. Sometimes it may be desirable for the check to be done less frequently. This commit increases the default to 60 seconds but also allows the frequency to be overridden by a setting in the data frame transform config.
|
Pinging @elastic/ml-core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| args[3] == null ? null : TimeValue.parseTimeValue((String) args[3], DataFrameField.FREQUENCY.getPreferredName()); | ||
| if (lenient == false && frequency != null) { | ||
| if (frequency.compareTo(MIN_FREQUENCY) < 0) { | ||
| throw new IllegalArgumentException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a huge fan of performing this validation in the parser. The strict parser is used in PUT and preview PutDataFrameTransformAction.Request has a validate method as does PreviewDataFrameTransformAction.Request I think it's better to have all the validation in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I move the validation to the validate method of PutDataFrameTransformAction.Request.
PreviewDataFrameTransformAction.Request's validate method was not validating nearly as much, so I left it out of there. I guess that raises a wider question of whether preview should be validating parts of the config that aren't used in any way when previewing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Will the frequency setting also appear in the "get data frame transforms" API response? If so, we'll also need to update the example results in https://www.elastic.co/guide/en/elasticsearch/reference/master/get-data-frame-transform.html and add it to the properties here: https://www.elastic.co/guide/en/elasticsearch/reference/master/data-frame-transform-resource.html |
|
Config By adding the frequency at the top-level we are somewhat limiting ourselves. I think we need more level of control in the near future, e.g. a cron like schedule (like rollups for example), randomization interval, control of retry. I suggest to nest Doing this re-arrangement can be done later but causes BWC, that's why I think we should do it properly now. I do not think this is premature. Documentation The Let's document as a limitation that (The connection with retry is the reason why we limit max frequency to 1h) |
|
@droberts195 I labelled it |
Yes, the problem is that continuous data frames is already merged to 7.3 and hardcoding the frequency with which it continuously checks for updates at 10 seconds causes problems which didn't occur when that 10 second frequency was just about checking when to retry in 7.2. So arguably this is a bug fix in that it's making continuous data frames fit for release. But it's only a bug fix from the perspective of somebody moving from 7.3.0 BC1 to 7.3.0 BC3/4/whatever. From a release notes perspective it makes sense to document the change in the enhancements section because from a user viewpoint this is new to 7.3. Kibana actually has separate labels for the type of change and which section of the release notes the change should be listed in, although obviously that means twice as many labels so it's not a silver bullet... |
…#44120) Previously a data frame transform would check whether the source index was changed every 10 seconds. Sometimes it may be desirable for the check to be done less frequently. This commit increases the default to 60 seconds but also allows the frequency to be overridden by a setting in the data frame transform config.
…#44120) Previously a data frame transform would check whether the source index was changed every 10 seconds. Sometimes it may be desirable for the check to be done less frequently. This commit increases the default to 60 seconds but also allows the frequency to be overridden by a setting in the data frame transform config.
Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.