Skip to content

Conversation

@droberts195
Copy link

Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.

Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

args[3] == null ? null : TimeValue.parseTimeValue((String) args[3], DataFrameField.FREQUENCY.getPreferredName());
if (lenient == false && frequency != null) {
if (frequency.compareTo(MIN_FREQUENCY) < 0) {
throw new IllegalArgumentException(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of performing this validation in the parser. The strict parser is used in PUT and preview PutDataFrameTransformAction.Request has a validate method as does PreviewDataFrameTransformAction.Request I think it's better to have all the validation in one place.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I move the validation to the validate method of PutDataFrameTransformAction.Request.

PreviewDataFrameTransformAction.Request's validate method was not validating nearly as much, so I left it out of there. I guess that raises a wider question of whether preview should be validating parts of the config that aren't used in any way when previewing.

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lcawl
Copy link
Contributor

lcawl commented Jul 9, 2019

Will the frequency setting also appear in the "get data frame transforms" API response? If so, we'll also need to update the example results in https://www.elastic.co/guide/en/elasticsearch/reference/master/get-data-frame-transform.html and add it to the properties here: https://www.elastic.co/guide/en/elasticsearch/reference/master/data-frame-transform-resource.html

@hendrikmuhs
Copy link

hendrikmuhs commented Jul 10, 2019

Config

By adding the frequency at the top-level we are somewhat limiting ourselves. I think we need more level of control in the near future, e.g. a cron like schedule (like rollups for example), randomization interval, control of retry.

I suggest to nest frequency under a schedule object:

{
    "source" : { ... },
    ...
    "schedule": {
        "frequency" : "1m"
    }
...
}

Doing this re-arrangement can be done later but causes BWC, that's why I think we should do it properly now. I do not think this is premature.

Documentation

The frequency also controls the retry-logic, I think this is a current limitation and we should decouple retry from frequency in the future. But for now frequency and retry are connected.

Let's document as a limitation that frequency controls retry.

(The connection with retry is the reason why we limit max frequency to 1h)

@jpountz jpountz added v7.3.1 and removed v7.3.0 labels Jul 10, 2019
@jpountz
Copy link
Contributor

jpountz commented Jul 10, 2019

@droberts195 I labelled it v7.3.1 rather than v7.3.0 since it will only be included in 7.3 if there is a respin. Furthermore I wonder whether this should be included in 7.3 at all since this isn't a bug fix (it is possible that I'm missing context and that this change is important for the usability of data frames).

@droberts195
Copy link
Author

Furthermore I wonder whether this should be included in 7.3 at all since this isn't a bug fix

Yes, the problem is that continuous data frames is already merged to 7.3 and hardcoding the frequency with which it continuously checks for updates at 10 seconds causes problems which didn't occur when that 10 second frequency was just about checking when to retry in 7.2. So arguably this is a bug fix in that it's making continuous data frames fit for release. But it's only a bug fix from the perspective of somebody moving from 7.3.0 BC1 to 7.3.0 BC3/4/whatever. From a release notes perspective it makes sense to document the change in the enhancements section because from a user viewpoint this is new to 7.3. Kibana actually has separate labels for the type of change and which section of the release notes the change should be listed in, although obviously that means twice as many labels so it's not a silver bullet...

@droberts195 droberts195 merged commit b89c3c9 into elastic:master Jul 10, 2019
@droberts195 droberts195 deleted the configurable_frequency branch July 10, 2019 08:35
droberts195 pushed a commit that referenced this pull request Jul 10, 2019
…#44120)

Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.
droberts195 pushed a commit that referenced this pull request Jul 10, 2019
…#44120)

Previously a data frame transform would check whether the
source index was changed every 10 seconds. Sometimes it
may be desirable for the check to be done less frequently.
This commit increases the default to 60 seconds but also
allows the frequency to be overridden by a setting in the
data frame transform config.
@jpountz jpountz added v7.3.0 and removed v7.3.1 labels Jul 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants