[ML] data frame, adding builder classes for complex config classes #41638
Pinging @elastic/ml-core
        return Strings.toString(this, true, true);
    }

    public static Builder builder() {
Do you think it would make sense to make DataFrameTransformConfig constructor private now that we have "builder()" as a recommended entry point to this class? The same question for other classes...
That is a good thought. How about "package private" so that tests can still utilize it?
Package-private SGTM
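To illustrate the outcome of this thread, here is a minimal sketch of a config class whose constructor is package-private and whose recommended entry point is `builder()`. `ExampleTransformConfig` and its fields are hypothetical stand-ins, not the actual `DataFrameTransformConfig`, and the top-level class is left non-public only so the snippet fits in a single file.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for a config class such as DataFrameTransformConfig.
final class ExampleTransformConfig {
    private final String id;
    private final String sourceIndex;
    private final String destIndex;

    // Package-private (no modifier): tests in the same package can still call it,
    // while code in other packages has to go through builder().
    ExampleTransformConfig(String id, String sourceIndex, String destIndex) {
        this.id = Objects.requireNonNull(id, "id");
        this.sourceIndex = Objects.requireNonNull(sourceIndex, "sourceIndex");
        this.destIndex = Objects.requireNonNull(destIndex, "destIndex");
    }

    public static Builder builder() {
        return new Builder();
    }

    public String getId() { return id; }
    public String getSourceIndex() { return sourceIndex; }
    public String getDestIndex() { return destIndex; }

    public static final class Builder {
        private String id;
        private String sourceIndex;
        private String destIndex;

        public Builder setId(String id) { this.id = id; return this; }
        public Builder setSourceIndex(String sourceIndex) { this.sourceIndex = sourceIndex; return this; }
        public Builder setDestIndex(String destIndex) { this.destIndex = destIndex; return this; }

        // Validation lives in the constructor, so it applies to tests and builder users alike.
        public ExampleTransformConfig build() {
            return new ExampleTransformConfig(id, sourceIndex, destIndex);
        }
    }
}
```

A caller outside the package would write `ExampleTransformConfig.builder().setId("t1").setSourceIndex("src").setDestIndex("dst").build()`, while a test in the same package can still invoke the constructor directly.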
            return this;
        }

        public Builder setInterval(long interval) {
Should this method accept "Duration" instead of "long"?
I was trying to make this parallel DateHistogramAggregationBuilder, though I should add docs stating the time resolution of the interval.
Yeah, if we talk about usability here, it is usually error-prone if the user has to provide "long" but does not know what the unit is.
@przemekwitek I added some docs and a new method that accepts a TimeValue parameter as well.
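The resolution described above (a documented `long` setter plus a unit-carrying overload) might look roughly like the following. This is only a sketch: `ExampleDateHistogramSource` is a hypothetical class, `java.time.Duration` stands in for the Elasticsearch `TimeValue` type so the snippet stays self-contained, and milliseconds as the unit of the `long` is an assumption made for the example.

```java
import java.time.Duration;
import java.util.Objects;

// Hypothetical stand-in for DateHistogramGroupSource; not the real class.
final class ExampleDateHistogramSource {
    private final String field;
    private final long intervalMillis;

    ExampleDateHistogramSource(String field, long intervalMillis) {
        this.field = Objects.requireNonNull(field, "field");
        this.intervalMillis = intervalMillis;
    }

    public String getField() { return field; }
    public long getIntervalMillis() { return intervalMillis; }

    public static Builder builder() {
        return new Builder();
    }

    public static final class Builder {
        private String field;
        private long intervalMillis;

        public Builder setField(String field) { this.field = field; return this; }

        /**
         * Sets the interval as a raw number.
         * @param intervalMillis the interval in milliseconds (unit assumed for this sketch)
         */
        public Builder setInterval(long intervalMillis) {
            if (intervalMillis <= 0) {
                throw new IllegalArgumentException("interval must be positive");
            }
            this.intervalMillis = intervalMillis;
            return this;
        }

        /** Unit-safe overload: the caller never needs to know the underlying resolution. */
        public Builder setInterval(Duration interval) {
            return setInterval(interval.toMillis());
        }

        public ExampleDateHistogramSource build() {
            return new ExampleDateHistogramSource(field, intervalMillis);
        }
    }
}
```

With the second overload, `setInterval(Duration.ofMinutes(1))` states its unit at the call site, whereas `setInterval(60000)` relies on the reader having seen the Javadoc.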
        }

        public DateHistogramGroupSource build() {
            DateHistogramGroupSource groupSource = new DateHistogramGroupSource(field);
Introducing a Builder is a great opportunity to make the data class itself (here, DateHistogramGroupSource) immutable. Do you plan on making it so, or do you find such an approach too restrictive?
We totally should do this. I will see if I can make them final.
        private final Map<String, SingleGroupSource> groups;

        public Builder() {
            this.groups = new HashMap<>();
I think you could move initialization to line 184.
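As a sketch of this suggestion, the map can be initialized where the field is declared, which makes the explicit `Builder()` constructor unnecessary. The example also shows one way the built object could be kept immutable, in line with the immutability comment earlier in the review. `ExampleGroupConfig` is hypothetical, and plain `String` values replace `SingleGroupSource` to keep the snippet self-contained.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a group config that holds named group sources.
final class ExampleGroupConfig {
    private final Map<String, String> groups;

    ExampleGroupConfig(Map<String, String> groups) {
        // Defensive copy wrapped as unmodifiable: later changes to the
        // builder's map cannot leak into an already built config.
        this.groups = Collections.unmodifiableMap(new HashMap<>(groups));
    }

    public Map<String, String> getGroups() { return groups; }

    public static Builder builder() {
        return new Builder();
    }

    public static final class Builder {
        // Initialized at the declaration, so no explicit constructor is needed.
        private final Map<String, String> groups = new HashMap<>();

        public Builder groupBy(String name, String field) {
            groups.put(name, field);
            return this;
        }

        public ExampleGroupConfig build() {
            return new ExampleGroupConfig(groups);
        }
    }
}
```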
            return this;
        }

        public Builder setInterval(double interval) {
Same question, should we use "Duration" here?
That does not make sense here, as the interval in this case is an arbitrary number. Any numeric field can be grouped via a histogram, not just a time field.
Ah, OK. So this was my misunderstanding (I associated "interval" with a time duration).
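To make the distinction concrete: for a numeric histogram the interval is simply a bucket width on the number line. The helper below is a hypothetical illustration of the usual bucketing rule (round the value down to the nearest multiple of the interval), ignoring any offset; it is not code from this PR.

```java
// Hypothetical helper, not part of the PR.
final class NumericHistogramExample {
    private NumericHistogramExample() {}

    // Returns the lower bound of the bucket containing value,
    // for buckets of width interval starting at zero.
    static double bucketKey(double value, double interval) {
        if (!(interval > 0)) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return Math.floor(value / interval) * interval;
    }
}
```

For example, with an interval of `5.0` a value of `23.7` falls into the bucket keyed `20.0`. No time unit is involved, which is why a `double` is the natural parameter type here.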
przemekwitek left a comment
Added a couple of inline comments.
-    public DataFrameTransformConfig(final String id,
+    DataFrameTransformConfig(final String id,
                                    final SourceConfig source,
Please fix indentation
        return Objects.hash(field, interval);
    }

    public static Builder builder() {
For some reason, I cannot leave a comment on line 55, so I am leaving it here: please make the constructor package-private.
przemekwitek left a comment
LGTM
I value immutability, so I believe this PR is a step in the right direction.
…lastic#41638)
* [ML] data frame, adding builder classes for complex config classes
* Addressing PR comments, adding some java docs
* cleaning up constructor
* fixing indentation
* change constructors to be package-private
There are many ways to make the use of data frames simpler through the HLRC.
A step in the right direction is providing fluent builders for each of the more complicated configs.