Skip to content

Conversation

@costin
Copy link
Member

@costin costin commented Jun 25, 2020

Introduce pipe support, in particular head and tail
(which can also be chained).

Replaces #58326

Introduce pipe support, in particular head and tail
(which can also be chained).
@costin costin added the :Analytics/EQL EQL querying label Jun 25, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (:Query Languages/EQL)

@elasticmachine elasticmachine added the Team:QL (Deprecated) Meta label for query languages team label Jun 25, 2020
@costin
Copy link
Member Author

costin commented Jun 25, 2020

The main idea behind this PR is to support head/tail through a limit (w/ offset). Further more when dealing with tail, data needs to be returned in descending format however since ES does not offer a search_before construct, this is implemented by making non-base queries in a sequence to be ASC and then looked at in reverse.

The code can be further polished, in particular the query side and progression of the keys (so its handling doesn't leak inside the ExecutionManager) however I opted to do that in a follow-up PR, when chasing internal pagination (as that's when the keys will be picked up by the same queries as it advances through the data).

costin added 2 commits June 25, 2020 20:58
Add Head and Tail to simplify verification and future identification of
said nodes.
1607252647000
]
],
"fields": {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrodewig is there a way to ignore the "fields" field? This data is used internally and it might be that in the future, we'll remove it from the final response (so only _source is included).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the filter_path query param to exclude fields from the output.
You can add the query param using a //TEST comment so it's not visible to users.

This should fail gracefully. The request won't return an error if fields is later removed from the response.

Here's an example with a basic query:

[source,console]
----
GET /sec_logs/_eql/search
{
  "query": """
    process where process.name == "cmd.exe"
  """
}
----
// TEST[s/search/search\?filter_path\=\-\*\.events\.\*fields/]

A sequence query will need to use a different filter because the response format is different.

[source,console]
----
GET /my_index/_eql/search
{
  "query": """
    sequence by agent.id
      [ file where file.name == "cmd.exe" and agent.id != "my_user" ]
      [ process where stringContains(process.path, "regsvr32") ]
  """
}
----
// TEST[s/search/search\?filter_path\=\-\*\.sequences\.events\.\*fields/]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason this didn't seem to work consistently so I left things in a bit of a mixed state, namely the fields tag is still in there but I also added the TESTRESPONSE; the idea being that the user should not depend on whether fields (or even sort) are in the response.
Can you please take a look at it after this PR gets merged?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing. Thanks!

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some comments.

private final HitExtractor tiebreakerExtractor;

// search after markers
private Object[] startMarker;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name these with the plural variant, since they are arrays?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had them as such but in case of a timestamp there's only one marker. Plus the fact that it's one or multiple fields is an implementation detail in the end.
Hence why talking a marker as an entity (which currently happens to be an array of objects) seems better.

tiebreaker = tiebreaker(hit);
}

return tiebreaker != null ? new Object[] { timestamp, tiebreaker } : new Object[] { timestamp };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the markers can only be a combination of timestamp and tiebreaker OR only the timestamp, maybe abstract these two variants away in a TiebreakerMarker and TimestampMarker or something around these lines?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the whole marker bit can be encapsulated in the Criterion itself, I plan to revisit this when working on the pagination.


@Override
public String toString() {
return key + "[" + timestamp + "][" + (tiebreaker != null ? Objects.toString(tiebreaker) : "") + "]";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the tiebreaker is not set, why still logging the square brackets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To indicate there is no tiebreaker - I find the message more consistent to be [..][] vs [..]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Myself, I wouldn't mind to have [] when there is no tiebreaker.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about specifying that there is no tiebreaker? (ie [...][no tiebreaker])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's quite verbose...

}
}

abstract static class QueryFoldingRule<SubPlan extends UnaryExec> extends FoldingRule<SubPlan> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being used anywhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Used it for the existing rules

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, left some minor comments, but additionally can we have some unit tests for the new optimizer rules: SortByLimit, PushDownOrderBy and SkipQueryOnLimitZero? (unless I missed something :) )


@Override
public String toString() {
return key + "[" + timestamp + "][" + (tiebreaker != null ? Objects.toString(tiebreaker) : "") + "]";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Myself, I wouldn't mind to have [] when there is no tiebreaker.

Attribute timestamp,
Attribute tiebreaker) {
this(source, combine(matches, until), combine(keys, singletonList(untilKeys)), timestamp, tiebreaker);
Attribute tiebreaker, OrderDirection direction) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: code formatting for consistency:

Suggested change
Attribute tiebreaker, OrderDirection direction) {
Attribute tiebreaker,
OrderDirection direction) {

this.attributes = attributes;
this.trackHits = trackHits;
this.includeFrozen = includeFrozen;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: empty line

@costin costin merged commit 4521ca3 into elastic:master Jun 27, 2020
@costin costin deleted the eql/head-tail branch June 27, 2020 06:08
costin added a commit that referenced this pull request Jun 27, 2020
Introduce pipe support, in particular head and tail
(which can also be chained).

(cherry picked from commit 4521ca3)
(cherry picked from commit d6731d6)
@costin costin mentioned this pull request Jul 2, 2020
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/EQL EQL querying Team:QL (Deprecated) Meta label for query languages team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants