Add new challenge to elastic/logs to benchmark the INSIST_🐔 esql command #801
martijnvg merged 12 commits into elastic:master
This controls whether all fields are mapped or whether almost all fields are unmapped.
I did an initial run of this new challenge: indexing throughput looks reasonable, but the
Did another run with the latest version (this time using 1 node and no replicas; the previous run used 3 nodes and 1 replica):
"operation": "chicken_1",
"clients": {{ p_search_clients }},
"warmup-iterations": {{ warmup_iterations | default(3) }},
"iterations": {{ iterations | default(5) }},
Lowered the default iterations and warmup_iterations significantly, given that the insist command is very slow; otherwise the benchmark takes multiple days to complete.
This will be restored when the performance of the insist command improves.
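The fallback behaviour of those two defaults can be sketched in Python. This is a minimal illustration, not Rally code: `resolve_iterations` is a hypothetical helper that only mimics how `| default(n)` applies when a track parameter is not supplied, and it assumes each client executes every warmup and measured iteration once.

```python
# Hypothetical helper: mimics the `| default(n)` fallback for the two
# track parameters shown in the diff. It is not part of Rally itself.
def resolve_iterations(track_params):
    warmup = track_params.get("warmup_iterations", 3)
    measured = track_params.get("iterations", 5)
    return {
        "warmup-iterations": warmup,
        "iterations": measured,
        # Assumption: each client runs the query once per warmup
        # iteration and once per measured iteration.
        "runs-per-client": warmup + measured,
    }


# No overrides: the lowered defaults from the diff apply.
print(resolve_iterations({}))

# Explicit overrides win over the defaults.
print(resolve_iterations({"warmup_iterations": 10, "iterations": 50}))
```

With a slow query, the runs-per-client figure is what drives total benchmark time, which is why lowering both defaults shortens the run.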
Note that the previous runs used the mapped variant, in which all fields were mapped. When running with the unmapped variant, the insist command is many orders of magnitude slower.
flash1293
left a comment
Left some smaller questions, but nothing blocking.
{
  "name": "chicken_4",
  "operation-type": "esql",
  "query": "FROM logs-* | INSIST_🐔 agent.hostname | EVAL col0 = COALESCE(agent.hostname, \"elasticsearch-ci-immutable-centos-7-1599241536066250344\") | WHERE col0 == \"elasticsearch-ci-immutable-centos-7-1599241536066250344\""
Why are you coalescing with \"elasticsearch-ci-immutable-centos-7-1599241536066250344\" here? Shouldn't it be FROM logs-* | INSIST_🐔 agent.hostname | EVAL col0 = COALESCE(agent.hostname, \"\") | WHERE col0 == \"elasticsearch-ci-immutable-centos-7-1599241536066250344\"?
Not sure, I think this is what I copied from: https://github.com/elastic/streams-program/discussions/320#discussioncomment-13583964
In any case, I will update this.
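The difference between the two fallbacks can be illustrated with a small Python emulation. This is only a sketch: `coalesce` and `matching_rows` are hypothetical stand-ins for the EVAL and WHERE steps, the three rows are made-up sample data, and a missing `agent.hostname` is assumed to behave like null.

```python
TARGET = "elasticsearch-ci-immutable-centos-7-1599241536066250344"


def coalesce(*values):
    """Return the first value that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


def matching_rows(rows, fallback):
    # Emulates: EVAL col0 = COALESCE(agent.hostname, fallback)
    #           | WHERE col0 == TARGET
    return [
        row for row in rows
        if coalesce(row.get("agent.hostname"), fallback) == TARGET
    ]


# Made-up sample rows for illustration only.
rows = [
    {"agent.hostname": TARGET},
    {"agent.hostname": "other-host"},
    {},  # document without agent.hostname
]

# Falling back to the target value also matches rows that lack the field.
print(len(matching_rows(rows, TARGET)))  # 2

# Falling back to "" matches only rows that really carry the target value.
print(len(matching_rows(rows, "")))  # 1
```

Under these assumptions, the empty-string fallback keeps the filter selective, which is the behaviour the review question points towards.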
{
  "name": "chicken_2",
  "operation-type": "esql",
  "query": "FROM logs-* | INSIST_🐔 event.dataset,kubernetes.container.image | EVAL col0 = COALESCE(kubernetes.container.image, \"\") | WHERE col0 != \"\" | STATS col1 = COUNT() BY col0,data_stream.dataset"
Why do we insist on event.dataset here?
Good point. That got in by mistake. I will update.
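One way to picture the corrected operation is a small Python builder that insists only on the fields the query actually needs. This is a hedged sketch: `build_insist_operation` is a hypothetical helper that does not exist in the track, and it assumes the fix simply drops event.dataset from the INSIST_🐔 list while leaving the rest of the query unchanged.

```python
import json


def build_insist_operation(name, insisted_fields, pipeline):
    """Assemble a Rally-style esql operation dict (hypothetical helper)."""
    stages = ["FROM logs-*", "INSIST_🐔 " + ",".join(insisted_fields)]
    stages.extend(pipeline)
    return {
        "name": name,
        "operation-type": "esql",
        "query": " | ".join(stages),
    }


# Assumed shape of chicken_2 after the fix: only the field used by
# COALESCE is insisted on.
op = build_insist_operation(
    "chicken_2",
    ["kubernetes.container.image"],
    [
        'EVAL col0 = COALESCE(kubernetes.container.image, "")',
        'WHERE col0 != ""',
        "STATS col1 = COUNT() BY col0,data_stream.dataset",
    ],
)

print(json.dumps(op, indent=2, ensure_ascii=False))
```

Keeping the insisted fields as an explicit list makes it easier to spot a field, such as event.dataset, that no later stage references.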
flash1293
left a comment
LGTM. Maybe we should leave a note somewhere that the coalesce calls can be removed once elastic/elasticsearch#130220 is fixed. Otherwise people will wonder where they come from.
Successful run based on the most recent commit. This ran with one node and can be compared with: #801 (comment)
@martijnvg
Backporting entails:
Thank you!
Also add a mapping track parameter, which controls whether all fields are mapped or whether almost all fields are unmapped. This allows benchmarking elastic/logs in an unmapped context with the experimental INSIST_🐔 esql command.