Backport/8.15/pr 675#694

Closed
achuguy wants to merge 46 commits into elastic:master from achuguy:backport/8.15/pr-675
Conversation

@achuguy
Contributor

@achuguy achuguy commented Oct 17, 2024

Backport #675 to 8.15

martijnvg and others added 30 commits June 28, 2024 16:58
…o elastic/logs track (elastic#622) (elastic#625)

Backporting elastic#622 to the 8.15 branch. Otherwise rally nightly and esbench will not pick the change up. The version of Elasticsearch main is 8.15.0-SNAPSHOT and therefore rally nightly / esbench will use the rally track's 8.15 branch.

These searches are based on the searches executed by the search/discovery search challenge that fetch the top documents. They have no query and fetch 100, 500 and 1000 documents. The source is fetched using the field fetch feature, as in the searches that execute as part of the search/discovery workflow.

The problem is that we see combined latency and service time for the search/discovery challenge, not for each search executed as part of it (it is a composite operation).

By adding these searches we can get the latency and service time of searches that specifically fetch _source. This information is useful for the logsdb effort. Synthetic source makes fetching _source more expensive, but currently we can't inspect the impact more closely (since the search/discovery search challenge reports latency / service time for multiple operations).
Update integrations for elastic/logs to 8.13.3
…ck (elastic#640) (elastic#641)

The index_mode parameter will be used to run Rally benchmarks comparing
indexing in standard and logsdb mode for the elastic/security track.

LogsDB is enabled by means of a component template which is added and
later used if index_mode is provided. If it is missing, no index mode
is set, so the index defaults to standard.
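
The conditional behaviour described in this commit message can be sketched roughly as below. This is only an illustration, not the track's actual code: the helper name `index_mode_component_template` is hypothetical, and only the `index_mode` parameter and the "missing means standard" default come from the message.

```python
from typing import Optional


def index_mode_component_template(index_mode: Optional[str]) -> Optional[dict]:
    """Return a component template body that sets index.mode, or None.

    Hypothetical helper: when index_mode is not provided, no template is
    produced, so Elasticsearch falls back to its default (standard) mode.
    """
    if not index_mode:
        return None
    # Component template bodies nest their settings under "template".
    return {"template": {"settings": {"index": {"mode": index_mode}}}}
```

With `index_mode="logsdb"` the helper yields a template that can be referenced from an index template; with no value it yields nothing to install.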
…lastic#643)

The fleet component template may be in use when we try to delete it.
Here we introduce a parameter that allows us to skip deletion
of the component template. The default value is false, which
means we normally attempt to delete it. Setting it explicitly
to true avoids deleting it. This prevents the errors that occur
when we try to delete the template while it is in use.
Serverless deployments lack ILM. As a result, component templates
should not use the lifecycle setting. Here we introduce a setting
which allows us to exclude the lifecycle setting using either the
`lifecycle` parameter or the `build_flavor` parameter. This mimics
what we already do for the elastic/logs track.
Backport to 8.15:

- Keep array source in logsdb mode (655)
* Parameterise timeout

* Update README.md

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
…tic#664)

From ES v8.14 the default index type for dense_vectors is int8_hnsw.
This modifies our rally tracks to reflect it.
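
For illustration, a `dense_vector` mapping that spells out the index type explicitly might look like the sketch below. The helper name `dense_vector_mapping` and the `dims` value in the test are made up for the example; the real track files may express this differently.

```python
def dense_vector_mapping(dims: int, index_type: str = "int8_hnsw") -> dict:
    """Build a dense_vector field mapping with an explicit index type.

    Hypothetical helper: the default mirrors the int8_hnsw default that the
    commit message says applies from ES v8.14.
    """
    return {
        "type": "dense_vector",
        "dims": dims,
        "index": True,
        "index_options": {"type": index_type},
    }
```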
`copy_to` is used to copy from `kubernetes.event.message` to `message`.
Now it is supported in Elasticsearch 8.15 and we can benchmark the security
track including it. We also remove a parameter which was used to run a modified
workflow, which was using `kubernetes.event.message` instead of `message`.
This PR changes the security track so that we can enable LogsDB
in index templates. Note that the failure store is only available in serverless,
so we gate its usage and exclude it when the deployment is not serverless.

For LogsDB testing we rely on Kibana to install all other component/composable
templates. This is to make sure we need limited changes to the Rally track.

While testing this new configuration we discovered that installation of (component)
templates by Kibana in Serverless only happens when a user interacts with it.
This means (component) templates are not installed and the `elastic/security` track
execution fails as a result of using (component) templates that do not exist.
This backports elastic#672 to the 8.15 branch.

* `enable_logsdb` (default: false): Determines whether the logsdb index mode gets used. If set, index sorting is configured to use only the `@timestamp` field and the `source_enabled` parameter has no effect.
* `force_merge_max_num_segments` (default: unset): An integer specifying the maximum number of segments the force-merge operation should use.
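
A rough sketch of how these two parameters could be interpreted follows. The function names and the exact dictionary layout are assumptions made for the example; only the parameter names, their defaults and the described effects come from the list above.

```python
from typing import Optional


def build_index_body(enable_logsdb: bool = False, source_enabled: bool = True) -> dict:
    """Hypothetical helper turning the track parameters into an index body."""
    if enable_logsdb:
        # In logsdb mode sorting uses only @timestamp and source_enabled is ignored.
        return {
            "settings": {"index.mode": "logsdb", "index.sort.field": ["@timestamp"]},
            "mappings": {},
        }
    return {"settings": {}, "mappings": {"_source": {"enabled": source_enabled}}}


def force_merge_params(force_merge_max_num_segments: Optional[int] = None) -> dict:
    """Hypothetical helper building the parameters of a force-merge operation."""
    params = {"operation-type": "force-merge"}
    if force_merge_max_num_segments is not None:
        # Only pass a segment limit when the track parameter is set.
        params["max-num-segments"] = int(force_merge_max_num_segments)
    return params
```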
…astic#679)

If the `host.name` field does not exist, indices created as backing indices of a data stream
are injected with empty values of `host.name`. Sorting on `host.name` and `@timestamp`
then results in sorting on `@timestamp` only. Looking at some mappings I see that a `host.hostname`
field exists. Also, a cardinality aggregation returns hundreds of distinct values, which suggests
the field is not empty.

We would like to test with a meaningful combination of fields to sort on. We expect
better benchmark results, although other, more effective combinations of
fields might exist. In any case, we are interested in changes over time **given a valid set of fields
to sort on**.
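
As a sketch of what sorting on a populated field could look like, the snippet below builds index sort settings from a list of fields. It assumes that `host.hostname` and `@timestamp` are the chosen fields, which the message suggests but does not state outright; the helper name `index_sort_settings` is hypothetical.

```python
from typing import Sequence


def index_sort_settings(fields: Sequence[str] = ("host.hostname", "@timestamp")) -> dict:
    """Build index sort settings for the given fields, in order.

    Hypothetical helper: the default field list is an assumption based on the
    commit message, not a confirmed value from the track.
    """
    if not fields:
        raise ValueError("at least one sort field is required")
    return {"index.sort.field": list(fields)}
```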

(cherry picked from commit 0ca00a0)
(cherry picked from commit 3ae3304)

Co-authored-by: Gareth Ellis <gareth.ellis@elastic.co>
…tic#682) (elastic#684)

This PR introduces a new track parameter, `synthetic_source_keep`, which is used to control the
behaviour of synthetic source for all field types. It can have the values `none`, `arrays` or `all` (`all`
cannot be used when the setting is applied at index level).
See elastic/elasticsearch#112706 to understand the effect of each value.

Later on we will use this to change the behaviour in our nightlies and run benchmarks on both `elastic/logs`
and `elastic/security` using value `arrays`.
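
The value constraints described above could be checked along these lines. This is a minimal sketch: the helper name and the `index_level` flag are hypothetical, and the setting name `index.mapping.synthetic_source_keep` is taken from the elastic#683 backport message on this page.

```python
ALLOWED_VALUES = ("none", "arrays", "all")


def synthetic_source_keep_setting(value: str, index_level: bool = True) -> dict:
    """Validate a synthetic_source_keep value and return it as a setting.

    Hypothetical helper: rejects unknown values, and rejects `all` when the
    setting is applied at index level, as the commit message describes.
    """
    if value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported synthetic_source_keep value: {value!r}")
    if index_level and value == "all":
        raise ValueError("`all` is not usable when set at index level")
    return {"index.mapping.synthetic_source_keep": value}
```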
…lastic#683) (elastic#685)

Backporting elastic#683 to 8.15 branch.

The index.mapping.synthetic_source_keep setting is new for tsdb. For http_logs it is not new: previously the setting was hard coded to arrays. I will open a separate PR that adds the source_keep track param to nightly configs.

Having the source_keep param makes comparing benchmark results between the different source keep options easier.
(cherry picked from commit 4493616)

Co-authored-by: Grzegorz Banasiak <grzegorz.banasiak@elastic.co>
Remove unparsable field from logs@default-pipeline.json
@achuguy achuguy added the backport This PR is a backport of some other PR label Oct 17, 2024
@achuguy achuguy self-assigned this Oct 17, 2024
@achuguy achuguy closed this Oct 17, 2024

8 participants