Add new track for tsdb based on k8s integration #373

Merged 97 commits on Jun 9, 2023
Changes from 93 commits
Commits
2bbcc33
wip
martijnvg Feb 2, 2023
f66f0a4
iter
martijnvg Feb 2, 2023
3de4606
Updating track.json and added sample 1k file for test
gizas Feb 2, 2023
49d82d4
fixed malformed template
martijnvg Feb 2, 2023
3948741
iter
martijnvg Feb 2, 2023
73e5cb7
Updating track with latest corpora
gizas Feb 6, 2023
91fc6ac
Updating track with latest corpora
gizas Feb 6, 2023
1635469
Merge remote-tracking branch 'origin/master' into tsdb2
pquentin Feb 16, 2023
9f73e5e
update corpus name
martijnvg Feb 23, 2023
6002bd3
Merge remote-tracking branch 'es/master' into tsdb2
martijnvg Feb 23, 2023
ae8273c
Updating corpora url
gizas Feb 23, 2023
1ecda50
update mapping
martijnvg Feb 24, 2023
762b6d5
update mapping
martijnvg Feb 24, 2023
fdc4a6e
added two queries
martijnvg Feb 24, 2023
6839b89
add index_mode parameter
martijnvg Mar 7, 2023
c0b8d65
iter
martijnvg Mar 7, 2023
7525a09
adjust time range queries
martijnvg Mar 8, 2023
810e3c4
moar searches
martijnvg Mar 8, 2023
b779e34
Merge remote-tracking branch 'origin/master' into tsdb2
pquentin Mar 9, 2023
f02579c
change warmup-iterations to try to see whether is time spent in globa…
martijnvg Mar 20, 2023
34980e6
Adding indices and mappings for containers, pods and nodes
gizas Mar 23, 2023
d905721
Updating challenges with latest queries for 1day with correct timestamps
gizas Mar 29, 2023
976d2de
updated track, running in test mode works
martijnvg Mar 31, 2023
90b1dfb
small tweak
martijnvg Mar 31, 2023
c058ed0
index task per corpora
martijnvg Apr 3, 2023
de77e53
iter
martijnvg Apr 3, 2023
b1121ba
Adding new challenges to compare improvements with previosu visualisa…
gizas Apr 4, 2023
77bdda0
fixed syntax errors
martijnvg Apr 4, 2023
349eb89
Fixing error in updated_average_container_memory_usage_1d
gizas Apr 4, 2023
0762e04
increased bulk size
martijnvg Apr 4, 2023
0c7f452
fixed syntax
martijnvg Apr 4, 2023
85d3ad1
parallelize searching and indexing
martijnvg Apr 5, 2023
c23a05c
increased touched* iterations
martijnvg Apr 5, 2023
552b878
alter time range
martijnvg Apr 11, 2023
15c2eac
refresh after each touch
martijnvg Apr 11, 2023
8e9aa99
typo
martijnvg Apr 11, 2023
4371997
fixed mistake
martijnvg Apr 11, 2023
3d7f243
removed old queries
martijnvg Apr 12, 2023
e73a21e
Adding two more queries the Pod status and the network usage per pod
gizas Apr 12, 2023
9a0396e
fixed interval mistake
martijnvg Apr 12, 2023
f1625d5
Adding two more queries the Pod status and the network usage per pod
gizas Apr 12, 2023
d26fc82
Merge branch 'tsdb2' of github.com:martijnvg/rally-tracks into tsdb2
gizas Apr 12, 2023
b5c1e92
Updating two more queries with 15min
gizas Apr 12, 2023
bc46a0d
template lte part of range queries
martijnvg Apr 12, 2023
9765061
fixed searches
martijnvg Apr 12, 2023
1349bf8
first attempt at templating 2 queries
martijnvg Apr 12, 2023
483c1ae
templated more queries
martijnvg Apr 12, 2023
f6e9d60
template all queries
martijnvg Apr 12, 2023
404c608
hook all searches up
martijnvg Apr 12, 2023
4450708
use bulk to touch container / pod data streams
martijnvg Apr 13, 2023
0ee089d
hdr test
martijnvg Apr 19, 2023
490717a
Merge remote-tracking branch 'es/master' into tsdb2
martijnvg May 2, 2023
fc3c2fc
use correct pattern
martijnvg May 2, 2023
851cfb5
added variables
martijnvg May 2, 2023
1e89689
reduce LOJ
martijnvg May 2, 2023
25fee34
Adding mappings of 1.35.0 and also track from generator tool
gizas May 5, 2023
82a89d8
Adding elastic-integration-corpus-generator-tool config
gizas May 5, 2023
d7940b0
Adding mappings of 1.36.0
gizas May 5, 2023
6764d0a
Adding missing comma
gizas May 8, 2023
161b0bc
Updating templates with 1.36.0
gizas May 8, 2023
f9432de
updated corpera
martijnvg May 8, 2023
ad245a8
Updating with new autogeenrated corpus for container
gizas May 9, 2023
aa6d722
Revert "hdr test"
martijnvg May 10, 2023
fdae65c
Updating with corpus under rally2
gizas May 16, 2023
c10560c
Fixing error with : in timestamp
gizas May 17, 2023
cb09497
updated timestamps for in queries
martijnvg May 23, 2023
fa42a77
Added target-interval and index.search.idle.after to simulate shards …
martijnvg May 23, 2023
20dfbb3
removed k8s-node data set
martijnvg May 24, 2023
164de9c
Merge remote-tracking branch 'es/master' into tsdb2
martijnvg May 24, 2023
e2e107e
added README
martijnvg May 24, 2023
ff0833e
Updating with fixes in container dataset field position and small upd…
gizas May 24, 2023
416c3fa
updated README
martijnvg May 25, 2023
7c3f611
Updating Readme with commands of index template instructions
gizas May 25, 2023
b530f68
update readme
martijnvg May 25, 2023
5f6fb53
Merge remote-tracking branch 'mvg/tsdb2' into tsdb2
martijnvg May 25, 2023
c7ae39a
renamed track directory
martijnvg May 25, 2023
7866500
correctly rename it...
martijnvg May 25, 2023
2317fb9
removed index_mode track parameter
martijnvg May 25, 2023
064e20c
increased bulk size
martijnvg May 25, 2023
dfe8900
Updating Readme with commands of index template instructions
gizas May 25, 2023
26f3c17
update constants
martijnvg May 25, 2023
e4670b1
reorder alinea
martijnvg May 25, 2023
8154ff5
alter bulk_size
martijnvg May 25, 2023
c62f73e
update default bulk_indexing_clients
martijnvg May 25, 2023
0c5af90
alter bulk indexing default to avoid 429 during track execution
martijnvg May 26, 2023
7b428e3
fixed typo
martijnvg May 26, 2023
e15eba3
set search_warmup_iterations to 50
martijnvg May 26, 2023
f249aca
tweak search_iterations and target_interval
martijnvg May 26, 2023
d209ee5
readme
martijnvg Jun 5, 2023
703391b
readme 2
martijnvg Jun 5, 2023
718539f
readme 3
martijnvg Jun 5, 2023
7e936af
readme 4
martijnvg Jun 5, 2023
0fcee2c
added number_of_shards and number_of_replicas
martijnvg Jun 5, 2023
a7557ae
updated README
martijnvg Jun 6, 2023
ddf27d7
fix spelling error
martijnvg Jun 6, 2023
d6e8179
Merge remote-tracking branch 'es/master' into tsdb2
martijnvg Jun 7, 2023
5ea2206
move constants to track.json
martijnvg Jun 7, 2023
91 changes: 91 additions & 0 deletions tsdb_k8s_queries/README.md
@@ -0,0 +1,91 @@
## TSDB k8s query track

The main goal of the TSDB k8s query track is to measure the performance of common k8s integration search requests.
The queries, index templates and corpus data try to match production as closely as possible.

### The search requests

The search requests of the following visualizations have been included in this track:
* Average CPU usage per pod.
* Average memory usage per pod.
* Average CPU per container.
* Average memory per container.
* Average node CPU usage per container.
* Average node memory usage per container.
* Unique deployment count.
* Percentile CPU usage per container.
* Status per pod.
* Transmitted network usage per pod.

The search requests are sourced from Kibana Lens in the k8s integration. For each of these search requests, a template is defined in operations.
From each template, three search requests are generated: first with a last 15 minutes filter, then with a last 2 hours filter, and finally with a last 24 hours filter. If a `date_histogram` is used in the query template, each of these variations uses a different fixed interval: 30 seconds, 1 minute, and 30 minutes respectively.

The k8s visualizations that run these queries don't run very often or under a high query load.
Often these visualizations are loaded and then reloaded some time later. This causes the shards of the k8s pod and container data streams to go search-idle. However, indexing always happens in the background. When shards become search-active again, a refresh needs to occur as part of the search request. This track is designed to emulate this runtime behaviour.

This is done in three ways: by indexing concurrently while the searches run, by lowering the `index.search.idle.after` setting from 30s (the default) to 1s, and by forcibly reducing the query load so that one search is executed every 3 seconds.
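
The reasoning behind these settings can be illustrated with a minimal Python sketch. It is not part of the track: the helper names are hypothetical, and the check is a simplification that assumes a shard goes search-idle once the gap between two searches exceeds `index.search.idle.after` (it ignores the time the search itself takes).

```python
import json

# Values described in the README; treated as assumptions of this sketch.
SEARCH_IDLE_AFTER_S = 1  # index.search.idle.after, lowered from the 30s default
SEARCH_INTERVAL_S = 3    # one search is executed every 3 seconds


def search_idle_settings(idle_after_s: int) -> dict:
    """Build an index settings body that lowers the search-idle threshold."""
    return {"index": {"search": {"idle": {"after": f"{idle_after_s}s"}}}}


def shard_goes_idle_between_searches(interval_s: float, idle_after_s: float) -> bool:
    """True when the gap between two searches exceeds the idle threshold."""
    return interval_s > idle_after_s


if __name__ == "__main__":
    print(json.dumps(search_idle_settings(SEARCH_IDLE_AFTER_S)))
    print(shard_goes_idle_between_searches(SEARCH_INTERVAL_S, SEARCH_IDLE_AFTER_S))
```

With the default of 30s, the same 3 second search interval would keep the shards search-active, which is why the setting is lowered for this track.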

When the data sets are updated, the `end_time` constant and the `time_intervals` dictionary in the `operations/default.json` file must be updated.
The `end_time` should match the `@timestamp` field of the latest document in the data files. Note that this timestamp should be the same for both the pod and container k8s data sets.
The `time_intervals` dictionary controls, for each search request template, the fixed interval used in the `date_histogram` and the range query on the `@timestamp` field.
The timestamp in each dictionary entry needs to be updated based on the value of `end_time`. For example, for the `15_minutes` key the timestamp should be `end_time` minus 15 minutes. Note that Kibana doesn't use date math, and therefore these queries don't use it either. This is to emulate production as closely as possible.
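
As a rough aid for this update step, the sketch below derives the start timestamps from a given `end_time`. Only the `15_minutes` key is named in this README; the `2_hours` and `24_hours` keys, the `gte` and `fixed_interval` field names, the timestamp format, and the example `end_time` are assumptions for illustration and should be checked against the actual `time_intervals` dictionary.

```python
from datetime import datetime, timedelta, timezone

# Window length and date_histogram fixed interval per variation.
# The key and field names are hypothetical, apart from "15_minutes".
WINDOWS = {
    "15_minutes": (timedelta(minutes=15), "30s"),
    "2_hours": (timedelta(hours=2), "1m"),
    "24_hours": (timedelta(hours=24), "30m"),
}

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # assumed format, e.g. 2023-05-16T21:00:00.000Z


def format_timestamp(value: datetime) -> str:
    """Format with millisecond precision and a trailing Z."""
    return value.strftime("%Y-%m-%dT%H:%M:%S.") + f"{value.microsecond // 1000:03d}Z"


def build_time_intervals(end_time: str) -> dict:
    """Compute absolute range starts (no date math) relative to end_time."""
    end = datetime.strptime(end_time, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)
    return {
        name: {"gte": format_timestamp(end - window), "fixed_interval": interval}
        for name, (window, interval) in WINDOWS.items()
    }


if __name__ == "__main__":
    for name, entry in build_time_intervals("2023-05-16T21:00:00.000Z").items():
        print(name, entry)
```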

### Generation of data

New corpus data can be generated with the help of the [Elastic-integration-corpus-generator-tool How_to_Guide](https://github.com/elastic/observability-dev/blob/main/docs/infraobs/cloudnative-monitoring/dev-docs/elastic-generator-tool-with-rally.md).

Specific templates are implemented as part of the generator tool. Based on them, the sample datasets needed for the Rally track have already been generated and uploaded to a public GCP bucket for reuse.

### Generation of Templates

Along with the generation of data, a Rally track might also need updated template files for the specific datasets that will be indexed.

Follow the process below to extract the latest index templates for a specific package version:

1. Create a local Elastic stack

```bash
elastic-package stack up -d -vvv --version=8.7.1
```

2. Log in to Kibana (https://localhost:5601) and install the specific integration, e.g. `Kubernetes Integration v1.39.1`.
Installing the package will install the index templates in the local Elasticsearch.

3. Export the needed environment variables

```bash
elastic-package stack shellinit
export ELASTIC_PACKAGE_ELASTICSEARCH_HOST=https://127.0.0.1:9200
export ELASTIC_PACKAGE_ELASTICSEARCH_USERNAME=elastic
export ELASTIC_PACKAGE_ELASTICSEARCH_PASSWORD=changeme
export ELASTIC_PACKAGE_KIBANA_HOST=https://127.0.0.1:5601
export ELASTIC_PACKAGE_CA_CERT=<home_path>/elastic-package/profiles/default/certs/ca-cert.pem
```

4. Use the `elastic-package dump` command:

```bash
elastic-package dump installed-objects --package kubernetes

[output] ...
packages extracted to package-dump
```

5. Locate and use the index templates you need

```bash
cd package-dump/index_templates
# Index templates relevant to this track:
# - metrics-kubernetes.pod.json
# - metrics-kubernetes.container.json
```

### Parameters

This track allows overriding the following parameters using `--track-params`:

* `bulk_size` (default: 9000)
* `bulk_indexing_clients` (default: 8): Number of clients that issue bulk indexing requests.
* `ingest_percentage` (default: 100): A number between 0 and 100 that defines how much of the document corpus should be ingested.
* `force_merge_max_num_segments` (default: unset): An integer specifying the maximum number of segments the force-merge operation should use.
* `number_of_replicas` (default: 0)
* `number_of_shards` (default: 1)
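
To make the override format concrete, here is a small Python sketch that builds the comma-separated `key:value` string accepted by `--track-params`. The helper name is hypothetical and the override values are arbitrary examples; only the parameter names come from the list above.

```python
# Parameter names documented above for this track.
KNOWN_PARAMS = {
    "bulk_size",
    "bulk_indexing_clients",
    "ingest_percentage",
    "force_merge_max_num_segments",
    "number_of_replicas",
    "number_of_shards",
}


def track_params_arg(overrides: dict) -> str:
    """Render overrides as a comma-separated key:value list, rejecting unknown names."""
    unknown = sorted(set(overrides) - KNOWN_PARAMS)
    if unknown:
        raise ValueError(f"unknown track parameters: {', '.join(unknown)}")
    return ",".join(f"{key}:{value}" for key, value in overrides.items())


if __name__ == "__main__":
    # Example values only; they are not recommendations.
    print(track_params_arg({"bulk_size": 5000, "number_of_shards": 2}))
```

The resulting string can be passed on the command line as `--track-params="bulk_size:5000,number_of_shards:2"`.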
306 changes: 306 additions & 0 deletions tsdb_k8s_queries/challenges/default.json
@@ -0,0 +1,306 @@
{% set search_iterations = 20 %}
{% set search_warmup_iterations = 50 %}
{% set target_interval = 4 %}
{
"name": "append-no-conflicts",
"description": "Indexes the whole document corpus.",
"default": true,
"schedule": [
{
"name": "create-all-templates",
"operation": {
"operation-type": "create-composable-template",
"request-params": {
"create": "true"
}
}
},
{
"name": "put-timestamp-pipeline",
"operation": {
"operation-type": "put-pipeline",
"id": "timestamp_pipeline",
"body": {
"processors": [
{
"set": {
"field": "@timestamp",
"value": {{'"{{_ingest.timestamp}}"'}}
}
}
]
}
}
},
{
"name": "check-cluster-health",
"operation": {
"operation-type": "cluster-health",
"request-params": {
"wait_for_status": "{{cluster_health | default('green')}}",
"wait_for_no_relocating_shards": "true"
},
"retry-until-success": true
}
},
{
"operation": "index-container",
"warmup-time-period": 240,
"clients": {{bulk_indexing_clients | default(8)}}
},
{
"operation": "index-pod",
"warmup-time-period": 240,
"clients": {{bulk_indexing_clients | default(8)}}
},
{
"name": "refresh-after-index",
"operation": "refresh"
},
{
"operation": {
"operation-type": "force-merge",
"request-timeout": 7200{%- if force_merge_max_num_segments is defined %},
"max-num-segments": {{ force_merge_max_num_segments | tojson }}
{%- endif %}
}
},
{
"name": "wait-until-merges-finish",
"operation": {
"operation-type": "index-stats",
"index": "_all",
"condition": {
"path": "_all.total.merges.current",
"expected-value": 0
},
"retry-until-success": true,
"include-in-reporting": false
}
},
{
"name": "refresh-after-force-merge",
"operation": "refresh"
},
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "cpu_usage_per_pod_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-pod-1-{{name}}",
"operation": "touch-pod-index",
"clients": 1
},
{
"operation": "cpu_usage_per_pod_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "memory_usage_per_pod_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-pod-2-{{name}}",
"operation": "touch-pod-index",
"clients": 1
},
{
"operation": "memory_usage_per_pod_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "status_per_pod_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-pod-3-{{name}}",
"operation": "touch-pod-index",
"clients": 1
},
{
"operation": "status_per_pod_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "tx_network_usage_per_pod_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-pod-4-{{name}}",
"operation": "touch-pod-index",
"clients": 1
},
{
"operation": "tx_network_usage_per_pod_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "average_container_cpu_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-container-1-{{name}}",
"operation": "touch-container-index",
"clients": 1
},
{
"operation": "average_container_cpu_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "average_container_memory_usage_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-container-2-{{name}}",
"operation": "touch-container-index",
"clients": 1
},
{
"operation": "average_container_memory_usage_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "cpu_usage_per_container_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-container-3-{{name}}",
"operation": "touch-container-index",
"clients": 1
},
{
"operation": "cpu_usage_per_container_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "memory_usage_per_container_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-container-4-{{name}}",
"operation": "touch-container-index",
"clients": 1
},
{
"operation": "memory_usage_per_container_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "unique_deployment_count_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-pod-5-{{name}}",
"operation": "touch-pod-index",
"clients": 1
},
{
"operation": "unique_deployment_count_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
},
{%- endfor %}
{% for name, info in time_intervals.items() %}
{
"parallel": {
"completed-by": "percentile_cpu_usage_per_container_{{name}}",
"clients": 2,
"tasks": [
{
"name": "touch-container-5-{{name}}",
"operation": "touch-container-index",
"clients": 1
},
{
"operation": "percentile_cpu_usage_per_container_{{name}}",
"warmup-iterations": {{search_warmup_iterations}},
"iterations": {{search_iterations}},
"target-interval": {{target_interval}},
"clients": 1
}
]
}
}{{ ", " if not loop.last else "" }}
{%- endfor %}
]
}
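
The schedule above repeats the same `parallel` block for each of the ten search templates, once per entry in `time_intervals`. A minimal Python sketch of that expansion follows; it is not part of the track, and the interval names passed in (`15_minutes`, `2_hours`, `24_hours`) are assumptions, since the `time_intervals` dictionary is defined elsewhere.

```python
# Search templates in schedule order, with the data stream that is touched
# concurrently while the search runs (taken from the challenge above).
SEARCHES = [
    ("cpu_usage_per_pod", "pod"),
    ("memory_usage_per_pod", "pod"),
    ("status_per_pod", "pod"),
    ("tx_network_usage_per_pod", "pod"),
    ("average_container_cpu", "container"),
    ("average_container_memory_usage", "container"),
    ("cpu_usage_per_container", "container"),
    ("memory_usage_per_container", "container"),
    ("unique_deployment_count", "pod"),
    ("percentile_cpu_usage_per_container", "container"),
]


def expand_schedule(interval_names: list) -> list:
    """List the parallel blocks that the Jinja loops would render."""
    counters = {"pod": 0, "container": 0}
    blocks = []
    for search, stream in SEARCHES:
        counters[stream] += 1  # touch tasks are numbered per data stream
        for name in interval_names:
            blocks.append({
                "completed-by": f"{search}_{name}",
                "touch-task": f"touch-{stream}-{counters[stream]}-{name}",
                "touch-operation": f"touch-{stream}-index",
            })
    return blocks


if __name__ == "__main__":
    for block in expand_schedule(["15_minutes", "2_hours", "24_hours"]):
        print(block)
```

Under these assumptions, three interval names yield 30 parallel blocks, each ending when its search task completes.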