Commit b69865f

Locust: Implement suggestions by CodeRabbit
1 parent 870bc23 commit b69865f

2 files changed, +53 -50 lines changed

docs/integrate/locust/index.md

Lines changed: 3 additions & 4 deletions
@@ -13,7 +13,7 @@
1313

1414
[Locust] is an open source load testing tool.
1515

16-
> Define user behaviour with Python code, and swarm your system with
16+
> Define user behavior with Python code, and swarm your system with
1717
> millions of simultaneous users.
1818
1919
:::{rubric} Learn
@@ -24,9 +24,8 @@
2424
:::{grid-item-card} Load testing CrateDB using Locust
2525
:link: locust-tutorial
2626
:link-type: ref
27-
Learn how to use Locust as the framework to run load tests with
28-
a customizable set of SQL statements against CrateDB to measure
29-
its performance.
27+
Use Locust to run load tests with a customizable set of SQL statements
28+
against CrateDB and measure performance.
3029
:::
3130

3231
::::

docs/integrate/locust/tutorial.md

Lines changed: 50 additions & 46 deletions
@@ -3,25 +3,16 @@
33

44
## Introduction
55

6-
As with every other database, users want to run performance tests to get a feel for the performance of their workload.
6+
Like with any database, you’ll want to run performance tests to understand
7+
your workload’s behavior.
78

89
CrateDB offers a couple of tools that can be used for specific use cases. For example, the [nodeIngestBench][] allows you to run high-performance ingest benchmarks against a CrateDB cluster or use the [TimeIt][] function within the cr8 toolkit to measure the runtime of a given SQL statement on a cluster.
910

10-
[nodeIngestBench]: https://github.com/proddata/nodeIngestBench
11-
[TimeIt]: https://github.com/mfussenegger/cr8#timeit
12-
13-
We use Locust as the framework to run load tests with a customizable set of SQL statements. [Locust][] is a great, flexible, open-source (Python) framework that can swarm the database with users and get the RPS (request per second) for different queries. This small blog shows how to use Locust to load test CrateDB in your environment.
11+
Use Locust to run load tests with a customizable set of SQL statements. [Locust][] is a flexible, open‑source Python framework that can swarm the database with users and report RPS (requests per second) per query. This tutorial shows how to use Locust to load test CrateDB in your environment.
1412

15-
For this blog, I’m running a 3-node cluster created in a local docker environment as described in this [tutorial][].
16-
17-
[tutorial]: https://cratedb.com/docs/crate/tutorials/en/latest/containers/docker.html
18-
[Locust]: https://locust.io
13+
For this tutorial, we use a 3‑node local Docker cluster (see this [tutorial][]).
1914

20-
First, we must set up the data model and load some data. I’m using [DBeaver][] to connect in this case, but this can be done by either the [CrateDB CLI tools][] or the Admin UI that comes with either the self- or [fully-managed][] CrateDB solution.
21-
22-
[DBeaver]: https://dbeaver.io
23-
[CrateDB CLI tools]: https://cratedb.com/docs/crate/clients-tools/en/latest/connect/cli.html#cli
24-
[fully-managed]: https://console.cratedb.cloud/
15+
First, set up the data model and load data. This example uses [DBeaver][], but you can also use the [CrateDB CLI tools][] or the Admin UI in self‑managed or [fully-managed][] CrateDB.
2516

2617
Create the following tables:
2718

@@ -45,11 +36,11 @@ CREATE TABLE IF NOT EXISTS "weekly_aggr_weather_data"(
4536
);
4637
```
4738

48-
Create the user used further down the line.
39+
Create the user for the load test.
4940
```sql
50-
CREATE USER locust with (password = 'load_test');
51-
GRANT ALL PRIVILEGES ON table weather_data to locust;
52-
GRANT ALL PRIVILEGES ON table weekly_aggr_weather_data to locust;
41+
CREATE USER locust WITH (password = 'load_test');
42+
GRANT ALL PRIVILEGES ON table weather_data TO locust;
43+
GRANT ALL PRIVILEGES ON table weekly_aggr_weather_data TO locust;
5344
```
5445

5546
Load some data into the `weather_data` table by using the following statement.
@@ -60,7 +51,7 @@ FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/data_we
6051
WITH (format = 'csv', compression = 'gzip', empty_string_as_null = true);
6152
```
6253

63-
The `weather_data` table should now have 70k rows of data.
54+
The `weather_data` table now contains roughly 70k rows.
6455

6556
```text
6657
select count(*) from weather_data;
@@ -70,11 +61,11 @@ count(*)|
7061
70000|
7162
```
7263

73-
We leave the other table empty as that one will be populated as part of the load test.
64+
Leave `weekly_aggr_weather_data` empty; the load test populates it.
7465

7566
## Install Locust
7667

77-
In this case, I installed Locust on my Mac, but in an acceptance environment, you probably want to run this Locust on one or more driver machines. Especially when you want to push the database, you will need enough firepower on the driver side to push the database.
68+
Install Locust locally for a quick start. In staging or production‑like testing, run Locust on one or more driver machines to generate sufficient load.
7869

7970
On Python (3.9 or later), install Locust as well as the CrateDB driver:
8071
```bash
@@ -89,8 +80,10 @@ locust -V
8980

9081
## Run Locust
9182

92-
Start with a simple test to ensure the connectivity is there and you can connect to the database. Copy the code below and write to a file named `locustfile.py`.
93-
Besides the pure Locust execution, it also contains a CrateDB-specific implementation, connecting to CrateDB using our Python driver, instead of a plain HTTP client.
83+
Start with a simple connectivity check.
84+
Copy the code below into a file named `locustfile.py`.
85+
It uses a CrateDB-specific client built on the Python driver rather than
86+
a generic HTTP client.
9487

9588
```python
9689
import time
@@ -119,13 +112,13 @@ class CrateDBClient:
119112
)
120113
self._request_event = request_event
121114

122-
def send_query(self, *args, **kwargs):
115+
def send_query(self, sql, name, params=None):
123116
cursor = self._connection.cursor()
124117
start_time = time.perf_counter()
125118

126119
request_meta = {
127120
"request_type": "CrateDB",
128-
"name": args[1],
121+
"name": name,
129122
"response_length": 0,
130123
"response": None,
131124
"context": {},
@@ -134,15 +127,15 @@ class CrateDBClient:
134127

135128
response = None
136129
try:
137-
cursor.execute(args[0])
130+
cursor.execute(sql, params or ())
138131
response = cursor.fetchall()
139132
except Exception as e:
140133
request_meta["exception"] = e
141134

142135
request_meta["response_time"] = (time.perf_counter() - start_time) * 1000
143136
request_meta["response"] = response
144137
# Approximate length, we don't have the original HTTP response body any more
145-
request_meta["response_length"] = len(str(response))
138+
request_meta["response_length"] = len(response) if response is not None else 0
146139

147140
# This is what makes the request actually get logged in Locust
148141
self._request_event.fire(**request_meta)
@@ -178,7 +171,7 @@ Some explanation on some of the code above ☝️
178171

179172
The class `CrateDBClient` implements how to connect to CrateDB and details on how to measure requests. `CrateDBUser` represents a Locust-generated user based on the `CrateDBClient`.
180173

181-
In the actual Locust configuration, with the `wait_time = between (1, 5)`, you can control the number of queries and the randomization of the queries by using between. This will execute the different queries with a random interval between 1 and 5 sec. Another option that will give you more control over the amount of executed queries per second is using the `wait_time = constant_throughput(1.0)`, which will execute 1 of the queries per second for every user, or if you set it to `(2.0)`, will execute two queries every second.
174+
In Locust, `wait_time = between(1, 5)` randomizes task execution between 1 and 5 seconds. To control throughput more precisely, use `wait_time = constant_throughput(1.0)`, which runs one task per second per user (set to `2.0` for two tasks per second).
182175

183176
For every query you want to include in your test, you will need to create a block like this:
184177

@@ -205,11 +198,11 @@ Define the number of users and the spawn rate. As this is an initial test, we le
205198

206199
![Start new load test|272x500](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/d/d61218208d3f11d27d398e87c3954cb4327c9910.png){h=320px}
207200

208-
Click Start to start the load test.
201+
Click "Start" to launch the load test.
209202

210203
![swarm-query0|690x133](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/5/56615b02ec7a792326acb1d5dc086f5d7636bbdb.png)
211204

212-
As you can see, is 1 query being executed with an RPS of 1. The number of failures should be 0. If you stop the test and start a New test with ten users, you should get an RPS of 10.
205+
Locust executes one query at ~1 RPS (requests per second) with zero failures. If you stop and start a new test with 10 users, you’ll see ~10 RPS.
213206

214207
![swarm-10users-query0|690x133](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/2/2ae79d623b440d1735df0285fd0bf85996623bd2.png)
215208

@@ -221,7 +214,7 @@ SELECT location, round(AVG(temperature)) AS avg_temp
221214
FROM weather_data
222215
WHERE location = 'CITY'
223216
GROUP BY location
224-
ORDER BY 2 DESC;
217+
ORDER BY avg_temp DESC;
225218

226219
-- When was Max Temp
227220
SELECT location,
@@ -264,7 +257,7 @@ SELECT a.timestamp,
264257
FROM weather_data a, minmax b
265258
WHERE a.location = b.location
266259
AND a.timestamp BETWEEN b.mintstamp AND b.maxtstamp
267-
ORDER BY 1;
260+
ORDER BY a.timestamp;
268261

269262
-- Upsert the Aggr per week
270263
INSERT INTO weekly_aggr_weather_data (week, location, avgtemp, maxhumid, minwind, lastupdated)
@@ -317,13 +310,13 @@ class CrateDBClient:
317310
)
318311
self._request_event = request_event
319312

320-
def send_query(self, *args, **kwargs):
313+
def send_query(self, sql, name, params=None):
321314
cursor = self._connection.cursor()
322315
start_time = time.perf_counter()
323316

324317
request_meta = {
325318
"request_type": "CrateDB",
326-
"name": args[1],
319+
"name": name,
327320
"response_length": 0,
328321
"response": None,
329322
"context": {},
@@ -332,15 +325,15 @@ class CrateDBClient:
332325

333326
response = None
334327
try:
335-
cursor.execute(args[0])
328+
cursor.execute(sql, params or ())
336329
response = cursor.fetchall()
337330
except Exception as e:
338331
request_meta["exception"] = e
339332

340333
request_meta["response_time"] = (time.perf_counter() - start_time) * 1000
341334
request_meta["response"] = response
342335
# Approximate length, we don't have the original HTTP response body any more
343-
request_meta["response_length"] = len(str(response))
336+
request_meta["response_length"] = len(response) if response is not None else 0
344337

345338
# This is what makes the request actually get logged in Locust
346339
self._request_event.fire(**request_meta)
@@ -368,15 +361,17 @@ class QuickstartUser(CrateDBUser):
368361

369362
@task(5)
370363
def query01(self):
364+
city = random.choice(self.cities)
371365
self.client.send_query(
372-
f"""
366+
"""
373367
SELECT location, ROUND(AVG(temperature)) AS avg_temp
374368
FROM weather_data
375-
WHERE location = '{random.choice(self.cities)}'
369+
WHERE location = ?
376370
GROUP BY location
377-
ORDER BY 2 DESC
371+
ORDER BY avg_temp DESC
378372
""",
379373
"Avg Temperature per City",
374+
params=(city,),
380375
)
381376

382377
@task(1)
@@ -417,14 +412,15 @@ class QuickstartUser(CrateDBUser):
417412

418413
@task(5)
419414
def query04(self):
415+
city = random.choice(self.cities)
420416
self.client.send_query(
421-
f"""
417+
"""
422418
WITH minmax AS (
423419
SELECT location,
424420
MIN(timestamp) AS mintstamp,
425421
MAX(timestamp) AS maxtstamp
426422
FROM weather_data
427-
WHERE location = '{random.choice(self.cities)}'
423+
WHERE location = ?
428424
GROUP BY location
429425
)
430426
SELECT a.timestamp,
@@ -435,9 +431,10 @@ class QuickstartUser(CrateDBUser):
435431
FROM weather_data a, minmax b
436432
WHERE a.location = b.location
437433
AND a.timestamp BETWEEN b.mintstamp AND b.maxtstamp
438-
ORDER BY 1;
434+
ORDER BY a.timestamp;
439435
""",
440436
"Bridge the Gaps per City",
437+
params=(city,),
441438
)
442439

443440
@task(1)
@@ -461,11 +458,9 @@ class QuickstartUser(CrateDBUser):
461458

462459
```
463460

464-
Note that the weight (of query01 and query04) is five compared to the rest, which has a weight of 1, which means that the likelihood that two queries will execute is five times higher than the others. This shows how you can influence the weight of the different queries.
465-
466-
Let’s run this load test and see what happens.
461+
Queries 01 and 04 have weight 5; Locust schedules them ~5× as often as the others (weight 1). Use weights to shape your query mix.
467462

468-
I started the run with 100 users.
463+
Let’s run this load test and see what happens. The following run was started with 100 users.
469464

470465
![statistics-100users|690x206](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/a/aa31288ac528c7eaf3dec7657cc73bbac0bbf7b7.png)
471466

@@ -482,3 +477,12 @@ If you want to download the locust data, you can do that on the last tab.
482477
## Conclusion
483478

484479
When you want to run a load test against a CrateDB Cluster with multiple queries, Locust is a great and flexible tool that lets you quickly define a load test and see what numbers regarding users and RPS are possible for that particular setup.
480+
481+
482+
[CrateDB CLI tools]: https://cratedb.com/docs/crate/clients-tools/en/latest/connect/cli.html#cli
483+
[DBeaver]: https://dbeaver.io
484+
[fully-managed]: https://console.cratedb.cloud/
485+
[Locust]: https://locust.io
486+
[nodeIngestBench]: https://github.com/proddata/nodeIngestBench
487+
[TimeIt]: https://github.com/mfussenegger/cr8#timeit
488+
[tutorial]: https://cratedb.com/docs/crate/tutorials/en/latest/containers/docker.html
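
The `send_query` changes in this commit (explicit `sql`, `name`, `params` arguments, `?` placeholders, and `response_length` as a row count) can be tried without a running cluster. The following is a minimal sketch, not the tutorial's code: `measure_query` is a hypothetical stand-alone function, the Locust event firing is left out, and Python's built-in `sqlite3` stands in for the CrateDB driver on the assumption that both accept `?` placeholders through the DB-API `cursor.execute(sql, params)` call. The sample rows are invented for illustration.

```python
import sqlite3
import time


def measure_query(connection, sql, name, params=None):
    """Run one statement and return Locust-style request metadata.

    Hypothetical helper mirroring the reworked `send_query`, minus the
    `request_event.fire(...)` call that Locust needs for its statistics.
    """
    cursor = connection.cursor()
    start_time = time.perf_counter()

    request_meta = {
        "request_type": "CrateDB",
        "name": name,
        "response_length": 0,
        "response": None,
        "context": {},
        "exception": None,
    }

    response = None
    try:
        # Values travel separately from the SQL text, so no string formatting.
        cursor.execute(sql, params or ())
        response = cursor.fetchall()
    except Exception as e:
        request_meta["exception"] = e

    request_meta["response_time"] = (time.perf_counter() - start_time) * 1000
    request_meta["response"] = response
    # Number of rows returned, or 0 when the statement failed.
    request_meta["response_length"] = len(response) if response is not None else 0
    return request_meta


# Stand-in database (assumption: sqlite3 instead of a CrateDB connection).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE weather_data (location TEXT, temperature REAL)")
connection.executemany(
    "INSERT INTO weather_data VALUES (?, ?)",
    [("Vienna", 10.0), ("Vienna", 14.0), ("Berlin", 8.0)],
)

ok = measure_query(
    connection,
    "SELECT location, ROUND(AVG(temperature)) AS avg_temp "
    "FROM weather_data WHERE location = ? "
    "GROUP BY location ORDER BY avg_temp DESC",
    "Avg Temperature per City",
    params=("Vienna",),
)
failed = measure_query(connection, "SELECT * FROM missing_table", "Broken query")

print(ok["response_length"], ok["exception"])
print(failed["response_length"], type(failed["exception"]).__name__)
```

One consequence of the new `response_length` worth noting: it now reports rows rather than the character count of the stringified result, so values from runs before and after this commit are not directly comparable.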
