Conversation
* Move files that are shared between the integration test suite and performance benchmark test suite into a common parent directory.
* Add customized pytest console reporting for benchmarks.
* Add support for JSON benchmark result generation.
* Update the Presto Benchmarking section of the README.
raw_times_dict = benchmark_dict[BenchmarkKeys.RAW_TIMES_KEY]
assert raw_times_dict == {}

failed_queries_dict = benchmark_dict[BenchmarkKeys.FAILED_QUERIES_KEY]
While the failed queries are being identified and recorded in the json file, pytest still reports those queries as having passed. I think we should report them as failed tests if they did not go through.
```
$ ./run_benchmark.sh -b tpch -s sf1 -q 1
1 passed in 0.05s

$ cat benchmark_output/benchmark_result.json
{
  "tpch": {
    "agg_times_ms": {
      "avg": {},
      "min": {},
      "max": {}
    },
    "failed_queries": {
      "Q1": "USER_ERROR: SYNTAX_ERROR"
    }
  }
}
```
This should be fixed in the current PR revision.
Generally looks good to me. I was able to get everything running smoothly (once I fixed some unrelated bugs in our scripts - which I will address separately).
At the time of writing, I believe we are still missing the ability to run warmup queries, compared to the other scripts that are in use.
By my current comparison, using the python interface is slightly faster than using curl, but we do lose some metadata. Do we have any interest in collecting these other pieces like cpu_time?
Can you please expand on this requirement? The current benchmark suite would run the queries multiple times, and users can choose to use the min time if colder queries are to be ignored.
I believe the python client exposes the same metadata, so we can add this if needed. Is any other data besides the elapsed time being used?
We clarified this with Todd in the standup, so I think the current form meets the current expectations now. If there are other tweaks we need we can always add them later.
I think elapsed time is the only one we really care about. If there are others we can add them in later. I have some slight concerns that Presto does not seem to be consistent with what "elapsed time" means when requested via different APIs (a request via curl will yield a different number than via the cli, despite both being recorded in presto metadata as "elapsed_time").
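The repeated-runs approach described above can be sketched as follows. This measures client-side elapsed time with `time.perf_counter`; `run_fn` is a hypothetical stand-in for the actual query execution call, not the PR's API:

```python
import time

def time_query(run_fn, repeats=5):
    """Run the query `repeats` times and return each elapsed time in ms.

    Taking min() over the returned list is what lets users discount
    colder runs, as discussed above. run_fn is a placeholder for the
    actual query call.
    """
    times_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_fn()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return times_ms
```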
misiugodfrey
left a comment
LGTM. I think this works and it's better to get it in so that folks can start using it and any further features can be added later.
That being said, one potential recommendation I have is to include all requested queries in the output, regardless of whether they fail. Right now I see:
Query ID | Avg(ms) | Min(ms) | Max(ms)
-------------------------------------------
Q1 | 1043.4 | 1014 | 1060
Q2 | 1080.2 | 1056 | 1110
Q3 | 1841.4 | 1771 | 1905
Q4 | 629 | 612 | 654
Q5 | 3627.2 | 3462 | 3919
Q6 | 175.6 | 163 | 188
Q7 | 1880.4 | 1754 | 1979
Q8 | 5455.2 | 5324 | 5707
Q9 | 5066.6 | 5025 | 5109
Q10 | 1846.2 | 1797 | 1922
Q11 | 601 | 552 | 655
Q12 | 783.2 | 700 | 941
Q13 | 738 | 724 | 757
Q14 | 363.6 | 344 | 379
Q15 | 442.4 | 416 | 479
Q16 | 524 | 513 | 539
Q17 | 4779.2 | 4744 | 4801
Q18 | 5679.4 | 5614 | 5799
Q19 | 527 | 499 | 554
Q20 | 1045.6 | 1016 | 1073
Q22 | 470 | 453 | 508
======================================================================
FAILED ../testing/performance_benchmarks/tpch_test.py::test_query[Q21] - prestodb.exceptions.PrestoQueryError: PrestoQueryError(type=INTERNAL_ERROR, name=GENERIC_INTERNAL_ERROR, message=" Operator::getOutput failed for [operator: CudfHashJoinProbe, plan node ID: 572]: std::bad_alloc: out_of_memory: CUDA error (failed to allocate 608240624 b
ytes) at: /presto_native_release_gpu_ON_build/_deps/rmm-src/cpp/incl...
1 failed, 21 passed in 198.08s (0:03:18)
Whereas I would prefer to see Q21 listed in the output but with some kind of null or empty values so that it is not missed:
Q20 | 1045.6 | 1016 | 1073
Q21 | null | null | null
Q22 | 470 | 453 | 508
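One way to implement this is to drive the table from the full list of requested queries and substitute null placeholders for any query with no recorded stats. A sketch; the row layout and `agg_times` shape are assumptions, not the PR's actual report code:

```python
def render_table(all_queries, agg_times):
    """Render the summary table from the full requested-query list,
    so failed queries still appear (with "null" placeholders)."""
    lines = ["Query ID | Avg(ms) | Min(ms) | Max(ms)",
             "-" * 43]
    for qid in all_queries:
        stats = agg_times.get(qid)  # None for failed queries
        avg, mn, mx = stats if stats else ("null", "null", "null")
        lines.append(f"{qid:<9}| {avg:>7} | {mn:>7} | {mx:>7}")
    return "\n".join(lines)
```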
Testing this PR, I ran into some practical issues that may be worth addressing with guardrails or better documentation. I list them in detail below.

I tried to run on a dataset that already existed but was not set up with a schema. Reading the Running the data generation script I ran into this error. The solution is to ensure that

This error appears to kick in when testing anything relatively large with the default config, which limits the server to 2GB. In my case I tested SF100. The solution is to increase the memory limits (manually) for the Presto server and restart.

This error occurs after bumping the memory limits by a factor of 10 (from 2GB to 20GB). It appears we still hit memory-related issues.

As a side note, not related to this PR: increasing the memory limits to something large (like 200GB) on a server seems to fail silently and not start a worker node with the error: Presumably some error is generated somewhere, but neither the benchmark script nor the server (docker) process seems to have any logs.
mattgara
left a comment
Overall looks good to me.
Most of my comments are open-ended and have to do with discussing quality-of-life changes/improvements.
from ..common.fixtures import tpch_queries, tpcds_queries

@pytest.fixture(scope="module")
We might want to consider the scope here. A scope of function may be more appropriate for benchmarking as it would create a new connection for each query being executed.
Hmm, why would we want to create a new connection for each query?
I'm not clear on whether there is a delay/cost to running the first query on a new connection, or whether there is state carried over between queries run on the same connection or cursor (tied to that "session"). If so, using one approach or the other would yield different benchmark timings.
Looking into this to see how other database benchmarks approach this, it looks like pgbench uses one connection per transaction while benchmarking (see here: https://github.com/postgres/postgres/blob/master/src/bin/pgbench/pgbench.c#L267.)
Reading a little more up on this, the transaction is the logical unit pgbench wants to benchmark at (see builtin TPC-B like benchmark here: https://github.com/postgres/postgres/blob/66cdef4425f3295a2cfbce3031b3c0f0b5dffc04/src/bin/pgbench/pgbench.c#L782)
If it's reasonable to follow Postgres' pattern here, we'd probably want to decide what logical unit we are benchmarking: individual queries, or all queries in the benchmark (for example TPC-H). If it's the former, we probably want a new connection or cursor per query; for the latter, I think we are fine reusing.
What's the typical logical unit we want to benchmark?
I believe when it comes to PostgreSQL, client/server connections are stateful, so having to reset the connection for each query makes sense. In this case, client/server interactions happen over HTTP, which is a stateless protocol, and connection times are not included in the benchmark report.
In this case, client/server interactions happen over HTTP, which is a stateless protocol, and connection times are not included in the benchmark report.
Ahh okay, in that case my premise that some state can be carried over is not valid anymore, thanks for the clarification.
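For illustration, the scope trade-off discussed in this thread can be made concrete with a counting fake; `fake_connect` is a hypothetical stand-in for the real client connect call, not the PR's fixture:

```python
import pytest

CONNECT_CALLS = {"count": 0}

def fake_connect():
    """Hypothetical stand-in for the real client connect call,
    counting how often a connection is opened."""
    CONNECT_CALLS["count"] += 1
    return object()

# scope="module": fake_connect runs once and the connection is shared
# by every test in the file. With scope="function", pytest would call
# it again for each query, adding per-query connection setup time.
@pytest.fixture(scope="module")
def connection():
    return fake_connect()
```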
parser.addoption("--tag")

def pytest_terminal_summary(terminalreporter, exitstatus, config):
For the summary report consider,
- Separating the first query in each set of queries from computing averages or any statistics and reporting them separately.
- Reporting the arithmetic and geometric mean (the geometric mean is more robust to outliers.)
- Since we report min and max, median would be good to compute.
- Separating the first query in each set of queries from computing averages or any statistics and reporting them separately.
I don't think this works with the "lukewarm" benchmark requirement/use case?
- Reporting the arithmetic and geometric mean (the geometric mean is more robust to outliers.)
- Since we report min and max, median would be good to compute.
Geometric mean and median have been added.
I don't think this works with the "lukewarm" benchmark requirement/use case?
Hmm right. I wonder if it would be beneficial to have both sets of behaviour implemented for the reporting that can be toggled with a switch?
Alternatively, we can leave it as is, and come back and implement it if there is a user requesting this feature.
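All of the aggregates discussed here (arithmetic mean, geometric mean, median, min, max) are available in the standard library. A sketch; the key names are illustrative, not the PR's actual JSON schema:

```python
import statistics

def summarize(times_ms):
    """Aggregate raw per-query timings into summary statistics.
    Key names are illustrative, not the PR's actual JSON schema."""
    return {
        "avg": statistics.fmean(times_ms),
        "geomean": statistics.geometric_mean(times_ms),
        "median": statistics.median(times_ms),
        "min": min(times_ms),
        "max": max(times_ms),
    }
```

Note that `statistics.geometric_mean` (Python 3.8+) requires strictly positive inputs, which holds for elapsed times.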
The report has been updated to include failed queries.
I have updated the
Can you please expand on this? I believe this check is already done by Presto.
The benchmark script requires an existing and valid schema but has no direct dependency on the
Feel free to add configuration related comments to this PR: #48
I think that's fine. My thinking was to have more of a user-friendly message, but I think the Presto database error is fine for the set of users we expect to be using this tool.
Okay, could you please point me at the documentation that describes we should have
I believe the setup script help description should contain this detail.
Hmm is In that case, should this statement be included in a more user-visible place?
This should be in the output of