Added benchmarking script #33

Closed
misiugodfrey wants to merge 28 commits into main from misiug/Benchmarking

Conversation

Contributor

@misiugodfrey misiugodfrey commented Sep 4, 2025

Add a simple benchmarking script that will run benchmarks based on what is in the testing data directory. Offers optional profiling using nsys.

NOTE: There is ongoing work to perform this kind of benchmarking using python scripts (re-using a lot of our integration test work and de-duplicating code), so this script will likely stand as a placeholder until it can be replaced with the preferred python option once we verify the two behave the same.

ENV EXTRA_CMAKE_FLAGS=${EXTRA_CMAKE_FLAGS}
ENV NUM_THREADS=${NUM_THREADS}

RUN rpm --import https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub && dnf config-manager --add-repo "https://developer.download.nvidia.com/devtools/repos/rhel$(source /etc/os-release; echo ${VERSION_ID%%.*})/$(rpm --eval '%{_arch}' | sed s/aarch/arm/)/" && dnf install -y nsight-systems-cli-2025.5.1.121
Contributor Author

Install a pinned version of nsys to the container.

echo "/usr/lib64/presto-native-libs" > /etc/ld.so.conf.d/presto_native.conf

CMD bash -c "ldconfig && presto_server --etc-dir=/opt/presto-server/etc"
CMD bash -c "ldconfig && nsys launch presto_server --etc-dir=/opt/presto-server/etc"
Contributor

Did this work? I was just assuming it would, but @karthikeyann tried it last night and couldn't get it to work until he started the container in interactive mode and manually ran nsys launch presto_server ...

Contributor Author

This has been working for me - at least in so far as it generates profiles. What issue was he running into when he attempted this approach?

Contributor

Some of the arguments need to be given during nsys launch itself.
Please add the required arguments for nsys here. I got this from our old velox benchmark scripts.

CMD bash -c "ldconfig && nsys launch -t nvtx,cuda,osrt \
        --cuda-memory-usage=true \
        --cuda-um-cpu-page-faults=true \
        --cuda-um-gpu-page-faults=true \
        presto_server --etc-dir=/opt/presto-server/etc"

Contributor

add --gpu-metrics-devices, --cudabacktrace

Contributor

devavret commented Sep 5, 2025

May I request options to include certain nsys options? Personally, I'm using --gpu-metrics-devices, --cudabacktrace, --cuda-memory-usage. The latter two are nsys launch params and the first is specified at nsys start

@misiugodfrey misiugodfrey marked this pull request as ready for review September 10, 2025 23:48
@misiugodfrey
Contributor Author

May I request options to include certain nsys options? Personally, I'm using --gpu-metrics-devices, --cudabacktrace, --cuda-memory-usage. The latter two are nsys launch params and the first is specified at nsys start

I've added the launch params, but --gpu-metrics-devices requires the container to be run with SYS_ADMIN capabilities. Not sure if we want to require that as part of the base changes, or if we should make that a custom option.
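One way to keep such flags opt-in could look roughly like this (a hypothetical sketch, not the PR's code: build_nsys_launch and the CUDA_BACKTRACE toggle are invented names for illustration; per the thread, --cudabacktrace and --cuda-memory-usage are nsys launch params, while --gpu-metrics-devices belongs to nsys start and needs the container run with --cap-add SYS_ADMIN):

```shell
#!/bin/sh
# Hypothetical sketch: assemble the nsys launch command from opt-in flags.
# CUDA_BACKTRACE is an invented toggle; --gpu-metrics-devices is omitted
# here because it is an 'nsys start' option and requires SYS_ADMIN.
build_nsys_launch() {
  local cmd="nsys launch -t nvtx,cuda,osrt --cuda-memory-usage=true"
  if [ -n "$CUDA_BACKTRACE" ]; then
    cmd="$cmd --cudabacktrace=all"
  fi
  echo "$cmd presto_server --etc-dir=/opt/presto-server/etc"
}

CUDA_BACKTRACE=1 build_nsys_launch
```

With the toggle unset, the plain launch command is produced; setting it appends the backtrace flag.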


tmostak commented Sep 15, 2025

@misiugodfrey thanks for this. I think we should:

  1. Always do a cold run of each query. We may want to capture timing of that separately. We may want to allow an optional cache flush before each cold run to simulate cold/un-cached reads from disk.
  2. Allow a user-specified number of hot runs, and take the average of those.
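The requested flow could be sketched roughly like this (hypothetical code, not the PR's: benchmark_query is invented, and run_one_query is a stub standing in for whatever actually executes a query and prints its elapsed milliseconds):

```shell
#!/bin/sh
# Hypothetical sketch: one cold run reported separately, then a
# user-specified number of hot runs averaged.
run_one_query() { echo 100; }   # stub for the real runner (prints elapsed ms)

benchmark_query() {
  local query="$1" hot_runs="$2"
  local cold_ms hot_total=0 i
  cold_ms=$(run_one_query "$query")   # cold run, timed separately
  for i in $(seq "$hot_runs"); do     # hot runs
    hot_total=$((hot_total + $(run_one_query "$query")))
  done
  echo "cold=${cold_ms}ms hot_avg=$((hot_total / hot_runs))ms"
}

benchmark_query Q1 4   # prints: cold=100ms hot_avg=100ms
```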

@misiugodfrey
Contributor Author

  1. Always do a cold run of each query. We may want to capture timing of that separately. We may want to allow an optional cache flush before each cold run to simulate cold/un-cached reads from disk.
  2. Allow a user-specified number of hot runs, and take the average of those.

@tmostak these are good ideas, but I think it would be better to add them as new features in a subsequent PR. I would rather this focus on how we want runs to occur, and then we can add further tuning and hot/cold profiling once we have settled on the basics.

Comment on lines +82 to +83
if echo "$images" | grep -q "presto-native-worker-cpu"; then
[[ -n $WORKER ]] & echo_error "mismatch in worker types" && exit 1
Contributor

Why does it disallow running with cpu or java?

Contributor Author

It's not disallowing those runs (or if it is, that's a bug); it's checking to see if we already have workers of other types running. It's to make sure we aren't running java workers alongside cpu or gpu workers as the profiling expects us to only be running a single worker type (and a single worker for that matter).

@misiugodfrey I think there is an issue with & echo_error, it should be && echo_error.
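For anyone skimming: a single & backgrounds the left-hand command instead of gating on its result, so the error fires even when the test fails. A minimal demo (illustrative only, using [ rather than [[ for portability):

```shell
#!/bin/sh
# Minimal demo of the '&' vs '&&' bug discussed above.
WORKER=""
echo_error() { echo "ERROR: $*"; }

# Buggy: '&' backgrounds the test, so echo_error runs unconditionally.
buggy() { [ -n "$WORKER" ] & echo_error "mismatch in worker types"; wait; }

# Fixed: '&&' runs echo_error only when the test succeeds.
fixed() { [ -n "$WORKER" ] && echo_error "mismatch in worker types"; true; }

buggy   # prints the error even though WORKER is empty
fixed   # prints nothing
```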

Contributor

@karthikeyann karthikeyann left a comment

I recreated velox-testing.

I added the changes in this PR review to make it work on a fresh system (arm).

velox-testing/presto/scripts$ ./run_integ_test.sh -b tpch passed (Q15 unstable due to known issue: floating point join key)

ENV NUM_THREADS=${NUM_THREADS}

RUN rpm --import https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub && dnf config-manager --add-repo "https://developer.download.nvidia.com/devtools/repos/rhel$(source /etc/os-release; echo ${VERSION_ID%%.*})/$(rpm --eval '%{_arch}' | sed s/aarch/arm/)/" && dnf install -y nsight-systems-cli-2025.5.1.121
RUN RUN dnf install -y -q libnvjitlink-devel-12-8
Contributor

Suggested change
RUN RUN dnf install -y -q libnvjitlink-devel-12-8
RUN dnf install -y -q libnvjitlink-devel-12-8

Comment on lines +229 to +230
local table_dir="/var/lib/presto/data/hive/data/integration_test/tpch/$table_name"
local create_table=$(cat $sql_file | sed "s+{file_path}+$table_dir+g")
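A side note on the substitution above (illustrative snippet, not from the PR): sed accepts any delimiter for an s command, and + is used here so the slashes inside $table_dir don't need escaping:

```shell
#!/bin/sh
# Using '+' as the sed delimiter avoids escaping the '/' characters
# that appear in the substituted path.
table_dir="/var/lib/presto/data/hive/data/integration_test/tpch/lineitem"
template="external_location = '{file_path}'"
echo "$template" | sed "s+{file_path}+$table_dir+g"
# prints: external_location = '/var/lib/presto/data/hive/data/integration_test/tpch/lineitem'
```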
Contributor

Please make the table dir configurable to register existing data (for example, existing SF100 data in a directory).

Contributor Author

This should now be functional using the --data and/or --schema options.

-p, --profile Profile queries with nsys.
-q, --queries Set of benchmark queries to run. This should be a comma-separated list of query numbers.
By default, all benchmark queries are run.
-l, --command-line Run queries via presto-cli instead of curl.
Contributor

Please add --schema to the help.

Contributor Author

Added --schema to the help options, as well as new options for --data and --coordinator.


parse_args "$@"
detect_containers
mkdir -p "$BASE_DIR/benchmark_output/tpch"
Contributor

This has a permission issue the first time because benchmark_output might have been created by docker as the root user.

Contributor Author

I've changed it so that we make sure benchmark_output exists before we mount it (so docker is not responsible for creating it). This should fix the permissions issues.


run_outputs+=("$output_json")
echo "$output_json" > "$OUTPUT_DIR/Q$query.I$i.summary.json"
done
[ -z "$CREATE_PROFILES" ] || stop_profile
Contributor Author

Right now we generate one profile that covers all non-warmup iterations for a query. Not sure if we want that, or only profiles on some (one) iteration(s)? Up for debate...

local processed_bytes=$(echo "$stats" | jq -r '.processedBytes // 0')
local cpu_time_ms=$(echo "$stats" | jq -r '.cpuTimeMillis // 0')
local wall_time_ms=$(echo "$stats" | jq -r '.wallTimeMillis // 0')
local elapsed_time_ms=$(echo "$stats" | jq -r '.elapsedTimeMillis // 0')
Contributor

elapsedTimeMillis is the accurate query execution time; the other numbers are wrong.
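So only the first of the four extractions needs keeping. A minimal sketch (assuming jq is on the PATH; elapsed_ms is an invented helper name):

```shell
#!/bin/sh
# Sketch: pull only elapsedTimeMillis (the field called accurate above)
# from a query's stats JSON. Requires jq.
elapsed_ms() {
  echo "$1" | jq -r '.elapsedTimeMillis // 0'
}

stats='{"elapsedTimeMillis": 1234, "cpuTimeMillis": 9}'
elapsed_ms "$stats"   # prints 1234
```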

Comment on lines +348 to +353
local start_time=$(date +%s.%N)
run_query "$sql" "$query"
local end_time=$(date +%s.%N)
[ -n "$FINAL_RESPONSE" ] && echo "$FINAL_RESPONSE" > "$OUTPUT_DIR/Q$query.I$i.out.json"
local execution_time=$(echo "$end_time - $start_time" | bc -l)
local output_json=$(filter_output "$query" "$execution_time" "$FINAL_RESPONSE")
Contributor

Please don't use this method to calculate the runtime. It will vary a lot.

Contributor

Use elapsedTimeMillis for the curl method.
For presto-cli, try running time inside docker to get a better estimate (even that will be higher than the curl method).

Contributor

Or get elapsedTime and executionTime from

http://localhost:8080/v1/query/${id}
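That could look something like the following (a sketch, assuming jq; the queryStats field names are assumptions based on Presto's QueryInfo JSON and should be verified against the running version):

```shell
#!/bin/sh
# Sketch: read per-query timings from the coordinator's REST API.
# The queryStats.* field names are assumptions; verify them against
# the /v1/query output of your Presto version.
format_times() {
  jq -r '"elapsed=\(.queryStats.elapsedTime) execution=\(.queryStats.executionTime)"'
}

query_times() {
  curl -s "http://localhost:8080/v1/query/$1" | format_times
}
```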

Contributor Author

I've changed this so that the cli path obtains the elapsed time via the web UI, so we are no longer performing any timing ourselves (just relying on the times reported by presto).

@karthikeyann
Contributor

As part of the PR, please create the queries.json files too and check them in.

@misiugodfrey misiugodfrey marked this pull request as draft October 3, 2025 22:20
@misiugodfrey
Contributor Author

Changed this to draft PR to reflect that it is no longer intended to land. It has been replaced with PR #56 which uses the python interface.

@simoneves
Contributor

Changed this to draft PR to reflect that it is no longer intended to land. It has been replaced with PR #56 which uses the python interface.

Can we just close it, then? :)
