Conversation
ENV EXTRA_CMAKE_FLAGS=${EXTRA_CMAKE_FLAGS}
ENV NUM_THREADS=${NUM_THREADS}

RUN rpm --import https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub && dnf config-manager --add-repo "https://developer.download.nvidia.com/devtools/repos/rhel$(source /etc/os-release; echo ${VERSION_ID%%.*})/$(rpm --eval '%{_arch}' | sed s/aarch/arm/)/" && dnf install -y nsight-systems-cli-2025.5.1.121
Install a pinned version of nsys to the container.
echo "/usr/lib64/presto-native-libs" > /etc/ld.so.conf.d/presto_native.conf

- CMD bash -c "ldconfig && presto_server --etc-dir=/opt/presto-server/etc"
+ CMD bash -c "ldconfig && nsys launch presto_server --etc-dir=/opt/presto-server/etc"
Did this work? I was just assuming this would work, but @karthikeyann tried it last night and couldn't get it to work until he started the container in interactive mode and manually ran nsys launch presto_server ...
This has been working for me, at least insofar as it generates profiles. What issue was he running into when he attempted this approach?
Some of the arguments need to be given during nsys launch itself.
Please add the required nsys arguments here. I got these from our old velox benchmark scripts:
CMD bash -c "ldconfig && nsys launch -t nvtx,cuda,osrt \
--cuda-memory-usage=true \
--cuda-um-cpu-page-faults=true \
--cuda-um-gpu-page-faults=true \
presto_server --etc-dir=/opt/presto-server/etc"
Please also add --gpu-metrics-devices and --cudabacktrace.
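Putting the suggestions above together, the CMD might look like the following. This is only a sketch: the exact flag spellings and whether --gpu-metrics-devices/--cudabacktrace are accepted by nsys launch (rather than nsys start or nsys profile) should be verified against the docs for the pinned nsys 2025.5.1 version, and the =all values are assumptions.

```dockerfile
# Sketch only: verify flag placement against the pinned nsys version.
CMD bash -c "ldconfig && nsys launch -t nvtx,cuda,osrt \
    --cuda-memory-usage=true \
    --cuda-um-cpu-page-faults=true \
    --cuda-um-gpu-page-faults=true \
    --gpu-metrics-devices=all \
    --cudabacktrace=all \
    presto_server --etc-dir=/opt/presto-server/etc"
```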
May I request options to include certain nsys options? Personally, I'm using
I've added the launch params, but
@misiugodfrey thanks for this. I think we should:
@tmostak these are good ideas, but I think it would be better to add them as new features in a subsequent PR. I would rather this focus on how we want runs to occur, and then we can add further tuning and hot/cold profiling once we have settled on the basics.
presto/scripts/run_benchmarks.sh (Outdated)
if echo "$images" | grep -q "presto-native-worker-cpu"; then
  [[ -n $WORKER ]] & echo_error "mismatch in worker types" && exit 1
Why does it disallow running with cpu or java?
It's not disallowing those runs (or if it is, that's a bug); it's checking whether we already have workers of other types running. It's to make sure we aren't running java workers alongside cpu or gpu workers, since the profiling expects only a single worker type to be running (and a single worker, for that matter).
@misiugodfrey I think there is an issue with & echo_error, it should be && echo_error.
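For illustration, the distinction matters because a single & backgrounds the test instead of chaining it, so echo_error would run unconditionally. A minimal sketch of the intended check (echo_error here is a stand-in for the script's helper, and the function name is made up):

```shell
#!/usr/bin/env bash
# Stand-in for the script's echo_error helper (assumed to print to stderr).
echo_error() { echo "ERROR: $1" >&2; }

check_no_other_worker() {
  local worker="$1"
  # '&&' chains the test and the error; a single '&' would background
  # '[[ -n $worker ]]' and run echo_error every time.
  if [[ -n $worker ]]; then
    echo_error "mismatch in worker types"
    return 1
  fi
}
```

Called with a non-empty worker type it prints the error and returns 1; with an empty argument it returns 0.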
karthikeyann left a comment
I recreated velox-testing
- with this PR #33,
- with presto PR (prestodb/presto#25899),
- with velox branch (https://github.com/rapidsai/velox/tree/merged-prs)
I added the changes from this PR review to make it work on a fresh system (arm).
velox-testing/presto/scripts$ ./run_integ_test.sh -b tpch passed (Q15 unstable due to a known issue: floating point join key)
ENV NUM_THREADS=${NUM_THREADS}

RUN rpm --import https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub && dnf config-manager --add-repo "https://developer.download.nvidia.com/devtools/repos/rhel$(source /etc/os-release; echo ${VERSION_ID%%.*})/$(rpm --eval '%{_arch}' | sed s/aarch/arm/)/" && dnf install -y nsight-systems-cli-2025.5.1.121
RUN RUN dnf install -y -q libnvjitlink-devel-12-8
Suggested change:
- RUN RUN dnf install -y -q libnvjitlink-devel-12-8
+ RUN dnf install -y -q libnvjitlink-devel-12-8
presto/scripts/run_benchmarks.sh (Outdated)
local table_dir="/var/lib/presto/data/hive/data/integration_test/tpch/$table_name"
local create_table=$(cat $sql_file | sed "s+{file_path}+$table_dir+g")
please make the table dir configurable to register existing data (for example existing SF100 data in a directory)
This should now be functional using the --data and/or --schema options.
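A sketch of how the substitution can pick up a configurable directory, as the reviewer requested. The variable and function names here are illustrative, not the script's actual ones; the default path mirrors the integration-test layout quoted above:

```shell
#!/usr/bin/env bash
# DATA_DIR can be overridden (e.g. by a --data flag) to register existing
# data such as a pre-generated SF100 directory; the default mirrors the
# integration-test layout.
DATA_DIR="${DATA_DIR:-/var/lib/presto/data/hive/data/integration_test/tpch}"

render_create_table() {
  local table_name="$1" sql_template="$2"
  local table_dir="$DATA_DIR/$table_name"
  # '+' as the sed delimiter avoids clashing with '/' in the path
  echo "$sql_template" | sed "s+{file_path}+$table_dir+g"
}
```

Pointing DATA_DIR at an existing directory then renders CREATE TABLE statements against that data without regenerating it.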
-p, --profile       Profile queries with nsys.
-q, --queries       Set of benchmark queries to run. This should be a comma-separated list of query numbers.
                    By default, all benchmark queries are run.
-l, --command-line  Run queries via presto-cli instead of curl.
please add --schema to help
Added --schema to the help options, as well as new options for --data and --coordinator.
presto/scripts/run_benchmarks.sh (Outdated)

parse_args "$@"
detect_containers
mkdir -p "$BASE_DIR/benchmark_output/tpch"
This has a permission issue the first time because benchmark_output might have been created by docker as the root user.
I've changed it so that we make sure benchmark_output exists before we mount it (so docker is not responsible for creating it). This should fix the permissions issues.
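The fix can be as simple as creating the directory on the host before the container mounts it. A sketch under assumptions (the docker invocation is indicative only and is shown as a comment; BASE_DIR defaults to the working directory here):

```shell
#!/usr/bin/env bash
# If dockerd creates the bind-mount target, it is owned by root and later
# host-side writes fail. Creating it first makes the current user the owner.
BASE_DIR="${BASE_DIR:-$PWD}"
OUTPUT_DIR="$BASE_DIR/benchmark_output/tpch"
mkdir -p "$OUTPUT_DIR"
# docker run -v "$OUTPUT_DIR:/output" ...   # now mounts an existing, user-owned dir
```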
  run_outputs+=("$output_json")
  echo "$output_json" > "$OUTPUT_DIR/Q$query.I$i.summary.json"
done
[ -z "$CREATE_PROFILES" ] || stop_profile
Right now we generate one profile that covers all non-warmup iterations for a query. Not sure if we want that, or profiles for only some (or one) iteration(s)? Up for debate...
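If per-iteration profiles are preferred, the collection could be restarted around each iteration using nsys's start/stop session controls. This is a sketch only: it assumes the server was started under nsys launch so a session exists, the --output flag spelling should be checked against the pinned nsys version, and $OUTPUT_DIR, $query, and $i come from the surrounding loop.

```shell
#!/usr/bin/env bash
# Sketch: one profile per iteration instead of one per query.
# Assumes presto_server is running under `nsys launch`.
start_profile() {
  local query="$1" iter="$2"
  nsys start --output "$OUTPUT_DIR/Q$query.I$iter"
}

stop_profile() {
  nsys stop
}

# In the iteration loop this would bracket each run:
#   start_profile "$query" "$i"
#   run_query "$sql" "$query"
#   stop_profile
```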
presto/scripts/run_benchmarks.sh (Outdated)
local processed_bytes=$(echo "$stats" | jq -r '.processedBytes // 0')
local cpu_time_ms=$(echo "$stats" | jq -r '.cpuTimeMillis // 0')
local wall_time_ms=$(echo "$stats" | jq -r '.wallTimeMillis // 0')
local elapsed_time_ms=$(echo "$stats" | jq -r '.elapsedTimeMillis // 0')
elapsedTimeMillis is the accurate query execution time; the other numbers are wrong.
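For instance, with a captured stats payload, only elapsedTimeMillis would be kept as the timing of record. The sample JSON below is made up for illustration; in the script the payload comes from the query response:

```shell
#!/usr/bin/env bash
# Made-up sample of the stats JSON for illustration.
stats='{"elapsedTimeMillis": 1234, "cpuTimeMillis": 987, "wallTimeMillis": 456}'

# elapsedTimeMillis is the end-to-end time reported by Presto itself,
# so prefer it over any client-side wall-clock measurement.
elapsed_time_ms=$(echo "$stats" | jq -r '.elapsedTimeMillis // 0')
echo "$elapsed_time_ms"
```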
presto/scripts/run_benchmarks.sh (Outdated)
local start_time=$(date +%s.%N)
run_query "$sql" "$query"
local end_time=$(date +%s.%N)
[ -n "$FINAL_RESPONSE" ] && echo "$FINAL_RESPONSE" > "$OUTPUT_DIR/Q$query.I$i.out.json"
local execution_time=$(echo "$end_time - $start_time" | bc -l)
local output_json=$(filter_output "$query" "$execution_time" "$FINAL_RESPONSE")
Please don't use this method to calculate the runtime. It will vary a lot.
Use elapsedTimeMillis for the curl method.
For presto-cli, try running time inside docker to get a better estimate (even that will be higher than the curl method).
Or get elapsedTime and executionTime from http://localhost:8080/v1/query/${id}
I've changed this so that the cli path obtains the elapsed time via the web UI, so we are no longer performing any timing ourselves (just relying on the times reported by presto).
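A sketch of pulling the server-side timings from the query endpoint the reviewer pointed at. The curl line is commented out and replaced with a canned response here so the snippet is self-contained; the field names under queryStats are an assumption and should be verified against the coordinator's actual payload:

```shell
#!/usr/bin/env bash
# Real usage would fetch the coordinator's view of the query:
#   response=$(curl -s "http://localhost:8080/v1/query/${id}")
# Canned response for illustration; field names are an assumption.
response='{"queryStats":{"elapsedTime":"2.34s","executionTime":"2.10s"}}'

elapsed=$(echo "$response" | jq -r '.queryStats.elapsedTime')
execution=$(echo "$response" | jq -r '.queryStats.executionTime')
echo "elapsed=$elapsed execution=$execution"
```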
As part of the PR, please create the queries.json files too and check them in.
Changed this to a draft PR to reflect that it is no longer intended to land. It has been replaced with PR #56, which uses the python interface.
Can we just close it, then? :)
Add a simple benchmarking script that will run benchmarks based on what is in the testing data directory. Offers optional profiling using nsys.
NOTE: There is ongoing work to perform this kind of benchmarking using python scripts (which will re-use a lot of our integration test work, de-duplicating a lot of code), so this item will likely stand as a placeholder until it can be replaced with the preferred python option once we verify they behave the same.