
Performance issue when using batch manager #707

Closed
sleepwalker2017 opened this issue Dec 20, 2023 · 1 comment
Labels: stale, triaged (Issue has been triaged by maintainers)

sleepwalker2017 commented Dec 20, 2023

I'm using commit a21e2f85178111fed9812bb88c2cc7411b25f0ba on 2 * A30 GPUs, since the latest commit doesn't work for me (see #649).

I find that when I build the engine with different max_batch_size values, the performance differs a lot.

Here is the token throughput when building with different max_batch_size values and running with different max_num_sequences values.

You can see that performance degrades once max_batch_size is 32 or larger.

[image: token throughput table for varying max_batch_size and max_num_sequences]

The benchmark script is:

```
CUDA_VISIBLE_DEVICES=0,1 mpirun -n 2 --allow-run-as-root benchmarks/gptManagerBenchmark --model llama --engine_dir ../../examples/llama/./tmp/llama/13B/trt_engines/fp16/2-gpu/ --dataset /data/TensorRT-LLM/examples/llama/preprocessed_dataset_256.json --max_num_sequences $1
```

The build script is:

```
python build.py --model_dir /data/weilong.yu/vicuna-13b/vicuna-13b-v1.5/ \
                --dtype float16 \
                --use_gpt_attention_plugin float16 \
                --use_gemm_plugin float16 \
                --output_dir ./tmp/llama/13B/trt_engines/fp16/2-gpu/ \
                --max_batch_size $1 \
                --tp_size 2 \
                --world_size 2 --parallel_build \
                --use_inflight_batching \
                --remove_input_padding \
                --paged_kv_cache \
                --enable_context_fmha
```
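The comparison above implies a two-level sweep: rebuild the engine for each max_batch_size, then benchmark it at each max_num_sequences. A minimal sketch of that driver loop, assuming the two scripts above each take their value as `$1`; the batch sizes chosen here are illustrative, and commands are echoed rather than executed since the engine paths are machine-specific:

```shell
#!/bin/sh
# Hypothetical sweep over build-time max_batch_size and run-time max_num_sequences.
# Replace the echo with the real build/benchmark invocations from above.
for bs in 16 32 48 64; do
  echo "bash build_engine.sh ${bs}"          # assumed wrapper around build.py --max_batch_size
  for seqs in 16 32 48 64; do
    echo "bash run_benchmark.sh ${seqs}"     # assumed wrapper around gptManagerBenchmark --max_num_sequences
  done
done
```

Each inner iteration reuses the engine built in the outer step, so the 4×4 grid needs only four (slow) engine builds.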
@byshiue byshiue added the triaged Issue has been triaged by maintainers label Dec 25, 2023
hello-11 (Collaborator) commented:
@sleepwalker2017 Do you still have the problem? If not, we will close it soon.
