
random_seed seems to be ignored (or at least inconsistent) for inflight_batcher_llm #468

Open · 2 of 4 tasks
dyoshida-continua opened this issue May 21, 2024 · 4 comments
Labels: bug (Something isn't working), triaged (Issue has been triaged by maintainers)

@dyoshida-continua commented May 21, 2024

System Info

I've converted Llama 3 using TensorRT-LLM's convert_checkpoint script, and am serving it with the inflight_batcher_llm template. I'm trying to get diverse samples for a fixed input, but I've found that if I make several requests concurrently, several will have identical outputs.

I'm setting top_p=1, top_k=1024, temperature=1.0, beam_width=1, and generating a unique random seed for each request. The requests are being made over the gRPC API, and I'm using v0.9.0 of TensorRT-LLM and tensorrtllm_backend.
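
This is roughly how I'm building each request over gRPC. It is a minimal sketch using the Python tritonclient; the model name "ensemble" and the tensor names text_input, max_tokens, top_k, top_p, temperature, beam_width, and random_seed are assumptions based on the standard inflight_batcher_llm ensemble config.pbtxt, so adjust them if your setup differs:

```python
import numpy as np
import tritonclient.grpc as grpcclient

def build_inputs(prompt: str, seed: int):
    # Wrap a single-row input tensor for the ensemble model.
    def tensor(name, values, triton_dtype, np_dtype):
        t = grpcclient.InferInput(name, [1, len(values)], triton_dtype)
        t.set_data_from_numpy(np.array([values], dtype=np_dtype))
        return t

    return [
        tensor("text_input", [prompt.encode()], "BYTES", object),
        tensor("max_tokens", [128], "INT32", np.int32),
        tensor("top_k", [1024], "INT32", np.int32),
        tensor("top_p", [1.0], "FP32", np.float32),
        tensor("temperature", [1.0], "FP32", np.float32),
        tensor("beam_width", [1], "INT32", np.int32),
        tensor("random_seed", [seed], "UINT64", np.uint64),  # unique per request
    ]
```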

Who can help?

@byshiue

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Serve a model (essentially following this guide, with some settings changed: https://developer.nvidia.com/blog/turbocharging-meta-llama-3-performance-with-nvidia-tensorrt-llm-and-nvidia-triton-inference-server/)
  2. Make 5 gRPC requests concurrently (see the sketch after these steps)
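
A sketch of step 2, reusing the hypothetical build_inputs() helper above; the "ensemble" model name and "text_output" tensor are likewise assumptions from the standard setup:

```python
import random
from concurrent.futures import ThreadPoolExecutor

import tritonclient.grpc as grpcclient

def send_request(seed: int):
    # One client per call so the 5 requests are fully independent.
    client = grpcclient.InferenceServerClient("localhost:8001")
    result = client.infer("ensemble", build_inputs("Hello, my name is", seed))
    client.close()
    return result.as_numpy("text_output").flatten()[0]

seeds = random.sample(range(1, 1_000_000), 5)
with ThreadPoolExecutor(max_workers=5) as pool:
    outputs = list(pool.map(send_request, seeds))

# Expected: 5 distinct completions. Observed: several come back identical.
print(len(set(outputs)), "distinct outputs out of", len(outputs))
```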

Expected behavior

I expect each request with a different seed to yield a different response.

Actual behavior

Several of the 5 responses are consistently identical.

Additional notes

I changed my test script to wait for each response before sending the next request, and with that change all 5 outputs are distinct, so the concurrency/in-flight batching really does seem to be the problem.
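
The serial variant is essentially just the loop below (same hypothetical send_request() as in the reproduction sketch):

```python
# Wait for each response before sending the next request;
# this reliably produces 5 distinct outputs for me.
for seed in random.sample(range(1, 1_000_000), 5):
    print(send_request(seed))
```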

dyoshida-continua added the bug label May 21, 2024
@dyoshida-continua (Author)

Another interesting detail: the identical sequences I observe in the concurrent case are the same from run to run, even though I'm sampling the random seed from 1 to 1,000,000.

For example, with the input <|begin_of_text|>Hello, my name is, I saw the continuation "Ahmed, and I am an experienced Software Engineer with proficiency..." in 3/5 responses, and then in 2/5 responses on the next run. I did not observe this prefix at all when making requests serially.

@dyoshida-continua (Author)

@byshiue I incorrectly typed your name when opening this issue originally. Can you comment on whether there's a workaround for this? It's currently making batch inference effectively useless.

@chiendb97

@dyoshida-continua I applied the solution described in this pull request: NVIDIA/TensorRT-LLM#1742, and it resolved the issue for me.

@byshiue (Collaborator) commented Jun 7, 2024

Thank you for helping to reply, @chiendb97. Since NVIDIA/TensorRT-LLM#1742 is related to a fix for the random seed setting, it might be related to your issue, @dyoshida-continua. Could you give it a try?

byshiue self-assigned this Jun 7, 2024
byshiue added the triaged label Jun 7, 2024