
Have trouble in using flan-t5-xxl #474

Closed
Vergissmeinicht opened this issue Nov 27, 2023 · 7 comments
Assignees
Labels
triaged Issue has been triaged by maintainers

Comments

@Vergissmeinicht

I've been trying to run flan-t5-xxl following the enc_dec example, but I failed to get correct output from TRT inference. The T5 (t5-small) and Flan-T5 (google/flan-t5-small) examples do produce correct output, so I wonder whether flan-t5-xxl is supported or whether something is wrong with my build and run code. Hope to get your response. Many thanks.

@jdemouth-nvidia jdemouth-nvidia added the triaged Issue has been triaged by maintainers label Nov 27, 2023
@symphonylyh
Collaborator

@Vergissmeinicht we're investigating an issue in #500 that is probably related. Did you observe the same incorrect pattern, where only the 1st token is correct? The error might be hidden in the t5-small case, which is why it wasn't caught earlier.

@symphonylyh
Collaborator

We recommend trying the latest v0.6.1 release: https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.6.1 since it contains several related bug fixes. Marking as closed for now; please don't hesitate to reopen if you still encounter the issue.

@Vergissmeinicht
Author

> We recommend trying the latest v0.6.1 release: https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.6.1 since it contains several related bug fixes. Marking as closed for now; please don't hesitate to reopen if you still encounter the issue.

Problem solved! Many thanks.

@symphonylyh
Collaborator

@Vergissmeinicht Nice to hear!

@symphonylyh
Collaborator

symphonylyh commented Dec 13, 2023

@Vergissmeinicht important note: I found a tiny but important fix for Flan-T5.
As a temporary fix, please manually comment out these lines when you run Flan-T5 (but NOT T5!). T5 applies this rescaling while Flan-T5 does not, but the current v0.6.1 implementation always falls into the default=True path.
This will be fixed within a week in the v0.7 release and/or a weekly dev release, but if you have Flan-T5 deployments in flight it would be good to apply the local fix first. Thanks.

Update: the fix is already merged into the latest main and will be included in the v0.7 release soon!
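To illustrate the distinction described above, here is a minimal, hypothetical sketch (not TensorRT-LLM's actual code) of a rescale-before-LM-head flag. The function name, flag name, and hidden size are assumptions for illustration only; the point is that T5 (with tied embeddings) scales decoder output by 1/sqrt(d_model) before the LM head, while Flan-T5 must skip that step, so a flag that silently defaults to True produces wrong Flan-T5 logits.

```python
import math

D_MODEL = 512  # hypothetical hidden size, for illustration only


def lm_head_input(hidden_states, rescale_before_lm_head):
    """Return the hidden states to feed into the LM head.

    T5-style models rescale by 1/sqrt(d_model) before the (tied) LM head;
    Flan-T5-style models pass the hidden states through unchanged. The bug
    described in the comment above amounts to this flag always being True.
    """
    if rescale_before_lm_head:
        scale = D_MODEL ** -0.5
        return [h * scale for h in hidden_states]
    return hidden_states


# T5-style path: values are scaled down by 1/sqrt(512).
t5_out = lm_head_input([1.0, 2.0], rescale_before_lm_head=True)

# Flan-T5-style path: values are passed through unchanged.
flan_out = lm_head_input([1.0, 2.0], rescale_before_lm_head=False)
```

With the buggy default, Flan-T5 logits would be uniformly scaled down, which can shift the argmax after the first generated token and produce the garbled-output pattern reported in this thread.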

@shannonphu

@symphonylyh Thanks for the heads up! Just to confirm, does this change only need to happen when we perform inference, i.e. https://github.com/NVIDIA/TensorRT-LLM/blob/rel/examples/enc_dec/run.py ? Or do we need to rebuild the engines or image?

@symphonylyh
Collaborator

@shannonphu Yes, you need to rebuild the engine. The change is in model.py, which directly maps to what the engine computes, so the engines have to be rebuilt. If a change only touches run.py and not model.py, the engines can usually be reused.

Also, an update: the fix above has been merged into the latest main and will soon be in v0.7. You can simply update to main, rebuild the engines, and everything should work fine.
