Detokenize with prompt token ids #1753

AllentDan · 2024-06-11T07:03:30Z

Please note that detokenizing with prompt token ids is slightly slower than detokenizing without them. The faster the model, the larger the relative impact. In my testing on internlm2-chat-1_8b, throughput was about 3% lower (49.94 RPS vs 51.55 RPS).

Update: I also tested internlm2-chat-7b, and the impact was small enough to be negligible (21.984 RPS vs 21.815 RPS). The current implementation even measured slightly faster there, which is most likely measurement error.
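For reference, the relative differences behind these figures can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper name `relative_change` is made up for illustration, and it assumes the first number in each pair is the run with prompt token ids (this PR) and the second is the run without.

```python
def relative_change(candidate_rps: float, baseline_rps: float) -> float:
    """Return the throughput change of candidate vs. baseline, in percent."""
    return (candidate_rps - baseline_rps) / baseline_rps * 100.0


# Assumption: first value = with prompt token ids, second = without.
# internlm2-chat-1_8b figures quoted above.
change_1_8b = relative_change(49.94, 51.55)
# internlm2-chat-7b figures quoted above.
change_7b = relative_change(21.984, 21.815)

print(f"internlm2-chat-1_8b: {change_1_8b:+.2f}%")
print(f"internlm2-chat-7b:   {change_7b:+.2f}%")
```

Under that assumption, the 1.8B result is a drop of roughly 3%, while the 7B difference is under 1% and has the opposite sign, which is consistent with run-to-run noise.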

detokenize with prompt token ids

40cf64d

lvhan028 added the Bug:P1 label Jun 13, 2024

lvhan028 self-requested a review June 17, 2024 10:37

AllentDan mentioned this pull request Jun 20, 2024

[Bug] Space is incorrectly removed from start of generated text for /v1/completion endpoint #1743

Closed

2 tasks

lvhan028 approved these changes Jun 22, 2024


lvhan028 merged commit fd0cefb into InternLM:main Jun 22, 2024
5 checks passed
