
Move the 1st token finish time to not include 2nd step kv pad time #292

Merged
libinta merged 1 commit into HabanaAI:habana-main from shepark:fix_1st_token_latency on Jul 11, 2024

@shepark commented Jul 11, 2024

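Move the point where the first-token finish time is recorded so that it no longer includes the KV padding time of the second generation step. (Ported from the OH (optimum-habana) PR huggingface#1091.)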

Fixes # (SW-191601)
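
For readers who want to see the idea in isolation, here is a minimal sketch of the timing change, not the code from this PR. It assumes a toy decode loop in which KV padding runs at the start of every step after the first; `timed_generate`, `decode_step`, and `pad_kv_cache` are hypothetical names invented for illustration and are not part of optimum-habana.

```python
import time
from typing import Callable, List, Optional, Tuple


def timed_generate(
    num_tokens: int,
    decode_step: Callable[[int], int],
    pad_kv_cache: Callable[[int], None],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[List[int], Optional[float]]:
    """Run a toy decode loop and return (tokens, first_token_latency).

    decode_step and pad_kv_cache are hypothetical stand-ins for the real
    model step and KV cache padding; they are assumptions of this sketch.
    """
    start = clock()
    first_token_latency: Optional[float] = None
    tokens: List[int] = []
    for step in range(num_tokens):
        if step > 0:
            # Padding done here prepares the KV cache for this later step,
            # so its cost should not be charged to the first token.
            pad_kv_cache(step)
        tokens.append(decode_step(step))
        if step == 0:
            # Stop the first-token timer right after the first step finishes,
            # before any padding for the second step has run.
            first_token_latency = clock() - start
    return tokens, first_token_latency
```

Passing an injectable `clock` is a design choice of the sketch: it lets the measurement be checked with a fake clock instead of relying on wall-clock timing.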

@libinta libinta merged commit def226c into HabanaAI:habana-main Jul 11, 2024
kalyanjk pushed a commit to kalyanjk/optimum-habana-fork that referenced this pull request Jul 15, 2024
@astachowiczhabana

huggingface#1091

xinyu-intel pushed a commit that referenced this pull request Mar 4, 2025
)

Change-Id: I082aa87150b5c358f6a77f208fa196196e07e5b7
astachowiczhabana added a commit that referenced this pull request May 23, 2025
Co-authored-by: Adam Stachowicz <astachowicz@habana.ai>
astachowiczhabana added a commit that referenced this pull request May 23, 2025
Co-authored-by: Adam Stachowicz <astachowicz@habana.ai>