@vivekmig

By default, generation in transformers uses past_key_values to cache the key and value tensors computed for earlier tokens, which speeds up the forward passes for subsequent tokens. This PR adds a flag, and uses the corresponding helpers from the transformers generation utils, so that caching follows the same approach.
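
For readers unfamiliar with key/value caching, here is a minimal sketch of the idea in plain Python. It is not the code from this PR and does not use the transformers API: `ToyAttention`, `forward_full`, `forward_cached`, and `compare` are hypothetical names for a single-head toy attention layer, assumed only for illustration. It shows that appending each new token's key and value to a cache gives the same output as recomputing them for the whole prefix, with far fewer projections.

```python
import math
import random


def project(matrix, vector):
    """Multiply a square weight matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class ToyAttention:
    """Hypothetical single-head attention layer with random weights."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def make():
            return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

        self.w_q, self.w_k, self.w_v = make(), make(), make()
        self.projections = 0  # number of key/value projections performed

    def _key_value(self, token):
        self.projections += 1
        return project(self.w_k, token), project(self.w_v, token)

    def forward_full(self, tokens):
        """No cache: recompute keys and values for the whole prefix."""
        pairs = [self._key_value(t) for t in tokens]
        keys = [k for k, _ in pairs]
        values = [v for _, v in pairs]
        return attend(project(self.w_q, tokens[-1]), keys, values)

    def forward_cached(self, token, cache):
        """With cache: project only the new token and append to the cache."""
        key, value = self._key_value(token)
        cache[0].append(key)
        cache[1].append(value)
        return attend(project(self.w_q, token), cache[0], cache[1])


def compare(num_tokens=6, dim=4):
    """Run both strategies over the same random token embeddings."""
    rng = random.Random(1)
    tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]

    uncached = ToyAttention(dim)
    full = [uncached.forward_full(tokens[: i + 1]) for i in range(num_tokens)]

    cached = ToyAttention(dim)
    cache = ([], [])  # (keys, values) accumulated across steps
    incremental = [cached.forward_cached(t, cache) for t in tokens]

    return full, incremental, uncached.projections, cached.projections
```

In this toy setup, generating n tokens without a cache performs n(n+1)/2 key/value projections, while the cached path performs n. A real model caches one such key/value pair per layer and attention head, which is the role past_key_values plays in transformers.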

@facebook-github-bot

@vivekmig has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@vivekmig merged this pull request in fefcb5b.
