Conversation

@kashif
Collaborator

@kashif kashif commented May 22, 2025

No description provided.

@kashif kashif marked this pull request as draft May 22, 2025 11:03
@kashif kashif marked this pull request as ready for review May 22, 2025 13:39
@kashif kashif requested review from ariG23498 and lusxvr May 22, 2025 13:39
@kashif kashif changed the title position ids kv-cache position ids for kv-cache May 22, 2025
@lusxvr
Contributor

lusxvr commented May 22, 2025

How does this differ from @ariG23498's PR (#69)? Did you test it with the benchmark scripts to see if there is a noticeable speedup?

@kashif
Collaborator Author

kashif commented May 22, 2025

@lusxvr this is a fix to PR #69

@lusxvr
Contributor

lusxvr commented May 22, 2025

Shouldn't we put it in the same PR then? Or is it also a general fix? I am a bit confused about what exactly it does.

@kashif
Collaborator Author

kashif commented May 22, 2025

I didn't have permissions, so I made a PR on the PR... sorry about the confusion. It adds the position_ids so that the RoPE position embeddings are applied correctly, and then we sample new tokens sequentially using the kv-cache up to the max_new_token length in generate.
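For context, the position-id bookkeeping described above can be sketched as follows (a minimal illustration with a hypothetical helper, not the actual nanoVLM API): during prefill the prompt tokens get positions 0..L-1, but once the kv-cache holds those tokens, each decode step feeds only the newly sampled token, so its RoPE position id must continue from the cached length rather than restart at zero.

```python
def next_position_ids(cache_len: int, num_new_tokens: int) -> list[int]:
    """Position ids for tokens appended after `cache_len` cached tokens.

    With a kv-cache, only the new tokens are fed to the model each step,
    so their RoPE positions must continue the absolute numbering.
    """
    return list(range(cache_len, cache_len + num_new_tokens))

# Prefill: a prompt of length 5 gets positions 0..4.
print(next_position_ids(0, 5))  # [0, 1, 2, 3, 4]

# Decode step 1: 5 tokens cached, 1 new token -> position 5.
print(next_position_ids(5, 1))  # [5]

# Decode step 2: cache has grown to 6 -> position 6.
print(next_position_ids(6, 1))  # [6]
```

Without this offset, the single new token would be embedded at position 0 every step, corrupting the rotary embeddings relative to the cached keys.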

@kashif
Collaborator Author

kashif commented May 22, 2025

@ariG23498 will test it out in the morning, I believe.

@lusxvr
Contributor

lusxvr commented May 22, 2025

Ah okay, I see. I added you to the repo, though, so you should have permission!

Collaborator

@ariG23498 ariG23498 left a comment


Loved the implementation.

@ariG23498 ariG23498 merged commit 08e512a into huggingface:kv-cache May 23, 2025
sang-nguyen-ts pushed a commit to tsdocode/nanoVLM that referenced this pull request May 28, 2025
* position ids for rope

* cleanup

* no need for mask

* no mask

* more cleanup

* add back filtering

* more cleanup

* revert the signature of llm's generate and forward

* use self.decoder.lm_use_tokens

* use torch inference_mode

* add back comment

* fix bug

* add back comments

* add back comments
pgryko pushed a commit to pgryko/nanoVLM that referenced this pull request Jul 6, 2025
