Contributor

@vasqu vasqu commented Sep 5, 2024

Since the implementation appears to mirror the one in the transformers repo, I thought I'd share some fixes that were applied there post-merge:

  1. Generation was broken when passing inputs_embeds, ref. huggingface/transformers#32694 (Fix: Mamba2 generation mismatch between input_ids and inputs_embeds)
  2. The norm_before_gate flag was used incorrectly, which causes misaligned training when using the completely fused kernel, ref. huggingface/transformers#32686 (Fix: Mamba2 norm_before_gate usage)
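
For context on the second item, here is a minimal pure-Python sketch of why the flag matters. It assumes the usual gated-RMSNorm reading of norm_before_gate: when true, the input is normalized and then multiplied by the SiLU gate; when false, the gate is applied first and the result is normalized. The helper names (`silu`, `rms_norm`, `gated_rms_norm`) are hypothetical, the learned weight is omitted, and this is not the fused kernel itself, only an illustration that the two orderings give different outputs.

```python
import math


def silu(v):
    # SiLU (swish) activation: v * sigmoid(v)
    return v / (1.0 + math.exp(-v))


def rms_norm(xs, eps=1e-6):
    # Plain RMS normalization without a learned weight (an assumption made
    # to keep the sketch short).
    rms = math.sqrt(sum(v * v for v in xs) / len(xs) + eps)
    return [v / rms for v in xs]


def gated_rms_norm(xs, zs, norm_before_gate):
    # Hypothetical helper: same inputs, two possible orderings.
    gate = [silu(z) for z in zs]
    if norm_before_gate:
        # Normalize first, then apply the gate.
        return [n * g for n, g in zip(rms_norm(xs), gate)]
    # Apply the gate first, then normalize the gated values.
    return rms_norm([x * g for x, g in zip(xs, gate)])


x = [1.0, -2.0, 0.5, 3.0]
z = [0.3, 1.5, -0.7, 2.0]

before = gated_rms_norm(x, z, norm_before_gate=True)
after = gated_rms_norm(x, z, norm_before_gate=False)

# The two orderings disagree, so a kernel called with the wrong flag value
# computes a different function from the reference path.
max_diff = max(abs(a - b) for a, b in zip(before, after))
print(max_diff)
```

If one code path uses one ordering and the fused path the other, the outputs (and therefore the gradients) no longer match, which is the kind of misalignment the linked fix addresses.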

Cool repo btw :)

@yzhangcs yzhangcs merged commit 71bb93c into fla-org:main Sep 5, 2024
Member

yzhangcs commented Sep 5, 2024

Thank you :D

@vasqu vasqu deleted the ma2-post-merge-fixes branch September 5, 2024 18:09