System Info
Python 3.10, optimum @ main, transformers @ main
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
Reproduction:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from optimum.bettertransformer import BetterTransformer
import torch

model_id = "tiiuae/falcon-rw-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model = BetterTransformer.transform(model)

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
```

Falcon attention was refactored in huggingface/transformers@05ea7b7#diff-81c616a9db6f569c579ccf03c30c2f69aa7b65fa40959ac7e882fb8d541891d7. The refactor removed the `maybe_rotary` property and adopted the Llama conventions for rotary embeddings.
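For reference, the refactored Falcon attention follows the Llama-style rotary helpers rather than a single `maybe_rotary` callable. A minimal sketch of that convention, assuming the helper names and signatures used in the Llama modeling code (the refactored Falcon file may differ in details):

```python
import torch


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension into two halves and rotate them,
    # as done by the Llama-style rotary helpers.
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary_pos_emb(q, k, cos, sin, position_ids):
    # cos/sin come from a rotary embedding module; position_ids select the
    # rotation angles per token position before they are applied to q and k.
    cos = cos[position_ids].unsqueeze(1)
    sin = sin[position_ids].unsqueeze(1)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```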
We could modify the use of `maybe_rotary` here with something like:

```python
        submodules = ["query_key_value", "dense", "attention_dropout"]
        if not config.alibi:
            submodules.append("rotary_emb")
```

We would then need to adapt the code here as well, applying rotary embeddings when alibi is not in use; a sketch of that is given below.
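A minimal sketch of that adaptation, assuming the layer keeps a `rotary_emb` submodule only when alibi is disabled. The helper name `maybe_apply_rotary` and the rotary module's call signature are illustrative assumptions, not the actual optimum code; it reuses the `apply_rotary_pos_emb` helper sketched above.

```python
from typing import Optional, Tuple

import torch


def maybe_apply_rotary(
    query_layer: torch.Tensor,
    key_layer: torch.Tensor,
    rotary_emb: Optional[torch.nn.Module],
    position_ids: torch.Tensor,
) -> Tuple[torch.Tensor, torch.Tensor]:
    # Hypothetical helper: when alibi is in use, no rotary_emb submodule is
    # kept (see the submodules change above) and q/k pass through unchanged.
    if rotary_emb is None:
        return query_layer, key_layer
    # When alibi is off, apply Llama-style rotary embeddings to queries and
    # keys before the scaled dot-product attention call.
    cos, sin = rotary_emb(key_layer, seq_len=key_layer.shape[-2])
    return apply_rotary_pos_emb(query_layer, key_layer, cos, sin, position_ids)
```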
Expected behavior
The transformation would succeed.