Skip to content

[generate] move SinkCache to a custom_generate repo#38399

Merged
manueldeprada merged 1 commit into
huggingface:mainfrom
gante:rm_sink_cache
Jun 2, 2025
Merged

[generate] move SinkCache to a custom_generate repo#38399
manueldeprada merged 1 commit into
huggingface:mainfrom
gante:rm_sink_cache

Conversation

@gante

@gante gante commented May 27, 2025

Copy link
Copy Markdown
Contributor

What does this PR do?

Moves the deprecated SinkCache to a custom_generate repo: https://huggingface.co/transformers-community/sink_cache

Users importing SinkCache will see no issue for now. Users using SinkCache will get a message redirecting them to the Hub repo.

Hub usage example:

# requires `transformers>=4.52.0`
from transformers import AutoModelForCausalLM, AutoTokenizer

# Preparing model, tokenizer, and model inputs
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", device_map="auto")
messages = [{"role": "user", "content": "Tell me a story about a cat."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Using sink cache
gen_out = model.generate(
    # usual `generate` arguments
    **model_inputs,
    do_sample=False,
    max_new_tokens=100,
    return_dict_in_generate=True,
    # sink cache arguments (default `window_length=256`)
    custom_generate="transformers-community/sink_cache",
    trust_remote_code=True,
)
print(tokenizer.batch_decode(gen_out.sequences, skip_special_tokens=True))
assert "sinkcache" in str(type(gen_out.past_key_values)).lower()
# ['user\nTell me a story about a cat.\nassistant\n<think>\n\n</think>\n\nOnce upon a time, in a cozy village nestled
# between rolling hills and a sparkling lake, there lived a cat named Luna. Luna was small and fluffy, with a curious
# eyes that sparkled with wonder. She had a soft, warm coat that shimmered like the morning sun, and her tail was
# always wagging in playful motions.\n\nOne day, while exploring the village, Luna noticed a curious sight: a young
# boy playing with a ball on the lake. She followed him closely, her heart racing']

@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@manueldeprada manueldeprada left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ArthurZucker ArthurZucker left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it had a deprecation cycle makes sense!

@manueldeprada manueldeprada merged commit beaed8c into huggingface:main Jun 2, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants