vocab: Support tokenizer for LFM2.5-8B-A1B#23826

Merged: CISC merged 2 commits into ggml-org:master from tdakhran:tarek/feat/liquid7-tokenizer on May 29, 2026

@tdakhran tdakhran commented May 28, 2026

Overview

LFM2.5-8B-A1B shares architecture with LFM2-8B-A1B but comes with a new extended tokenizer.

This PR adds support for it.

GGUFs are uploaded to LiquidAI/LFM2.5-8B-A1B-GGUF

Requirements

@tdakhran tdakhran requested a review from CISC as a code owner May 28, 2026 15:53
@github-actions github-actions Bot added the python python script changes label May 28, 2026
@ngxson ngxson left a comment

🚀

@CISC CISC left a comment

Why both in pre_computed_hashes?

@CISC CISC left a comment

Hmmm, looking more closely the LFM2 pre-tokenizer does not match LFM2.5, it uses a new regex:

'(?i:[sdmt]|ll|ve|re)|[^\\r\\n\\p{L}\\p{N}]?\\p{L}+|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]|\\s+(?!\\S)|\\s
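
The point about a regex difference can be illustrated with a small sketch. The two patterns below are hypothetical and heavily simplified; neither is the real LFM2 or LFM2.5 pre-tokenizer regex. They differ only in how digits are grouped (runs of at most three, loosely modelled on the `\p{N}{1,3}` fragment quoted above, versus unbounded runs), and the standard-library `re` module is used with `\d` as an approximation because it does not support `\p{N}`.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# PATTERN_A splits digit runs into chunks of at most three digits;
# PATTERN_B keeps a whole digit run as a single piece.
PATTERN_A = re.compile(r"\d{1,3}|[^\W\d_]+|\s+|[^\s\w]+|_+")
PATTERN_B = re.compile(r"\d+|[^\W\d_]+|\s+|[^\s\w]+|_+")


def pre_tokenize(pattern, text):
    """Split text into pre-token pieces using the given compiled pattern."""
    return pattern.findall(text)


def differing_inputs(samples):
    """Return the samples on which the two patterns produce different pieces."""
    return [
        s for s in samples
        if pre_tokenize(PATTERN_A, s) != pre_tokenize(PATTERN_B, s)
    ]


samples = [
    "Hello world!",
    "It costs 42 dollars.",
    "Order 1234567 shipped.",
]

for s in differing_inputs(samples):
    print(repr(s))
    print("  A:", pre_tokenize(PATTERN_A, s))
    print("  B:", pre_tokenize(PATTERN_B, s))
```

In this toy setup only the sample with a long digit run splits differently, which is one way two pre-tokenizers can agree on everyday text yet still diverge on specific inputs.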

@tdakhran

Thanks for the feedback, @CISC.

> Why both in pre_computed_hashes?

I tried to place them both in the table, but then re-running convert_hf_to_gguf_update.py leaves conversion/base.py in an incorrect state. So I followed the falcon-h1 example and placed it into pre_computed_hashes.

> Hmmm, looking more closely the LFM2 pre-tokenizer does not match LFM2.5, it uses a new regex:

On the regex, the difference is minimal, and testing on real-life use cases didn't show any difference.

However, we discovered that tool calling doesn't work with this chat template in llama.cpp (it works in other frameworks), and we are currently debugging it.

CISC commented May 29, 2026

> > Why both in pre_computed_hashes?
>
> I tried to place them both in the table, but then re-running convert_hf_to_gguf_update.py leaves conversion/base.py in an incorrect state. Then I followed falcon-h1 example and placed it into pre_computed_hashes.

One (the original) should stay in models, any duplicates go in pre_computed_hashes.

> > Hmmm, looking more closely the LFM2 pre-tokenizer does not match LFM2.5, it uses a new regex:
>
> on regex, the difference is minimal, and testing on real-life use cases didn't show any difference.

Still, since there is an actual difference it would be prudent to add an lfm2.5 pre-tokenizer.
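
As a rough illustration of why two tokenizers can end up pointing at the same pre-tokenizer entry, here is a minimal sketch. It assumes the conversion scripts identify a pre-tokenizer by hashing the token ids produced for a fixed check text; the toy tokenizer, vocabularies, check text, and the name `toy-pre` are all hypothetical and are not taken from llama.cpp.

```python
from hashlib import sha256

# Hypothetical check text and vocabularies, for illustration only.
CHECK_TEXT = "hello tool call"
BASE_VOCAB = {"hello": 1, "call": 2}
EXTENDED_VOCAB = {**BASE_VOCAB, "tool": 3}  # "extended" tokenizer adds a token


def toy_tokenize(text, vocab):
    """Map whitespace-separated words to ids; unknown words become id 0."""
    return [vocab.get(word, 0) for word in text.split()]


def fingerprint(token_ids):
    """Hash the string form of a token-id list into a hex digest."""
    return sha256(str(token_ids).encode()).hexdigest()


# Two different fingerprints registered under the same pre-tokenizer name.
# In this sketch the first entry plays the role of the original tokenizer and
# the second plays the role of a duplicate with a pre-computed hash.
KNOWN_FINGERPRINTS = {
    fingerprint(toy_tokenize(CHECK_TEXT, BASE_VOCAB)): "toy-pre",
    fingerprint(toy_tokenize(CHECK_TEXT, EXTENDED_VOCAB)): "toy-pre",
}


def lookup_pre_tokenizer(vocab):
    """Return the registered pre-tokenizer name for a vocab, or None."""
    return KNOWN_FINGERPRINTS.get(fingerprint(toy_tokenize(CHECK_TEXT, vocab)))


print(lookup_pre_tokenizer(BASE_VOCAB))
print(lookup_pre_tokenizer(EXTENDED_VOCAB))
print(lookup_pre_tokenizer({"hello": 9}))
```

The design point is that the fingerprint depends on the vocabulary as well as the splitting rules, so an extended tokenizer needs its own hash entry even when it reuses an existing pre-tokenizer name.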

@tdakhran tdakhran force-pushed the tarek/feat/liquid7-tokenizer branch from 964007f to 6f8fa55 Compare May 29, 2026 15:56
@tdakhran

@CISC, we reworked the chat template to use a similar regex to lfm2 here: https://huggingface.co/LiquidAI/LFM2.5-8B-A1B/discussions/5

I moved the existing tokenizer back to models; I hope it looks good to merge!

@tdakhran tdakhran requested a review from CISC May 29, 2026 15:59
CISC commented May 29, 2026

> @CISC , we reworked the chat template to use a similar regex to lfm2 here https://huggingface.co/LiquidAI/LFM2.5-8B-A1B/discussions/5 .
>
> I moved back existing tokenizer to models, hope it looks good to merge!

That works too I guess. :)

@CISC CISC merged commit 2084434 into ggml-org:master May 29, 2026
7 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 29, 2026
* origin/master:
vocab : support tokenizer for LFM2.5-8B-A1B (ggml-org#23826)
graph : ensure DS32 kq_mask_lid is F32 (ggml-org#23864)
server: remove obsolete scripts (ggml-org#23870)
ci : update macos release to use macos-26 runner (ggml-org#23878)
download: add option to skip_download (ggml-org#23059)
mtmd: Add DeepSeekOCR 2 Support (ggml-org#20975)
CUDA: Check PTX version on host side to guard PDL dispatch (ggml-org#23530)
server: bump timeout to 3600s (ggml-org#23842)
model : support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation (ggml-org#23346)
llama: use f16 mask for FA to save VRAM (ggml-org#23764)
sync : ggml
ggml : bump version to 0.13.1 (ggml/1523)
ngram-mod : Add missing include (ggml-org#23857)
llama: add llm_graph_input_mtp (ggml-org#23643)
app : move licences to llama-app (ggml-org#23824)
cuda : disables launch_fattn PDL enrollment due to compiler bug (ggml-org#23825)
meta : Add missing `buffer` set in allreduce fallback !COMPUTE clear (ggml-org#23480)
@tdakhran tdakhran deleted the tarek/feat/liquid7-tokenizer branch May 29, 2026 21:11
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
* vocab: Support tokenizer for LFM2.5-8B-A1B

* Keep liquid6 tokenizer in models
turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026
* vocab: Support tokenizer for LFM2.5-8B-A1B

* Keep liquid6 tokenizer in models