Add tokenizer test + revert to C++11 #350

ggerganov · 2023-03-21T09:42:20Z

Add test-tokenizer-0 to do a few tokenizations - feel free to expand
Added option to convert-pth-to-ggml.py script to dump just the vocabulary
Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
Added utility to load vocabulary file from previous point (temporary implementation)
Revert std::string_view changes and drop back to C++11
Rename gpt_vocab -> llama_vocab
All CMake binaries go into ./bin/ now
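As a rough illustration of what a test that does "a few tokenizations" can look like, here is a minimal table-driven sketch in C++11. It is not the code of test-tokenizer-0 and does not use the llama.cpp API: `toy_vocab`, `toy_tokenize`, and `run_toy_tokenizer_tests` are hypothetical names, the five-entry vocabulary is made up, and greedy longest-match is a stand-in algorithm chosen only so the example is self-contained.

```cpp
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy vocabulary; NOT the llama_vocab type renamed in this PR.
struct toy_vocab {
    std::map<std::string, int> token_to_id;
};

// Greedy longest-match tokenizer (an assumption for illustration only,
// not the algorithm llama.cpp uses). Unknown bytes are emitted as id -1.
static std::vector<int> toy_tokenize(const toy_vocab & vocab, const std::string & text) {
    std::vector<int> ids;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t best_len = 0;
        int         best_id  = -1;
        // Try the longest candidate first and stop at the first hit.
        for (std::size_t len = text.size() - pos; len > 0; --len) {
            auto it = vocab.token_to_id.find(text.substr(pos, len));
            if (it != vocab.token_to_id.end()) {
                best_len = len;
                best_id  = it->second;
                break;
            }
        }
        if (best_len == 0) {
            ids.push_back(-1); // no vocabulary entry starts here
            pos += 1;
        } else {
            ids.push_back(best_id);
            pos += best_len;
        }
    }
    return ids;
}

// Table of (input, expected ids) pairs; returns true if every case matches.
static bool run_toy_tokenizer_tests() {
    toy_vocab vocab;
    vocab.token_to_id = { {"Hello", 0}, {" World", 1}, {"!", 2}, {"He", 3}, {"llo", 4} };

    const std::vector<std::pair<std::string, std::vector<int>>> cases = {
        { "Hello World!", { 0, 1, 2 } },
        { "Hellollo",     { 0, 4 }    },
        { "He?",          { 3, -1 }   },
    };

    bool ok = true;
    for (const auto & c : cases) {
        if (toy_tokenize(vocab, c.first) != c.second) {
            std::fprintf(stderr, "tokenization mismatch for '%s'\n", c.first.c_str());
            ok = false;
        }
    }
    return ok;
}
```

A test binary would typically call `run_toy_tokenizer_tests()` from `main` and return a non-zero exit code on failure, so that the test runner reports it. Expanding coverage then means adding rows to the `cases` table.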

Need help to resolve Windows CI and merge

ggerganov added 2 commits March 21, 2023 11:32

Add tokenizer unit test + vocab-only data for tests

ecd982d

Revert back to C++11

11d84b2

ggerganov closed this Mar 21, 2023

Deadsg pushed a commit to Deadsg/llama.cpp that referenced this pull request Dec 19, 2023

Merge pull request ggml-org#350 from abetlen/migrate-to-scikit-build-…

fb2c5f7

…core Migrate to scikit-build-core

Deadsg pushed a commit to Deadsg/llama.cpp that referenced this pull request Dec 19, 2023

Revert "Merge pull request ggml-org#350 from abetlen/migrate-to-sciki…

e3542b6

…t-build-core" This reverts commit fb2c5f7, reversing changes made to 202ed44.

Bearsaerker mentioned this pull request Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Closed
