tests : refactor test-save-load-state to accept token input #24073

Merged: ggerganov merged 2 commits into master from gg/test-save-load-state-refactor on Jun 4, 2026
ggerganov (Member) commented on Jun 3, 2026

Overview

Refactor test-save-load-state.cpp to accept input as a token vector instead of a prompt string. This enables testing with models that lack a tokenizer.

Key changes:

  • The default prompt is now empty; when no prompt is provided, the test generates n_batch random tokens
  • Tokenization happens once upfront in main(); tokens are passed to all test functions
  • generate_tokens() prints token IDs instead of decoded text pieces

Additional information

The random token path is useful for model files that do not have a tokenizer, allowing the save/load state tests to run against any model.
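
As a rough sketch of what the random token path might look like (not the PR's actual implementation): the helper name `make_random_tokens`, the explicit seed parameter, and the uniform sampling are all assumptions made for illustration. In the real test the vocabulary size would presumably come from `llama_vocab_n_tokens(llama_model_get_vocab(model))` and the count from `n_batch`; here both are plain integers so the snippet stands alone.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Local stand-ins for the llama.cpp aliases (assumption: declared here only
// so the sketch is self-contained).
using llama_token  = int32_t;
using llama_tokens = std::vector<llama_token>;

// Hypothetical helper: draw n_tokens IDs uniformly from [0, n_vocab).
// A fixed seed keeps the input reproducible across runs of the test.
static llama_tokens make_random_tokens(int32_t n_vocab, int32_t n_tokens, uint32_t seed) {
    assert(n_vocab > 0 && n_tokens >= 0);

    std::mt19937 rng(seed);
    std::uniform_int_distribution<llama_token> dist(0, n_vocab - 1);

    llama_tokens tokens(n_tokens);
    for (auto & t : tokens) {
        t = dist(rng);
    }
    return tokens;
}
```

Because every generated ID is a valid index into the vocabulary, the resulting vector can be decoded by any model, which is what lets the save/load checks run without a tokenizer.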

Requirements

- Default prompt is now empty; when not provided, generate n_batch
  random tokens (useful for models without a tokenizer)
- Tokenization happens once upfront; pass token vector to test functions
- generate_tokens prints token IDs instead of decoded pieces
- Use llama_model_get_vocab / llama_vocab_n_tokens API
- Upgrade log level from LOG_TRC to LOG_INF for visibility

Assisted-by: llama.cpp:local pi
github-actions bot added the testing label (Everything test related) on Jun 3, 2026
ggerganov marked this pull request as ready for review on Jun 3, 2026, 12:25
ggerganov merged commit 65ef50a into master on Jun 4, 2026
1 check passed
ggerganov deleted the gg/test-save-load-state-refactor branch on Jun 4, 2026, 05:06
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jun 4, 2026
* origin/master: (57 commits)
server : disable on-device spec checkpoints (ggml-org#24108)
arg: fix double mtp downloads (ggml-org#24128)
webui: [a11y] fix keyboard navigation issues in chat interface and sidebar (ggml-org#23132)
Move duplicated imatrix code into single common imatrix-loader.cpp (ggml-org#22445)
ui: Fixed packages (ggml-org#24119)
ui: added single line reasoning preview (ggml-org#23601)
return filter to save memory (ggml-org#24125)
convert: Fix Gemma 4 Unified conversion (ggml-org#24118)
ggml: vectorize ggml_vec_dot_q4_1_q8_1 with WASM SIMD128 (ggml-org#22209)
server: avoid unnecessary checkpoint restore when new tokens are present (ggml-org#24110)
agents: refactor, include more guidelines (ggml-org#24111)
webui: fix tool selector toggle/counter, key tools by stable identity (ggml-org#24065)
build : use umbrella Headers directory for XCFramework module map (ggml-org#23974)
server : add header to tools/server/server-http.h (ggml-org#24089)
cmake: skip cvector-generator and export-lora when CPU backend is disabled (ggml-org#24053)
fix(mtmd): handle Gemma 4 audio projector embedding size (ggml-org#24091)
readme : add status badges (ggml-org#24104)
tests : refactor test-save-load-state to accept token input (ggml-org#24073)
metal : reduce rset heartbeat from 500ms -> 5ms (ggml-org#24074)
ggml-webgpu: FlashAttention refactor + standardize quantization support (ggml-org#23834)
...
jimbothigpen pushed a commit to jimbothigpen/llama.cpp that referenced this pull request Jun 6, 2026
…#24073)

* tests : refactor test-save-load-state to accept token input

- Default prompt is now empty; when not provided, generate n_batch
  random tokens (useful for models without a tokenizer)
- Tokenization happens once upfront; pass token vector to test functions
- generate_tokens prints token IDs instead of decoded pieces
- Use llama_model_get_vocab / llama_vocab_n_tokens API
- Upgrade log level from LOG_TRC to LOG_INF for visibility

Assisted-by: llama.cpp:local pi

* cont : use llama_tokens alias

(cherry picked from commit 65ef50a)