Add backwards-compatible support for multiple EOS tokens by hudson-ai · Pull Request #305 · guidance-ai/llguidance

hudson-ai · 2026-03-06T01:31:30Z

Motivation

Models like Qwen 3/3.5 define multiple EOS token IDs in their GenerationConfig (e.g. [151645, 151643]), but llguidance only supported a single EOS token. Qwen 3 uses <|im_end|> (151645) to end turns in chat mode and <|endoftext|> (151643) as a general end-of-text marker. When llguidance is configured with only one of these (e.g. the tokenizer's default 151645), the other gets masked out. If the model tries to emit the masked EOS, it's forced to pick garbage tokens and enters an infinite repetition loop, never terminating.
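
To make the failure mode concrete, here is a toy sketch of a stop mask built from a single EOS versus an EOS set. It is not llguidance's code: `allowed_at_stop` and the vocabulary size used below are hypothetical, chosen only for illustration.

```rust
use std::collections::HashSet;

/// Toy stand-in for the token mask at a point where the grammar is complete:
/// only EOS tokens may be sampled. Returns one bool per vocabulary entry.
/// (Hypothetical helper, not part of llguidance.)
fn allowed_at_stop(vocab_size: usize, eos_tokens: &[u32]) -> Vec<bool> {
    let eos: HashSet<u32> = eos_tokens.iter().copied().collect();
    (0..vocab_size as u32).map(|t| eos.contains(&t)).collect()
}
```

With only `[151645]` configured, index 151643 is `false`, so a model that wants to emit `<|endoftext|>` has no legal way to stop; with both IDs configured, either one ends generation.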

Closes #253
Related: #304

Changes

- Add eos_tokens: Vec<TokenId> to TokTrie with accessors, with_eos_tokens() builder, and validation (asserts IDs are within vocab range)
- Update TokenParser to check the full EOS set for mask computation, token consumption, rollback, and stop detection
- C API: LlgTokenizerInit is unchanged. New LlgTokenizerInitV2 struct (flat, with struct_size for forward compatibility) + llg_new_tokenizer_v2() function for multi-EOS support
  - struct_size enables forward compatibility: the FFI function takes a raw pointer, reads only struct_size bytes, and zero-fills any new fields the caller's header doesn't know about
  - llg_new_tokenizer_v2() validates EOS token IDs against vocab size and returns an error (not a panic) for out-of-range IDs
- Python eos_token parameter now accepts int | list[int] across all entry points
- Add eos_tokens getter property to Python LLTokenizer
- Update type stubs and all Python helper modules (hf, tiktoken, llamacpp)
- C sample tests both v1 and v2 APIs end-to-end

Usage

Python:

from llguidance.hf import from_tokenizer
# Pass multiple EOS tokens
tok = from_tokenizer(hf_tokenizer, eos_token=[151645, 151643])
# Single int still works as before
tok = from_tokenizer(hf_tokenizer, eos_token=151645)

C (v2 API):

LlgTokenizerInitV2 init = {};
init.struct_size = sizeof(init);
init.vocab_size = vocab_size;
init.tok_eos = 151645;
init.tokenize_fn = my_tokenize_fn;
// ...set other fields...
LlgToken extra_eos[] = {151643};
init.tok_eos_extra = extra_eos;
init.tok_eos_extra_count = 1;
char err[256]; // buffer that receives an error message on failure
LlgTokenizer *tok = llg_new_tokenizer_v2(&init, err, sizeof(err));

Rust:

let mut byte_tok = ByteTokenizer::from_json_str(&tokenizer_json)?;
byte_tok.set_eos_tokens(&[151645, 151643]);
let tok_env = byte_tok.into_tok_env(None)?;

API compatibility

Python: Fully backwards compatible. eos_token still accepts a single int everywhere. The only additions are that it also accepts list[int], and there's a new eos_tokens property.

C API: Fully backwards compatible. LlgTokenizerInit is identical to its pre-PR layout — zero fields added or removed. llg_new_tokenizer() is unchanged. Multi-EOS requires the new LlgTokenizerInitV2 + llg_new_tokenizer_v2(), which are purely additive.

Rust (published crates — toktrie, toktrie_hf_tokenizers, toktrie_tiktoken): Only additive changes — new methods like with_eos_tokens(), eos_tokens(), set_eos_tokens(). No existing signatures changed.

Rust (python_ext — not published): tokenv_from_llamacpp changed from eos_token: u32 to eos_tokens: &[u32]. This is technically a breaking signature change, but python_ext is only consumed internally to build the Python wheel, so no external Rust consumers are affected.

Known limitations

TokRxInfo.tok_eos still holds only the first (primary) EOS token. Code that reads tok_eos directly rather than going through TokTrie::eos_tokens() will only see the primary one.
with_info() resets eos_tokens back to vec![info.tok_eos], silently dropping extra EOS tokens. Callers that replace TokRxInfo after setting multi-EOS must re-apply with_eos_tokens().
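
The second limitation is easy to trip over, so here is a minimal sketch of it using simplified stand-ins. `Info` and `Trie` are hypothetical mirrors of `TokRxInfo` and `TokTrie`; the real types have more fields and their method signatures may differ. The assumption that `with_eos_tokens()` also updates the primary `tok_eos` is mine.

```rust
/// Hypothetical, simplified mirror of TokRxInfo: only the primary EOS.
#[derive(Clone, Debug)]
struct Info {
    tok_eos: u32,
}

/// Hypothetical, simplified mirror of TokTrie: info plus the full EOS list.
#[derive(Clone, Debug)]
struct Trie {
    info: Info,
    eos_tokens: Vec<u32>,
}

impl Trie {
    fn new(info: Info) -> Self {
        let eos_tokens = vec![info.tok_eos];
        Trie { info, eos_tokens }
    }

    /// Replaces the info and, as the limitation describes, resets the EOS
    /// list to just the primary token.
    fn with_info(mut self, info: Info) -> Self {
        self.eos_tokens = vec![info.tok_eos];
        self.info = info;
        self
    }

    /// Sets the full EOS list; the first entry is treated as primary
    /// (an assumption of this sketch).
    fn with_eos_tokens(mut self, eos: &[u32]) -> Self {
        assert!(!eos.is_empty(), "need at least one EOS token");
        self.info.tok_eos = eos[0];
        self.eos_tokens = eos.to_vec();
        self
    }
}
```

In this model, calling `with_info()` after `with_eos_tokens()` leaves a single-element list, and calling `with_eos_tokens()` again restores the full set.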

Models like Qwen 3/3.5 define multiple EOS token IDs in their GenerationConfig (e.g. [151645, 151643]), but llguidance only supported a single EOS token. This caused models to enter infinite loops when they tried to emit an EOS token that was masked out.

Changes:
- Add eos_tokens: Vec<TokenId> to TokTrie with accessors and with_eos_tokens() builder
- Update TokenParser to check full EOS set for mask computation, token consumption, rollback, and stop detection
- Add tok_eos_extra/tok_eos_extra_count to C API LlgTokenizerInit
- Python eos_token parameter now accepts int | list[int]
- Add eos_tokens getter property to Python LLTokenizer
- Update type stubs and all Python helper modules (hf, tiktoken, llamacpp)

All existing APIs remain unchanged; single EOS usage is unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds multi-EOS-token support across the tokenizer/trie layers and the parser so models that define multiple EOS IDs (e.g., Qwen/GLM) can terminate correctly without masking alternative EOS tokens.

Changes:

Extend TokTrie/tokenizer wrappers to carry a list of EOS token IDs (while keeping a primary EOS for compatibility).
Update TokenParser to treat any configured EOS token as valid for mask computation and stop detection.
Plumb multi-EOS through the C API and Python bindings/helpers (including typing updates).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

File	Description
`toktrie_tiktoken/src/lib.rs`	Adds a setter to apply multiple EOS tokens to the internal `TokTrie`.
`toktrie_hf_tokenizers/src/lib.rs`	Tracks extra EOS tokens and applies them when constructing the `TokTrie`.
`toktrie/src/toktree.rs`	Adds `eos_tokens` storage + builder/accessor; adds unit tests for new behavior.
`parser/src/tokenparser.rs`	Switches EOS handling from single token to a token set for masking/stop logic.
`parser/src/ffi.rs`	Extends C tokenizer initialization to accept additional EOS token IDs.
`parser/llguidance.h`	Exposes the new C init fields for extra EOS token IDs.
`c_sample/c_sample.cpp`	Documents how to pass multiple EOS tokens via the C API.
`python_ext/src/py.rs`	Accepts `eos_token` as `int | list[int]`.
`python_ext/src/llamatokenizer.rs`	Updates llama.cpp bridge to accept multiple EOS token IDs and apply them to `TokTrie`.
`python/llguidance/hf.py`	Updates helper typing/docs to allow `eos_token` as `int | list[int]`.
`python/llguidance/tiktoken.py`	Updates helper typing/docs to allow `eos_token` as `int | list[int]`.
`python/llguidance/llamacpp.py`	Updates helper typing/docs to allow `eos_token` as `int | list[int]`.
`python/llguidance/_lib.pyi`	Updates stubs for new `eos_tokens` property and widened `eos_token` parameter type.

sempervictus · 2026-03-06T02:02:29Z

@hudson-ai what would the Rust mechanism be for those when creating an llg factory from a tokenizer?

EDIT: also, does this mean I should hold off on hacking together that text_or_eos bit and just pull the next microversion, or are release cycles a bit longer around here? vllm.rs is moving quickly, so I can always undo a hack later.

hudson-ai · 2026-03-06T02:33:42Z

@hudson-ai what would the Rust mechanism be for those when creating an llg factory from a tokenizer?

EDIT: also, does this mean I should hold off on hacking together that text_or_eos bit and just pull the next microversion, or are release cycles a bit longer around here? vllm.rs is moving quickly, so I can always undo a hack later.

If you're working from a tokenizer JSON, you'll want to do something like this:

use toktrie_hf_tokenizers::ByteTokenizer;
use llguidance::{ParserFactory, api::TopLevelGrammar};
use llguidance::toktrie::InferenceCapabilities;
use llguidance::earley::SlicedBiasComputer;

let mut byte_tok = ByteTokenizer::from_json_str(&tokenizer_json)?;
// Probably get this from the generation_config.json?
byte_tok.set_eos_tokens(&[151645, 151643]);
let tok_env = byte_tok.into_tok_env(None)?;

let factory = ParserFactory::new(
    &tok_env,
    InferenceCapabilities::default(),
    &SlicedBiasComputer::general_slices(),
)?;

// factory now produces parsers that allow both EOS tokens in masks
let grammar = TopLevelGrammar::from_lark(r#"start: /[a-z]+/"#.to_string());
let parser = factory.create_parser(grammar)?;

If you already have a TokTrie, you can call trie.with_eos_tokens(&[151645, 151643]) instead and create a TokEnv from there, but the ByteTokenizer path above is probably closest to what you're doing.

RE: holding off on the "hack" -- I think that I can reasonably get a release out in the next week or so, but no strong guarantee. Need to await some review on this PR too. But if you don't mind doing and undoing a hack, go for it 😉

Handle runaway model output in "normal grammar" modality masking possible EOS tokens and producing nonsensical output once the model has completed its normal tool-calls and chat stream: - guidance-ai/llguidance#304 - guidance-ai/llguidance#305

Move tok_eos_extra/tok_eos_extra_count out of LlgTokenizerInit into a new LlgTokenizerInitV2 struct that embeds the original as its 'base' field. This keeps LlgTokenizerInit identical to its pre-multi-EOS layout, avoiding any ABI break for existing C consumers. Add llg_new_tokenizer_v2() which accepts the v2 struct. The original llg_new_tokenizer() continues to work unchanged with single-EOS. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The leading struct_size field (set to sizeof(LlgTokenizerInitV2) by callers) lets the library detect which fields are present. Future fields can be appended to the struct without a v3 — callers compiled against an older header will simply have a smaller struct_size, and new fields will be treated as zero/default. llg_new_tokenizer_v2() validates struct_size >= the minimum expected size and returns an error if it's unset or too small. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace the nested 'base: LlgTokenizerInit' member with flat copies of all fields so C consumers write init.vocab_size instead of init.base.vocab_size. Since v2 is the recommended struct going forward, this avoids a permanent ergonomic tax. Internally, from_init_v2() builds a temporary LlgTokenizerInit to delegate to from_init(), keeping the code DRY. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add create_tokenizer_v2() and create_byte_tokenizer_v2() that exercise LlgTokenizerInitV2 with struct_size, flat fields, and an extra EOS token. Extract run_constraint_test() helper and run the full constraint test with both v1 and v2 tokenizers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.

with_eos_tokens() now asserts all token IDs are within vocab_size, preventing out-of-bounds panics during mask computation. This covers all paths (C API, Python bindings, Rust API). from_init_v2() now accepts smaller struct_size values from callers compiled against older headers. Fields beyond what struct_size covers are treated as zero/default. The minimum accepted size is the base fields through slices (matching v1 + struct_size). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Revert struct_size to a strict check (require >= sizeof) since the function takes &LlgTokenizerInitV2: Rust assumes the full struct is readable, so accepting smaller sizes would be UB. Update docs to note struct_size is reserved for future forward compatibility. Validate EOS token IDs against the vocab size in from_init_v2(), before calling with_eos_tokens(). This gives C callers a graceful error instead of a panic across FFI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
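
The "graceful error instead of a panic" idea can be sketched as a range check that returns a `Result`. `check_eos_in_range` is a hypothetical helper written for this illustration; the actual check inside from_init_v2() may be structured differently.

```rust
/// Returns an error message naming the first EOS token ID that does not fit
/// in the vocabulary, instead of asserting. (Hypothetical helper.)
fn check_eos_in_range(eos_tokens: &[u32], vocab_size: u32) -> Result<(), String> {
    match eos_tokens.iter().find(|&&t| t >= vocab_size) {
        Some(bad) => Err(format!(
            "EOS token {bad} is out of range (vocab size {vocab_size})"
        )),
        None => Ok(()),
    }
}
```

An FFI entry point can copy the `Err` string into the caller's error buffer and return a null handle, so the C caller never sees a Rust panic unwind across the boundary.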

Change llg_new_tokenizer_v2() to take a raw pointer instead of a Rust reference. The function reads struct_size first, then copies only min(struct_size, sizeof) bytes into a local zeroed struct. This means callers compiled against an older (smaller) header genuinely work with newer library versions — new fields default to zero. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
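
A minimal sketch of the read-struct_size-then-copy technique described here, under stated assumptions: `InitV2Old`, `InitV2`, and `read_init` are hypothetical stand-ins, and the field set is far smaller than the real LlgTokenizerInitV2. Only the shape of the logic (leading size field, bounded copy into a zeroed local) follows the commit message.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// What a caller compiled against an older header might have declared.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct InitV2Old {
    struct_size: usize,
    vocab_size: u32,
    tok_eos: u32,
}

/// The library's current view: one field appended at the end.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct InitV2 {
    struct_size: usize,
    vocab_size: u32,
    tok_eos: u32,
    tok_eos_extra_count: usize,
}

/// Copies at most `struct_size` bytes from the caller into a zeroed local,
/// so fields the caller's header does not know about default to zero.
///
/// # Safety
/// `raw` must be null or valid for reads of the number of bytes stored in
/// its leading `struct_size` field.
unsafe fn read_init(raw: *const u8) -> Result<InitV2, String> {
    if raw.is_null() {
        return Err("null init pointer".to_string());
    }
    // struct_size is the first field, so it can be read before anything else.
    let claimed = unsafe { ptr::read_unaligned(raw as *const usize) };
    if claimed < size_of::<InitV2Old>() {
        return Err(format!(
            "struct_size {claimed} is below the minimum {}",
            size_of::<InitV2Old>()
        ));
    }
    let n = claimed.min(size_of::<InitV2>());
    let mut local = MaybeUninit::<InitV2>::zeroed();
    unsafe {
        ptr::copy_nonoverlapping(raw, local.as_mut_ptr() as *mut u8, n);
        // All-zero bytes are a valid InitV2, so the partially filled value is sound.
        Ok(local.assume_init())
    }
}
```

Taking a raw pointer rather than `&InitV2` is what makes the smaller struct legal to pass: a Rust reference would promise that all `size_of::<InitV2>()` bytes are readable.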

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

- Fix TokenizerWrapper path in py_new to apply eos_token override
- Add TokTrie::eos_token_set() that includes all EOS tokens
- Fix LLMatcher::eos_token_set() to use all EOS tokens (was singleton)
- Fix LLMatcher::consume_token_inner() to accept any EOS token
- Fix Matcher::compute_mask_or_eos() to use all EOS tokens
- Add Python tests for multi-EOS via TokenizerWrapper mock

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add test_eos_token_set_single and test_eos_token_set_multiple in toktrie
- Add test_multi_eos_mask_when_stopped in sample_parser (Matcher level)
- Simplify Python mock test to only verify TokenizerWrapper override applies

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

riedgar-ms · 2026-03-10T15:53:15Z

Why not make this a wrapper for create_tokenizer_v2()? Isn't the code largely the same?

hudson-ai · 2026-03-11

Under the hood, each function is using a different init struct, and consolidating would effectively remove the "living documentation" (slash test) of the old API. I think that removing the commented-out block is a good idea though.

Thoughts?

Replace raw new[]/delete[] allocations for token_lens and token_bytes with std::vector in both create_tokenizer_v2() and create_tokenizer(). This is exception-safe and avoids manual memory management. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Instead of separate tok_eos + extra_eos_tokens parameters, accept a single std::vector<uint32_t> where [0] is the primary EOS and any remaining entries are extra EOS tokens. Cleaner C++ API while still mapping naturally to the underlying C struct fields. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
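
The same "[0] is primary, the rest are extras" convention can be expressed in Rust as a slice split. This is a sketch of the idea only: the commit itself changes the C++ sample, and `split_eos` is a hypothetical name.

```rust
/// Splits a combined EOS list into the (primary, extras) pair that a struct
/// with separate tok_eos / tok_eos_extra fields expects.
/// Returns None when the list is empty, since there is no primary to use.
fn split_eos(eos_tokens: &[u32]) -> Option<(u32, &[u32])> {
    eos_tokens
        .split_first()
        .map(|(&primary, extra)| (primary, extra))
}
```

The extras slice borrows from the input, so its pointer and length can be handed to `tok_eos_extra` and `tok_eos_extra_count` as long as the original list outlives the call.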

riedgar-ms · 2026-03-12T17:18:32Z


+// Same as above but using the v2 API with an extra (unused) EOS token.
+LlgTokenizer *create_byte_tokenizer_v2(void) {
+  std::vector<std::vector<uint8_t>> tokens;


Small point, but you can also pre-allocate a capacity in the constructor if you have a pretty good idea of the size you'll need (note that size <= capacity)

riedgar-ms · 2026-03-12T17:20:33Z

  size_t offset = 0;
  for (size_t i = 0; i < tokens.size(); i++) {
-    memcpy(token_bytes + offset, tokens[i].data(), token_lens[i]);
+    memcpy(token_bytes.data() + offset, tokens[i].data(), token_lens[i]);


If you really want to go all C++ then there's std::copy() but memcpy() will work fine.

hudson-ai · 2026-03-17T23:33:05Z

@riedgar-ms addressed the cpp stuff. Any other objections / comments? The C FFI itself looking good to you?

riedgar-ms

No objections from me

sempervictus · 2026-03-18T16:43:58Z

@hudson-ai - I'm still pushing code for the second half of my work on this to refine grammar generation and syntax. Does this change the syntax from the (<[id]>|<[id]>...) notation for the EOS IDs I'm deriving from the various config sections?

hudson-ai · 2026-03-18T17:15:54Z

@sempervictus no syntax changes -- your approach shouldn't stop working, BUT you'd be able to avoid adding the tokens to the grammar at all if you want. You could rewrite build_llg_factory along these lines:

   pub fn build_llg_factory(
       tokenizer: Tokenizer,
       vocab_size: Option<usize>
   ) -> Result<Arc<ParserFactory>> {
       let mut bt = ByteTokenizer::from_tokenizer(tokenizer)?;

       // NEW: collect every EOS ID the model defines (e.g. from generation_config.json)
       let eos_tokens: Vec<u32> = todo!();
       if !eos_tokens.is_empty() {
           bt.set_eos_tokens(&eos_tokens);
       }

       // Pass the function's own vocab_size parameter through
       let env = bt.into_tok_env(vocab_size)?;
       let factory = ParserFactory::new_simple(&env)?;
       Ok(Arc::new(factory))
   }

hudson-ai · 2026-03-18T19:12:58Z

@sempervictus that being said, I'd definitely stick to your approach with the EOS tokens more explicitly represented in the grammar if you need fine-grained control like "use one EOS token when finishing a message but a different one when finishing a turn" or whatever that may look like.

fellhorn · 2026-05-05T13:41:59Z

Thanks, @hudson-ai, for this enhancement. It removes some of the confusion we faced when dealing with models that support multiple EOS tokens. llguidance's behaviour seemed inconsistent at first, until we understood that only one EOS token is handled automatically.
However, I’ve noticed a lingering inconsistency in how these tokens are treated across different parts of the API.

I’m curious whether this behaviour is intentional, or whether you’d be open to accepting a PR that unifies the handling.

mask.is_allowed and matcher.validate_tokens behave differently when there are multiple eos tokens.

// Masks allow all eos tokens
let mask = matcher.compute_mask_or_eos()?;
assert!(mask.is_allowed(151643));
assert!(mask.is_allowed(151645));

// but validate_tokens will only accept one:
assert_eq!(matcher.validate_tokens(&[151643]).unwrap(), 1);
assert_eq!(matcher.validate_tokens(&[151645]).unwrap(), 1); // this one is not valid

runnable test case

/// Reproduces the inconsistency reported for models with multiple EOS tokens (e.g. Qwen3 Coder):
/// `compute_mask_or_eos` allows all registered EOS tokens in the stopped mask, but
/// `validate_tokens` should also accept any of them — not just the primary one.
///
/// Requires `tokenizer.json` (a Qwen3-family tokenizer) in the workspace root.
#[test]
fn test_multi_eos_validate_tokens_consistency() {
    let tok_path = std::path::Path::new(env!("CARGO_MANIFEST_DIR")).join("../tokenizer.json");

    let tok_bytes = std::fs::read(&tok_path).unwrap();
    let mut byte_tok =
        toktrie_hf_tokenizers::ByteTokenizer::from_json_bytes(&tok_bytes).unwrap();
    // Qwen3 Coder: <|im_end|>=151645 is the chat EOS; <|endoftext|>=151643 is the base EOS.
    byte_tok.set_eos_tokens(&[151645, 151643]);
    let tok_env = byte_tok.into_tok_env(None).unwrap();

    let factory = ParserFactory::new(
        &tok_env,
        InferenceCapabilities::default(),
        &SlicedBiasComputer::general_slices(),
    )
    .unwrap();

    let grm = TopLevelGrammar::from_lark(r#"start: "hello""#.to_string());
    let mut parser = factory.create_parser(grm).unwrap();
    parser.start_without_prompt();
    let mut matcher = Matcher::new(Ok(parser));

    let hello_tokens = tok_env.tokenize("hello");
    for &tok in &hello_tokens {
        matcher.consume_token(tok).unwrap();
    }

    // Both EOS tokens must appear in the stopped mask.
    let mask = matcher.compute_mask_or_eos().unwrap();
    assert!(mask.is_allowed(151643), "primary EOS 151643 should be in stopped mask");
    assert!(mask.is_allowed(151645), "extra EOS 151645 should be in stopped mask");

    // validate_tokens must accept the primary EOS.
    assert_eq!(
        matcher.validate_tokens(&[151643]).unwrap(),
        1,
        "validate_tokens should accept primary EOS 151643"
    );

    // validate_tokens must also accept the extra EOS — consistent with the mask.
    matcher.reset().unwrap();
    for &tok in &hello_tokens {
        matcher.consume_token(tok).unwrap();
    }
    assert_eq!(
        matcher.validate_tokens(&[151645]).unwrap(),
        1,
        "validate_tokens should accept extra EOS 151645"
    );
}

validate_tokens only expects a single eos token while the mask already supports multiple eos tokens.

toktrie_hf_tokenizers only discovers one eos token automatically

The from_tokenizer function only discovers one EOS token automatically. It also detects the end-of-turn token for Qwen3, but that token does not seem to be used: build_chat_mode_trie is neither called nor documented, as far as I can see.
I assume this is intentional, though, to keep the old behaviour as the default and make multiple EOS tokens "opt-in"?

sempervictus · 2026-05-05T14:13:28Z

In vllm.rs we extract both EOS tokens anyway, so we're book-ending with (<[eos0]>|<[eos1]>) special token patterns. The reason is that in various models some EOS tokens are specifically for strings or other block types, while the other is really the end-of-message terminator. This allows the model to terminate "naturally" in our tests.
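
As a rough illustration of the book-ending pattern, the alternation can be generated from whatever EOS IDs were extracted. `eos_alternation` is a hypothetical helper, not taken from vllm.rs or llguidance; it only formats the `<[id]>` notation used in this thread.

```rust
/// Formats EOS token IDs as an alternation of special-token references,
/// in the `(<[id]>|<[id]>)` notation used above.
/// Returns None for an empty list, since `()` would not be a useful pattern.
fn eos_alternation(eos_tokens: &[u32]) -> Option<String> {
    if eos_tokens.is_empty() {
        return None;
    }
    let parts: Vec<String> = eos_tokens
        .iter()
        .map(|id| format!("<[{id}]>"))
        .collect();
    Some(format!("({})", parts.join("|")))
}
```

Keeping the tokens explicit in the grammar like this remains useful when different terminators should be legal at different points, which the tokenizer-level EOS set cannot express.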

hudson-ai · 2026-05-05T18:20:26Z

Hey @fellhorn! Thanks for reaching out

Re: validate_tokens, I'd say that's a bug. Would absolutely take a PR on that one!
Re: automatic detection, the current behavior with multiple EOS tokens being treated as opt-in is intentional :)

fellhorn · 2026-05-06T17:32:53Z

Re: validate_tokens, I'd say that's a bug. Would absolutely take a PR on that one!

Thanks for your reply. Please find my fix in #347.

`validate_tokens` in `parser/src/earley/parser.rs` only recognized the primary EOS token, which was overlooked in #305 when adding support for multiple EOS tokens. This PR aligns the behaviour with e.g. `compute_mask` that now allows multiple eos tokens. Signed-off-by: Dennis Keck <26092524+fellhorn@users.noreply.github.com>
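
The invariant that fix restores can be sketched with a toy model in which the mask check and the validation path consult the same EOS set. `StoppedMatcher` is hypothetical and much simpler than the real `Matcher`; it models only the state after a complete grammar match.

```rust
use std::collections::HashSet;

/// Toy matcher that has already consumed a complete grammar match, so the
/// only acceptable next token is an EOS. Not the real llguidance Matcher.
struct StoppedMatcher {
    eos: HashSet<u32>,
}

impl StoppedMatcher {
    fn new(eos_tokens: &[u32]) -> Self {
        StoppedMatcher {
            eos: eos_tokens.iter().copied().collect(),
        }
    }

    /// Mask side: is this token allowed as the next token?
    fn mask_allows(&self, tok: u32) -> bool {
        self.eos.contains(&tok)
    }

    /// Validation side: number of leading tokens that would be accepted.
    /// It consults the same set as the mask, so the two cannot disagree.
    fn validate_tokens(&self, tokens: &[u32]) -> usize {
        match tokens.first() {
            Some(t) if self.eos.contains(t) => 1,
            _ => 0,
        }
    }
}
```

The bug reported above corresponds to `validate_tokens` comparing against a single primary token while `mask_allows` used the whole set.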

hudson-ai requested review from Copilot and riedgar-ms March 6, 2026 01:31

Copilot started reviewing on behalf of hudson-ai March 6, 2026 01:31

cargo fmt

b261ae9

Copilot AI reviewed Mar 6, 2026

Comment thread python_ext/src/py.rs Outdated

Comment thread python/llguidance/llamacpp.py Outdated

Comment thread parser/src/ffi.rs

Comment thread parser/llguidance.h

hudson-ai and others added 2 commits March 5, 2026 18:14

Apply suggestions from code review

985e344

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Add zero-initialization requirement comment to LlgTokenizerInit

abed676

Document that LlgTokenizerInit must be zero-initialized before setting fields, as new fields may be appended in future versions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

hudson-ai and others added 5 commits March 6, 2026 10:53

cargo fmt

bc534b3

hudson-ai requested a review from Copilot March 6, 2026 19:19

Copilot started reviewing on behalf of hudson-ai March 6, 2026 19:19

Copilot AI reviewed Mar 6, 2026

Comment thread parser/src/ffi.rs Outdated

Comment thread python_ext/src/py.rs

Comment thread python_ext/src/py.rs

Comment thread python_ext/src/llamatokenizer.rs

Comment thread c_sample/c_sample.cpp Outdated

Comment thread toktrie/src/toktree.rs

Comment thread parser/src/ffi.rs Outdated

hudson-ai and others added 4 commits March 6, 2026 11:49

cargo fmt

a7292ef

hudson-ai requested a review from Copilot March 6, 2026 20:19

Copilot started reviewing on behalf of hudson-ai March 6, 2026 20:20

Copilot AI reviewed Mar 6, 2026

Comment thread python_ext/src/py.rs

Comment thread python_ext/src/py.rs

Comment thread python_ext/src/llamatokenizer.rs

Comment thread python_ext/src/py.rs

hudson-ai and others added 3 commits March 6, 2026 13:49

cargo fmt

821f593

riedgar-ms reviewed Mar 10, 2026

Comment thread c_sample/c_sample.cpp Outdated

riedgar-ms reviewed Mar 10, 2026

Comment thread c_sample/c_sample.cpp

hudson-ai and others added 5 commits March 10, 2026 09:16

Merge branch 'main' into multi_eos

911534d

clean up python tests a little bit

afa0241

Remove stale commented-out v2 snippet from create_tokenizer

724856c

The v2 API now has a real working example in create_tokenizer_v2() above, so the inline commented-out snippet is redundant. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

riedgar-ms reviewed Mar 12, 2026

hudson-ai and others added 3 commits March 17, 2026 16:29

Pre-allocate token vector capacity in byte tokenizer constructors

cdfbf2d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use std::copy instead of memcpy for token byte packing

6ee6b94

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace remaining memcpy with std::copy in tokenize_callback

e941074

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

riedgar-ms approved these changes Mar 18, 2026

hudson-ai added 3 commits March 18, 2026 09:23

doctest format fixes

434d169

Merge branch 'main' into multi_eos

91e022d

cbindgen

ce51ebe

hudson-ai added 2 commits March 18, 2026 09:56

simplify from_init/from_init_v2 by delegating in a more sensible direction

cc619d0

cargo fmt

5cfc057

hudson-ai merged commit 2025302 into guidance-ai:main Mar 18, 2026
14 checks passed

fellhorn mentioned this pull request May 6, 2026

Fix validate_tokens for multiple EOS tokens #347

Merged
