Closed
Description
Thanks for the crate! We are currently adding support for GLM4 models in mistral.rs. However, we found that toktrie_hf_tokenizers is not compatible with the GLM4 tokenizer: building the trie crashes with an assertion failure in the following serialize function from the toktrie crate:
fn serialize(&mut self, data: &mut Vec<TrieNode>, num_parents: u8) {
    let idx = data.len();
    let mut num_ch = self.children.len();
    data.push(TrieNode::new(self.byte, self.token_id, num_parents));
    //self.children.reverse();
    self.children.sort_by_key(|e| e.byte);
    for entry in &mut self.children {
        num_ch -= 1;
        assert!(num_parents < 0xff);
        entry.serialize(data, if num_ch == 0 { num_parents + 1 } else { 1 });
    }
    let subtree_size = data.len() - idx;
    assert!(subtree_size < 0x100_0000);
    data[idx].bits2 |= (subtree_size as u32) << 8;
}

Here is the use case in Mistral.rs:
EricLBuehler
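For anyone trying to narrow this down: the report does not say which of the two assertions in the serialize function quoted above fired. The sketch below only explores how the first one (num_parents < 0xff) could be reached. It is a simplified model, not the toktrie implementation: the Node type, insert, max_num_parents, and exceeds_u8_limit are hypothetical names, and the initial num_parents value of 1 for the root is an assumption. It mirrors only the num_parents bookkeeping, where the last child inherits num_parents + 1 and every other child restarts at 1.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for the trie builder node.
/// Only the tree shape is modelled; token ids are left out.
#[derive(Default)]
struct Node {
    // BTreeMap iterates in ascending byte order, matching
    // `sort_by_key(|e| e.byte)` in the quoted function.
    children: BTreeMap<u8, Node>,
}

impl Node {
    /// Insert one token's bytes as a path from this node.
    fn insert(&mut self, bytes: &[u8]) {
        let mut node = self;
        for &b in bytes {
            node = node.children.entry(b).or_default();
        }
    }

    /// Replays the `num_parents` bookkeeping of the quoted `serialize`
    /// and returns the largest value that would be checked by
    /// `assert!(num_parents < 0xff)`. Uses u32 so it cannot overflow.
    fn max_num_parents(&self, num_parents: u32) -> u32 {
        let mut max_seen = 0;
        let mut num_ch = self.children.len();
        for child in self.children.values() {
            num_ch -= 1;
            // The original asserts here, once per child.
            max_seen = max_seen.max(num_parents);
            // Last child extends the chain; other children restart at 1.
            let next = if num_ch == 0 { num_parents + 1 } else { 1 };
            max_seen = max_seen.max(child.max_num_parents(next));
        }
        max_seen
    }
}

/// True if the modelled assertion would fail.
/// ASSUMPTION: the root is serialized with num_parents = 1.
fn exceeds_u8_limit(root: &Node) -> bool {
    root.max_num_parents(1) >= 0xff
}
```

In this model, a chain of last children grows num_parents by one per level, so a single token of 255 or more bytes is enough to reach 0xff, while a 254-byte token stays below it. Whether the GLM4 vocabulary actually contains such a token, or whether the subtree_size assertion is the one failing instead, would need to be checked against the real tokenizer.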