Commit ee680ff
[Serving][Grammar] Refactor GrammarStateMatcher and support LLaMA-3
This PR refactors GrammarStateMatcher and adds support for the LLaMA-3
tokenizer. Common tokenizers, including Phi-2, Gemma, LLaMA-2, etc., are
also supported.
Performance is optimized for the LLaMA-3 tokenizer, whose token table has
128k entries, much larger than that of the LLaMA-2 tokenizer.
These changes are introduced to the grammar library:
1. Introduce ByteString rule expression and simplify CharacterClass
and CharacterClassStar
2. Refactor BNFGrammarVisitor and BNFGrammarMutator for visiting and
mutating grammar rules
3. GrammarStateMatcherBase, the internal implementation of
GrammarStateMatcher, now accepts input char by char instead of codepoint
by codepoint. It therefore supports any valid UTF-8 string, even when a
token does not end on a complete codepoint.
4. Support lookahead assertions for rules, which specify that a rule must
be followed by a given sequence. This can eliminate some uncertain tokens
in preprocessing.
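To illustrate item 3, here is a minimal sketch of why char-by-char matching lets a token end in the middle of a codepoint. The class `Utf8ByteAcceptor` and its methods are hypothetical names chosen for this example; they are not part of the grammar library, and the real matcher also tracks grammar state, which is omitted here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not the library's API): remembers how many UTF-8
// continuation bytes are still owed, so a codepoint may span several tokens.
// Simplification: overlong encodings and surrogates are not rejected.
class Utf8ByteAcceptor {
 public:
  // Feed one byte; returns false if it cannot extend a valid UTF-8 stream.
  bool AcceptByte(std::uint8_t b) {
    if (pending_ > 0) {
      if ((b & 0xC0) != 0x80) return false;  // expected 10xxxxxx
      --pending_;
      return true;
    }
    if (b < 0x80) return true;                              // ASCII
    if ((b & 0xE0) == 0xC0) { pending_ = 1; return true; }  // 2-byte lead
    if ((b & 0xF0) == 0xE0) { pending_ = 2; return true; }  // 3-byte lead
    if ((b & 0xF8) == 0xF0) { pending_ = 3; return true; }  // 4-byte lead
    return false;  // stray continuation byte or invalid lead byte
  }

  // Feed a whole token, which may start or end inside a codepoint.
  bool AcceptToken(const std::string& token) {
    for (unsigned char c : token) {
      if (!AcceptByte(c)) return false;
    }
    return true;
  }

  // True when no partial codepoint is pending.
  bool AtCodepointBoundary() const { return pending_ == 0; }

 private:
  int pending_ = 0;  // continuation bytes still expected
};
```

Feeding the two bytes of "é" (0xC3, 0xA9) as two separate tokens succeeds, and the acceptor only reports a codepoint boundary after the second one. A codepoint-by-codepoint matcher would have had to reject or special-case the first token.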
Minor changes:
1. Introduce template hash function HashCombine
2. Update the UTF8 encoding handling functions
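For readers unfamiliar with the idea behind item 1, a template hash combiner can be sketched as follows. This is not the HashCombine added in this commit: the names `HashCombineBinary` and `HashCombineSketch` are hypothetical, and the mixing step is the well-known boost::hash_combine-style formula, used here only as an assumption about what such a helper might look like.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Mix one hash value into a running seed (boost::hash_combine-style formula).
inline void HashCombineBinary(std::size_t& seed, std::size_t value) {
  seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical variadic form: hash each argument with std::hash, then fold
// the results into a single value, left to right.
template <typename... Args>
std::size_t HashCombineSketch(const Args&... args) {
  static_assert(sizeof...(Args) > 0, "need at least one value to hash");
  const std::size_t hashes[] = {std::hash<Args>{}(args)...};
  std::size_t seed = 0;
  for (std::size_t h : hashes) {
    HashCombineBinary(seed, h);
  }
  return seed;
}
```

A call such as `HashCombineSketch(rule_id, std::string("name"))` (with `rule_id` an `int`) yields one `std::size_t` that could serve as a key hash for a composite value, for example in a `std::unordered_map` with a custom hasher.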
Performance:
1. For JSON, finding the mask requires <30us on a 5900X with a single
thread. The number of uncertain tokens is <30 in most cases.
2. For JSON schema, finding the mask requires <30us on a 5900X with a
single thread. The number of uncertain tokens is <30 in most cases.
1 parent 679d3a8 commit ee680ff
27 files changed (+1684, -1024 lines)
File tree:
- cpp
- serve
- grammar
- support
- python/mlc_llm/serve
- tests/python/serve
- web/emcc