grammar: increase MAX_REPETITION_THRESHOLD + make it configurable via envvar #21003
pwilkin wants to merge 2 commits into
@CISC @ngxson or @ggerganov maybe care to help? Need 1 more approval :)

Fixes #20867

Should we wait to see if #21216 fixes the issue? AFAIU, if it works, we won't have to adjust the threshold.
No. People requested that the restriction be modifiable even before the explosion of OpenClaw models, because they have custom grammars that require lots of repetitions.

I think it's important we understand why it's exploding in the first place. Then we can make an informed decision. Anyway, I fixed it in #21216.
For the exploding stuff, yes, but people called for this to be configurable way before the exploding stuff happened; I just didn't get to it. Some people have grammars that legitimately need more than 2k repetitions.

Data point in favor: this throw is also hit by hand-authored GBNF, not just JSON-Schema-derived grammars. We hit it writing rules like

Even with the autoparser root-cause fixes from #21216 in place, consumers writing their own grammars still bump into the cap whenever a single field needs ≥2000 chars (long extracted quotes, summary paragraphs, structured report bodies, etc.). The current workaround, splitting one field into a list of bounded chunks, works but adds a render step and isn't obvious until you've debugged it. Either raising the threshold or surfacing it as a configurable knob would unblock that without affecting the chat-parser rationale that motivated the original cap.

It would also help if the parse-time throw were surfaced in the response body (it is currently swallowed, so the failure looks identical to "model ignored grammar"), but that's a separate change.
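
For readers unfamiliar with the chunking workaround described in this comment, here is a minimal Python sketch of the extra render step it implies. It assumes the grammar constrains a long field to a list of short strings instead of one long string. The names `CHUNK_LIMIT`, `split_into_chunks` and `render_field`, and the value 1000, are hypothetical and are not taken from the PR or from llama.cpp.

```python
from typing import List

# Hypothetical per-chunk bound, chosen only so each chunk stays well
# below the repetition cap discussed in this thread.
CHUNK_LIMIT = 1000


def split_into_chunks(text: str, limit: int = CHUNK_LIMIT) -> List[str]:
    """Illustrate the shape of the constrained output: a list of strings,
    each at most `limit` characters long."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def render_field(chunks: List[str]) -> str:
    """The extra render step: join the bounded chunks back into one field."""
    return "".join(chunks)
```

In this sketch `split_into_chunks` only stands in for what the model would emit under such a grammar; the consumer-side cost of the workaround is the `render_field` call, which a higher or configurable threshold would make unnecessary.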
Overview
For very large tool-calling environments (like OpenClaw), the current limit is insufficient. Even a bigger limit might not be enough, so on top of increasing it I'm making it configurable via an environment variable.
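
As a rough illustration of the "default plus environment override" pattern described above, here is a minimal Python sketch. It is not the PR's implementation (llama.cpp is written in C++), and the variable name `GRAMMAR_MAX_REPETITIONS`, the default of 2000 (loosely based on the 2k figure mentioned in the conversation) and both function names are assumptions made only for this example.

```python
import os

# Hypothetical names and values, not taken from the PR.
ENV_VAR = "GRAMMAR_MAX_REPETITIONS"
DEFAULT_THRESHOLD = 2000


def repetition_threshold(environ=os.environ) -> int:
    """Return the repetition cap, preferring a valid positive integer
    from the environment and falling back to the built-in default."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        return DEFAULT_THRESHOLD
    try:
        value = int(raw)
    except ValueError:
        # Unparseable override: keep the default rather than failing.
        return DEFAULT_THRESHOLD
    return value if value > 0 else DEFAULT_THRESHOLD


def check_repetitions(max_times: int, environ=os.environ) -> None:
    """Raise if a grammar rule asks for more repetitions than the cap allows."""
    limit = repetition_threshold(environ)
    if max_times > limit:
        raise ValueError(f"repetition count {max_times} exceeds limit {limit}")
```

Falling back to the default on an invalid value is one possible design choice; an implementation could equally reject a malformed override at startup.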
Additional information
Together with #20961, this should help with #20879.