feat(llama.cpp): Add support to grammar triggers #4733
Merged
Description
Thanks to the great upstream work in llama.cpp ( ggml-org/llama.cpp#9639 thanks 🫶 @ochafik ! ) it is now possible to enable grammars with a set of trigger words. This forces a specific schema in answers only when a trigger word is detected in the text generated by the LLM. For example, models that emit tags before returning JSON (e.g. `[TOOL_CALL]`, `<tool_call>`, ...) can use the trigger to identify the function call and activate the grammar, which lets the model produce either plain text replies or JSON. This PR enables this feature in LocalAI, and it can be enabled in the model configuration file:
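A sketch of what such a model configuration could look like. Note: the key names below (`function.grammar.triggers` and `word`) are illustrative assumptions, not taken from this PR's diff; the authoritative key names are in the changed files:

```yaml
# Hypothetical LocalAI model config enabling grammar triggers.
# Key names under `function:` are assumptions for illustration only.
name: my-model
parameters:
  model: my-model.gguf
function:
  grammar:
    # The grammar stays disabled until one of these trigger words
    # appears in the model's output; only then is the JSON schema
    # enforced, so plain-text replies remain possible.
    triggers:
      - word: "[TOOL_CALL]"
      - word: "<tool_call>"
```

With a configuration like this, a reply that never mentions a trigger word is returned verbatim, while a reply starting with e.g. `<tool_call>` is constrained by the grammar from that point on.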
Notes for Reviewers
Signed commits