This PR brings the feature of creating more deterministic OpenAI text completions based on a prompt `String` and a `Regex`. Right now it is only available for the Kotlin/JVM target. It is inspired by Matt Rickard's ReLLM Python library.

It works by iterating to create the final completion: at each step it builds a `logitBias` map by filtering all the tokens that partially match the `Regex`, and sends OpenAI the original prompt plus the partial completion so far. Each partial completion has a maximum size of `maxTokens = 1`.
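The iteration can be sketched roughly as follows. This is a simplified illustration, not the PR's actual code: the helper names, the toy vocabulary, and the mocked model call are all assumptions, and a real tokenizer vocabulary has tens of thousands of entries. On the JVM, `Matcher.hitEnd()` after a failed `matches()` is one way to detect that the input ran out while the pattern could still match, i.e. a partial match.

```kotlin
import java.util.regex.Pattern

// A string partially matches when it is a full match, or when matching failed
// only because the input ran out (Matcher.hitEnd()), so it could still be
// extended into a full match.
fun partiallyMatches(pattern: Pattern, text: String): Boolean {
    val m = pattern.matcher(text)
    return m.matches() || m.hitEnd()
}

// Build the logitBias map for one step: token id -> 100 for every token that
// keeps the completion on track. OpenAI caps logitBias at 300 entries, hence
// the truncation. `vocabulary` (token string -> token id) is an assumption;
// the real code would derive it from the model's tokenizer.
fun stepLogitBias(
    vocabulary: Map<String, Int>,
    pattern: Pattern,
    partialCompletion: String,
): Map<Int, Int> =
    vocabulary
        .filterKeys { token -> partiallyMatches(pattern, partialCompletion + token) }
        .values
        .take(300)
        .associateWith { 100 }

// Drive the loop: one token per request (maxTokens = 1), stopping at a full
// match (stopAfterMatch) or after maxNewTokens steps.
fun completeWithRegex(
    pattern: Pattern,
    vocabulary: Map<String, Int>,
    maxNewTokens: Int = 30,
): String {
    var completion = ""
    var steps = 0
    while (steps < maxNewTokens && !pattern.matcher(completion).matches()) {
        val bias = stepLogitBias(vocabulary, pattern, completion)
        if (bias.isEmpty()) break
        // Mocked model call: the real code sends prompt + completion with this
        // logitBias and maxTokens = 1; here we just take the first allowed token.
        val nextId = bias.keys.first()
        completion += vocabulary.entries.first { it.value == nextId }.key
        steps++
    }
    return completion
}

fun main() {
    val pattern = Pattern.compile("""\d{3}-\d{4}""")
    // Toy single-character vocabulary, purely for illustration.
    val vocabulary = ('0'..'9').associate { it.toString() to (it - '0') } + mapOf("-" to 10, "x" to 11)
    println(completeWithRegex(pattern, vocabulary)) // prints "000-0000"
}
```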
Each partially matching token in `logitBias` gets a value of `100`, telling the model to choose exclusively among those tokens for the completion. Take into account that, as of the day of this PR, `logitBias` is limited by OpenAI to 300 tokens, so we can send at most that many.

The function also takes an optional limit on the maximum number of tokens generated, with a default value of `maxNewTokens = 30`, and a flag in case we want to stop at the full match of the `Regex`, with a default value of `stopAfterMatch = true`.
Example of usage:
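A hypothetical sketch of how such a call could look. The function name `promptMatchingRegex` and its signature are illustrative assumptions, not the PR's actual API, and the stub fakes the OpenAI round trip with a canned answer just so the example runs:

```kotlin
// Hypothetical stand-in for the DSL's regex-constrained completion function;
// the real name, receiver, and parameters exposed by this PR may differ.
fun promptMatchingRegex(
    prompt: String,
    regex: Regex,
    maxNewTokens: Int = 30,
    stopAfterMatch: Boolean = true,
): String {
    // Real implementation: iterate maxTokens = 1 completions with a logitBias
    // built from tokens partially matching `regex`. Here: a canned answer.
    return "(555) 010-4242"
}

fun main() {
    val phone = promptMatchingRegex(
        prompt = "Return a US phone number:",
        regex = Regex("""\(\d{3}\) \d{3}-\d{4}"""),
    )
    println(phone) // prints the canned "(555) 010-4242"
}
```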
Be mindful that for very complex prompts or regexes, this can result in heavy OpenAI usage.
It would be interesting to explore this approach with more complex ways of matching those tokens (like grammars), working with local models that have no `logitBias` size or API limitations, or incorporating other `logitBias` strategies into other functions of the DSL.

cc/ @xebia-functional/team-ai