Outlines v0.1.0
⚡ Performance Improvements
- Outlines Core: Faster FSM index construction with a new implementation (#1175).
- 98% Reduction in Runtime Overhead: Reduced overhead by caching the FSM token mask as a `torch.Tensor` (#1013); see the sketch after this list.
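For intuition, here is a minimal illustration of the technique behind #1013. This is not the library's internal code, just the general idea: precompute a tensor mask of the tokens an FSM state allows, then reuse it on every decoding step.

```python
import torch

# Illustration only (not Outlines' internals): build, once per FSM state,
# a tensor mask of the tokens that state allows, then reuse it at every
# decoding step instead of rebuilding a Python-level mask each time.
def build_mask(allowed_token_ids: list[int], vocab_size: int) -> torch.Tensor:
    mask = torch.full((vocab_size,), float("-inf"))
    mask[allowed_token_ids] = 0.0
    return mask

vocab_size = 32_000
state_mask = build_mask([17, 42, 1337], vocab_size)  # cached per state

logits = torch.randn(vocab_size)       # logits for one decoding step
masked_logits = logits + state_mask    # disallowed tokens become -inf
```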
🚀 New Features
- Transformers Vision Models: Apply structured generation to vision + text inputs (#1052); see the first sketch after this list.
- OpenAI-Compatible API Support: Use `models.openai` with any OpenAI-compatible API (e.g. vLLM, ChatGPT), including structured generation with `generate.json` and `generate.choice` (#1142); see the second sketch after this list.
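A minimal usage sketch for the vision feature. The checkpoint is one possible choice, `model_class` is required (see #1077), and details such as the `<image>` placeholder depend on the chosen model:

```python
import outlines
from PIL import Image
from transformers import LlavaNextForConditionalGeneration

# Sketch, not canonical: load a vision-language model; the checkpoint
# here is an illustrative choice.
model = outlines.models.transformers_vision(
    "llava-hf/llava-v1.6-mistral-7b-hf",
    model_class=LlavaNextForConditionalGeneration,
)

# Structured generation over combined text + image inputs.
generator = outlines.generate.choice(model, ["cat", "dog"])
image = Image.open("pet.jpg")
answer = generator("<image> Which animal is shown?", [image])
```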
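And a sketch of the OpenAI integration. The model name is an assumption; a self-hosted OpenAI-compatible server such as vLLM can be targeted by pointing the OpenAI client's `base_url` at it:

```python
from pydantic import BaseModel

from outlines import generate, models

class Flight(BaseModel):
    origin: str
    destination: str

# Sketch: reference an OpenAI model by name (reads OPENAI_API_KEY from
# the environment); an OpenAI-compatible server like vLLM can be used
# instead via the OpenAI client's base_url.
model = models.openai("gpt-4o-mini")

# Structured generation through the OpenAI-compatible API (#1142).
get_flight = generate.json(model, Flight)
flight = get_flight("Extract the flight: 'I flew from LAX to JFK.'")
```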
💡 Enhancements
- Unified Logits Processors: All models now use the shared `outlines.processors`, which this release extends to llama-cpp, vLLM, and ExLlamaV2; see the sketch after this list.
- Custom Regex Parsers: Simplify the implementation of custom `Guide` classes with regex parser support (#1039).
- Byte Tokenizer Support: Outlines is now compatible with Qwen-style byte tokenizers (#1153).
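To illustrate the unified interface, a sketch assuming a llama-cpp model pulled from the Hugging Face Hub; the repo and file names are illustrative choices, and the same `generate` calls route through `outlines.processors` for every backend:

```python
from outlines import generate, models

# Sketch: the GGUF repo and file names below are illustrative choices.
model = models.llamacpp(
    "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    "mistral-7b-instruct-v0.2.Q4_K_M.gguf",
)

# The identical call works for transformers, vLLM, mlx-lm, or ExLlamaV2
# models, since all backends now share outlines.processors.
generator = generate.regex(model, r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
date = generator("The ISO date today is ")
```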
🐛 Bug Fixes
- CFG Beta: Fixed a large number of bugs to enable a beta version of grammar-based generation using Lark (#1067); see the sketch after this list.
- Fixed incorrect argument order breaking some models in `models.transformers_vision` (#1077).
- Resolved OpenAI fallback tokenizer issue (#1046).
- Added an option to disable tqdm bars during inference with vLLM (#1004).
- `models.llamacpp` no longer includes an implicit `max_tokens` (#996).
- Fixed whitespace handling for `models.mlxlm` (#1003).
- `models.mamba` now works and supports structured generation (#1040).
- Resolved a `pad_token_id` reset issue in `TransformerTokenizer` (#1068).
- Fixed `outlines.generate` generator reuse causing runtime errors (#1160).
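A sketch of the beta grammar-based generation. The checkpoint is an assumption; `outlines.grammars.arithmetic` is one of the Lark grammars bundled with the library:

```python
from outlines import generate, grammars, models

# Sketch: grammar-based (CFG) generation is in beta in this release.
# The checkpoint below is an arbitrary small transformers model.
model = models.transformers("microsoft/Phi-3-mini-4k-instruct")

# Constrain output to the bundled Lark arithmetic grammar.
generator = generate.cfg(model, grammars.arithmetic)
expression = generator("Write an arithmetic expression for 2 plus 2: ")
```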
⚠️ Breaking Changes
- `outlines.integrations` is now deprecated (#1061); a migration sketch follows below.
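A hedged migration sketch: the shared processors now live under `outlines.processors`, though exact constructor signatures may differ from the deprecated `outlines.integrations` classes, and the tokenizer below is a hypothetical choice:

```python
# Before (deprecated in this release):
# from outlines.integrations.vllm import RegexLogitsProcessor

# After: the shared processors live in outlines.processors.
from transformers import AutoTokenizer

from outlines.models.transformers import TransformerTokenizer
from outlines.processors import RegexLogitsProcessor

# Sketch: build a processor against an (arbitrarily chosen) tokenizer.
tokenizer = TransformerTokenizer(AutoTokenizer.from_pretrained("gpt2"))
processor = RegexLogitsProcessor(r"[0-9]+", tokenizer)
```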
Full Changeset
- Add contributors, creation and update date by @rlouf in #1000
- Add SimToM prompt recipe by @kstathou in #1002
- Ensure `models.llamacpp` Doesn't Have Implicit `max_tokens` by @lapp0 in #996
- fix `models.mlxlm` whitespace prefix handling by @lapp0 in #1003
- Adding the option to avoid displaying tqdm bars at inference with `vllm` by @BIMM99 in #1004
- Update `pyproject.toml`, enable `mlx-lm` requirement only on darwin, disable `vllm` requirement on darwin by @lapp0 in #1005
- Add asknews example by @robcaulk in #1008
- Improve `outlines.processors`, add integration tests to test_generate.py by @lapp0 in #998
- Abridged version of the .txt article on Coding For Structured Generat… by @willkurt in #1012
- `RegexFSM`: Cache Legal-Token Mask as `torch.Tensor` to Improve Performance by @lapp0 in #1013
- JSON response format fix by @rshah713 in #1029
- Fix broken link in README.md regarding Serving with vLLM by @shinohara-rin in #1030
- Major bug & fix: Fix bug in batched multi sample generation by @JulesGM in #1025
- Update file contributors style for docs by @gtsiolis in #1034
- fix: ocd missing blank space by @talboren in #1036
- Adds support for custom regex parsers (for multimodal structured generation) by @leloykun in #1039
- Update `models.transformers` to use `SequenceGeneratorAdapter` and `OutlinesLogitsProcessors` by @lapp0 in #966
- Use `outlines.processors` for `models.llamacpp` by @lapp0 in #997
- Add QA with Citations example to the Cookbook by @alonsosilvaallende in #1042
- Fix mamba integration by making it a variant of `outlines.models.transformers` by @lapp0 in #1040
- fix PyPI name for autoawq by @davanstrien in #1045
- add fallback tokenizer by @JerryKwan in #1046
- Update cerebrium instructions by @milo157 in #1047
- Add Knowledge Graph Extraction example to the Cookbook by @alonsosilvaallende in #1049
- Correct link and add llama-cpp-python installation instructions by @alonsosilvaallende in #1051
- Introduce `outlines.models.transformers_vision` by @lapp0 in #1052
- Use `outlines.processors` and `SequenceGeneratorAdapter` for `outlines.models.vllm` by @lapp0 in #1053
- Modal documentation: fix deprecated `memory=` parameter by @Perdu in #1057
- Documentation: Fix failing Modal example by @Perdu in #1058
- Add links to the two examples by @alonsosilvaallende in #1062
- Add link to Multimodal Structured Generation (MMSG) library to docs by @leloykun in #1064
- Correct the documentation to disable caching by @alonsosilvaallende in #1069
- Fix TransformerTokenizer pad_token_id reset by @ispobock in #1068
- Fix link to `mamba` model reference by @rlouf in #1072
- More detailed mlxlm documentation by @lapp0 in #1074
- Add chain of thought example by @alonsosilvaallende in #1087
- Fix coverage `exclude_lines` setting for ellipsis by @brandonwillard in #1089
- Add ReAct agent example by @alonsosilvaallende in #1090
- Include SPDX license info in project metadata by @tiran in #1094
- Update Documentation by @lapp0 in #1063
- Make `model_class` required arg, default `processor_class` to `AutoProcessor` by @parkervg in #1077
- Fix details in the documentation by @rlouf in #1096
- Change cookbook examples: Download model weights in the hub cache folder by @alonsosilvaallende in #1097
- Correct variable name in chain-of-thought example by @cpfiffer in #1101
- Remove deprecated `outlines.integrations` by @rlouf in #1061
- Use relative coverage source paths by @brandonwillard in #1113
- Add missing CI matrix step by @brandonwillard in #1124
- Update modal example by @cpfiffer in #1111
- docs: fix typo by @billmetangmo in #1120
- Pass `text` and `images` as kwargs to VLM processor by @lapp0 in #1126
- Update `CFGGuide` to use `outlines.fsm.parsing`. Enable `generate.cfg` by @lapp0 in #1067
- Include hidden files in coverage CI upload by @brandonwillard in #1136
- Add documentation request issue template by @cpfiffer in #1138
- Set the tokenizer versions in `test_create_fsm_index_tokenizer` by @brandonwillard in #1139
- Change Outlines' logo by @rlouf in #1143
- Update logo size in documentation by @rlouf in #1144
- Improve sampler docs by @cpfiffer in #1141
- Update vllm.md by @americanthinker in #1137
- Correct pathways, update site color, front page fixes by @cpfiffer in #1146
- Change the color of the logo by @rlouf in #1155
- Remove Broken pyairports Package, Replace with airportsdata by @lapp0 in #1156
- Enable Tokenizers with Byte Tokens by @lapp0 in #1153
- Integrate OpenAI API Structured Generation by @lapp0 in #1142
- Remove link to Outlines twitter account by @rlouf in #1168
- Don't re-use logits processors in SequenceGeneratorAdapter, copy them by @lapp0 in #1160
- Fix benchmark workflow triggers by @brandonwillard in #1170
- Reuse jinja environment for a prompt by @jantrienes in #1162
- Use Faster FSM by @lapp0 in #1175
- Pin `outlines-core` version by @brandonwillard in #1187
- add missing comma in llamacpp docs by @cpfiffer in #1190
- Minor Changes on Top of Isamu's EXL2 PR (Docs, Enable FSM/CFG, Fix Tokenizer) by @lapp0 in #1191
- Version the Documentation by @lapp0 in #1059
New Contributors
- @BIMM99 made their first contribution in #1004
- @robcaulk made their first contribution in #1008
- @willkurt made their first contribution in #1012
- @rshah713 made their first contribution in #1029
- @shinohara-rin made their first contribution in #1030
- @JulesGM made their first contribution in #1025
- @gtsiolis made their first contribution in #1034
- @talboren made their first contribution in #1036
- @JerryKwan made their first contribution in #1046
- @milo157 made their first contribution in #1047
- @Perdu made their first contribution in #1057
- @ispobock made their first contribution in #1068
- @tiran made their first contribution in #1094
- @cpfiffer made their first contribution in #1101
- @billmetangmo made their first contribution in #1120
- @americanthinker made their first contribution in #1137
- @jantrienes made their first contribution in #1162
Full Changelog: 0.0.46...0.1.0