This repository contains the implementation of Constrained Decoding of Diffusion LLMs with Context-Free Grammars, including techniques for multi-region constrained generation. Our method guarantees syntactic correctness while improving functional correctness by up to 7%.
We present the first generalized method for constrained decoding of multi-region infilling and out-of-order generation models. Our approach:
- Works with SOTA diffusion LLMs like LLaDA, Dream-Coder and DiffuCoder for non-autoregressive generation
- Also works for Fill-in-the-Middle (FIM) and Multi-Region Infilling (MRI) models like StarCoder, DeepSeek Coder, and CodeGemma
- Supports multiple constraint languages through context-free grammars (examples provided are JSON Schema, C++, and SMILES)
- Guarantees syntactic correctness with respect to the grammar
- Improves functional correctness by up to 7% with minimal computational overhead
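To sketch the core mechanism, here is a self-contained toy (a simplified, left-to-right illustration, not the repository's implementation, which also handles non-autoregressive and out-of-order generation): at each step, candidate tokens that cannot be extended to a word of the target language are masked out, so the final output is syntactically valid by construction. A balanced-parentheses language stands in for a full context-free grammar.

```python
def is_valid_prefix(s: str) -> bool:
    """True if s can still be extended to a balanced-parentheses string."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # more closers than openers: unrecoverable
                return False
        else:
            return False  # character outside the grammar's alphabet
    return True

def constrained_decode(propose, max_len: int) -> str:
    """Greedy loop: `propose` ranks candidate tokens; take the
    highest-ranked token that keeps the output a valid prefix."""
    out = ""
    for _ in range(max_len):
        for tok in propose(out):  # tokens in model-preference order
            if is_valid_prefix(out + tok):
                out += tok
                break
        else:
            break  # no candidate keeps the output inside the grammar
    return out

# A dummy "model" that always prefers ")" (often invalid): the constraint
# mechanism falls back to "(" whenever ")" would break the prefix.
print(constrained_decode(lambda s: [")", "("], max_len=6))  # → ()()()
```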
We recommend using a virtual environment to avoid conflicts with other Python packages.
- Clone the repository and set up a virtual environment:

```bash
git clone https://github.com/eth-sri/constrained-diffusion.git
cd constrained-diffusion
python3 -m venv venv
source venv/bin/activate
```

- Build and install the Rust bindings:

```bash
cd rustformlang_bindings
pip install maturin
maturin build --release
pip install .
cd ..
```

- Install the main package:

```bash
pip install -e .
```

- Verify the installation:

```bash
pytest tests
```

Check out example.py for a complete example of how to use the constrained decoding mechanism.
In general, you first load a model and then a constraint language, such as C++ or JSON Schema. The example below shows abbreviated code for using the GSAI-ML/LLaDA-8B-Instruct model with a C++ constraint.
Replace the model name with any diffusion LLM of your choice, such as apple/DiffuCoder-7B-Instruct.
```bash
python3 example.py
```

This is a visualization of our constrained decoding mechanism on output similar to that created by LLaDA 7b.
```
├── constrained_diffusion/     # Main package
│   ├── constrain_utils.py     # Constraint generation utilities
│   ├── cfgs/                  # Context-free grammar definitions
│   └── eval/                  # Evaluation frameworks
│       ├── dllm/              # Evaluation framework for DLLMs
│       └── mri/               # Evaluation framework for Multi-Region Infilling
├── rustformlang/              # Rust formal language library
├── rustformlang_bindings/     # Python bindings for Rust library
├── eval/                      # Evaluation scripts and results
│   ├── dllm/                  # DLLM task evaluations
│   ├── mri/                   # Multi-Region Infilling evaluations
│   └── figures/               # Result visualization
├── benchmark_generation/      # Benchmark generation tools
└── docs/                      # Project website
```
We run MRI and diffusion LLMs on the following datasets:
| Dataset | Setting | Description | Download |
|---|---|---|---|
| C++ | MRI | C++ code generation tasks with multi-region infilling | 🤗 HuggingFace |
| C++ | DLM | C++ code generation tasks with diffusion LLMs | 🤗 HuggingFace |
| JSON | DLM | Data extraction following a JSON Schema | 🤗 HuggingFace |
| SMILES | DLM | Chemical compound representation in SMILES | 🤗 HuggingFace |
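To make the JSON setting concrete, here is a self-contained illustration (the schema fields and the item are invented for this sketch, not taken from the dataset) of the property a schema constraint enforces: the decoded string is guaranteed to parse, and evaluation then checks it against the schema.

```python
import json

# Illustrative schema in the spirit of the data-extraction tasks
# (hypothetical fields, not from the actual dataset).
schema_fields = {"name": str, "year": int}

def matches(obj: dict, fields: dict) -> bool:
    """Tiny stand-in for a JSON Schema check: exact keys, exact types."""
    return set(obj) == set(fields) and all(
        isinstance(obj[k], t) for k, t in fields.items()
    )

generated = '{"name": "aspirin", "year": 1897}'  # a syntactically valid output
obj = json.loads(generated)                      # guaranteed to parse
print(matches(obj, schema_fields))               # → True
```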
You can download the results of our evaluation using the following link: Download Results. Unzip the file into the results/ directory to access the evaluation results.
For the MRI models, we provide an execution harness for the C++ HumanEval multi-region dataset. To execute task 11 on the 1-region dataset with constraints and traces enabled, use the following command:
```bash
python3 -m constrained_diffusion.eval.mri.generic_inference \
    --max-tokens 256 \
    --model_name deepseek-ai/deepseek-coder-6.7b-base \
    --seed 0 \
    --temp 1 \
    --dataset-name HumanEval/MRI/cpp/1 \
    --constrained True \
    --trace True \
    --task_id /11_
```

For the diffusion LLMs, use the following command for the JSON dataset.
```bash
python3 -m constrained_diffusion.eval.dllm.generic_inference \
    --max-tokens 256 \
    --model_name apple/DiffuCoder-7B-Instruct \
    --seed 0 \
    --temp 0.2 \
    --dataset-name jsonschema \
    --steps 32 \
    --constrained True \
    --trace True \
    --task_id _37
```

A general orchestration script for all experiments in the main paper is provided in eval/mri/run_mri.py and eval/dllm/run_dllm.py.
The results are stored in the results/ directory, with each configuration's results in a separate file.
Evaluation of result correctness is decoupled from the inference step. The following assumes that the inference step above was executed correctly and the results lie in the results/ directory.
Note: For SMILES evaluation, you need to install `rdkit` and `partialsmiles`:

```bash
pip install rdkit partialsmiles
```
Make sure to have sufficient memory and CPU cores available, as the evaluation scripts can be memory-intensive.
```bash
# Evaluate all files in the results folder
bash eval/check_all_individually.sh results/*
```

You can find more details on the evaluation scripts, for example on how to reproduce the figures from the paper, in the README in the eval/ directory: README.
We welcome contributions! When contributing, please make sure to activate pre-commit hooks to ensure code quality and consistency. You can install pre-commit hooks with:

```bash
pip install pre-commit
pre-commit install
```

To add a new constraint language:

- Define the grammar in `constrained_outoforder/cfgs/`
- Implement the lexical mapping in `check_lex_map.py`
- Add tests in `tests/test_cfgs/`
- Update the documentation

To add a new task:

- Create a new constraint language
- Implement a dataset in `constrained_outoforder/eval/[dllm|mri]/datasets/your_task.py`
- Register the dataset using `register_dataset()`
- Add evaluation logic in `eval/[dllm|mri]/your_task/checker.py`

To add a new model:

- Implement the model in `constrained_outoforder/eval/[dllm|mri]/models/your_model.py`
- Register the model using `register_model()`
This project is licensed under the MIT License - see the LICENSE file for details.
- Paper: arXiv:2508.10111
- Project Website: Constrained Decoding Paper Website + Demo
- Rustformlang README: Rustformlang Docs
If you use this work in your research, please cite:
```bibtex
@article{mundler2025constraineddiffusion,
  title={Constrained Decoding of Diffusion LLMs with Context-Free Grammars},
  author={Niels Mündler and Jasper Dekoninck and Martin Vechev},
  year={2025},
  eprint={2508.10111},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2508.10111}
}
```

This work was done by the Secure, Reliable and Intelligent Systems Lab at ETH Zurich.
