llama_cpp - JSON fails to generate when using Pydantic model with models.llama_cpp #1110
Comments
This fixes the problem (or at least it does in my case). For those who would like to test in advance:
|
Thanks for directing people to that branch! Still a work in progress, only a subset of the failure cases are handled right now. Happy to hear more json failure reports to help me ensure I address all problems! |
@lapp0 Since yesterday I've noticed other errors despite the fix already being committed. Whenever I come across an error, I'll post it here. |
Traceback (most recent call last):
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/main.py", line 1160, in parse_raw
obj = parse.load_str_bytes(
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/deprecated/parse.py", line 49, in load_str_bytes
return json_loads(b) # type: ignore
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 3328 (char 3327)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/home/pierre/llmdoc/outlines_server/outlines_server/DocumentClassifier.py", line 131, in <module>
File "/mnt/home/pierre/llmdoc/outlines_server/outlines_server/DocumentClassifier.py", line 92, in classify
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/api.py", line 511, in __call__
return self._format(completions)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/api.py", line 487, in _format
return self.format_sequence(sequences)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/json.py", line 50, in <lambda>
generator.format_sequence = lambda x: schema_object.parse_raw(x)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/main.py", line 1187, in parse_raw
raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for DocumentClassificationResult
__root__
Expecting ',' delimiter: line 1 column 3328 (char 3327) [type=value_error.jsondecode, input_value='{ "label": "tax_notice",...99999999999999999999999', input_type=str] |
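One possible reading of the traceback above: the `input_value` ends in a long run of 9s, which is consistent with a number that never terminated before generation stopped, so the JSON object was never closed. The sketch below is only an illustration with the standard library, not the Outlines code path; the `confidence` field and the `try_parse` helper are hypothetical names, and only the `"label": "tax_notice"` pair comes from the error message.

```python
import json


def try_parse(raw: str):
    """Return (obj, None) on success or (None, message) if raw is not valid JSON."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        # exc.msg is the short reason, exc.pos the character offset of the failure
        return None, f"{exc.msg} (char {exc.pos})"


# Hypothetical output cut off in the middle of a number: no closing brace.
truncated = '{ "label": "tax_notice", "confidence": 0.' + "9" * 40

obj, err = try_parse(truncated)
print(err)

# The same text with the object closed parses without error.
closed_obj, closed_err = try_parse(truncated + "}")
print(closed_obj["label"])
```

If the failure really is truncation, checking the raw string this way before handing it to Pydantic makes it easier to tell a cut-off generation apart from a schema mismatch.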
Anything I can help with here? Currently getting bit by this issue, using the dev branch.
import outlines
import llama_cpp
# Load the model
model = outlines.models.llamacpp(
"NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
"Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
"NousResearch/Hermes-2-Pro-Llama-3-8B"
),
n_gpu_layers=-1,
n_ctx=8192,
verbose=False
)
# Request 100 samples from the multinomial sampler
sampler = outlines.samplers.multinomial(samples=100)
# Define the coinflip choice
coinflip_regex_pattern = r"[H|T]"
generator = outlines.generate.choice(
model,
["H", "T"],
sampler=sampler
)
output = generator("Flip a coin, respond with H or T: ")
print(output)
# Count the occurrences of each outcome
heads_count = output.count("H")
tails_count = output.count("T")
print(f"Heads: {heads_count}, Tails: {tails_count}")
Versions:
```python
3-1==1.0.0
absl-py==2.1.0
accelerate==0.34.2
acme==2.11.0
aiofiles==23.2.1
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiolimiter==1.1.0
aiosignal==1.3.1
aiostream==0.5.2
altair==5.3.0
annotated-types==0.7.0
anyio==3.7.1
appdirs==1.4.4
argcomplete==3.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
args==0.1.0
arrow==1.3.0
asgiref==3.8.1
asttokens==2.4.1
astunparse==1.6.3
async-lru==2.0.4
async-timeout==4.0.3
atom==0.10.5
atomicwrites==1.4.0
attrs==24.2.0
azure-core==1.30.2
azure-identity==1.17.1
azure-storage-blob==12.21.0
b2sdk==2.4.1
Babel==2.15.0
backcall==0.2.0
backoff==2.2.1
backports.tarfile==1.2.0
bcc==0.18.0
bcrypt==4.2.0
beartype==0.18.5
beautifulsoup4==4.12.3
bitsandbytes==0.43.3
black==24.4.2
bleach==6.1.0
blinker==1.8.2
boto3==1.34.149
botocore==1.34.149
boxsdk==3.11.0
Brlapi==0.8.3
bs4==0.0.2
build==1.2.1
cachetools==5.4.0
cairocffi==1.7.1
CairoSVG==2.7.1
certbot==2.11.0
certbot-nginx==2.11.0
certifi==2024.8.30
cffi==1.16.0
chardet==3.0.4
charset-normalizer==3.3.2
chroma-hnswlib==0.7.6
chromadb==0.5.5
chrome-gnome-shell==0.0.0
click==8.1.7
clint==0.5.1
cloudpickle==3.0.0
cmake==3.30.1
cmdstanpy==1.2.4
colorama==0.4.6
coloredlogs==15.0.1
comm==0.2.2
command-not-found==0.3
ConfigArgParse==1.7
configobj==5.0.8
contourpy==1.2.1
coverage==6.2
cryptography==3.4.8
cssselect2==0.7.0
cupshelpers==1.0
cycler==0.12.1
dataclasses-json==0.6.7
datasets==2.21.0
dbus-python==1.2.18
debtcollector==3.0.0
debugpy==1.8.2
decorator==5.1.1
defer==1.0.6
defusedxml==0.7.1
demjson3==3.0.6
Deprecated==1.2.14
deprecation==2.1.0
dill==0.3.8
dirtyjson==1.0.8
discodos==1.0rc2
diskcache==5.6.3
distlib==0.3.8
distro==1.9.0
dm-tree==0.1.8
dnspython==2.6.1
docker==7.1.0
docker-compose==1.29.2
docker-pycreds==0.4.0
dockerpty==0.4.1
docopt==0.6.2
docstring_parser==0.16
docx2txt==0.8
dropbox==12.0.2
duplicity==3.0.0
ecdsa==0.19.0
EditorConfig==0.12.4
einops==0.8.0
email_validator==2.2.0
exceptiongroup==1.2.2
executing==2.0.1
fail2ban==0.11.2
fairscale==0.4.13
faiss-cpu==1.8.0.post1
fastapi==0.111.1
fastapi-cli==0.0.4
fasteners==0.19
fastjsonschema==2.20.0
fasttext==0.9.3
ffmpy==0.3.2
filelock==3.16.0
fire==0.6.0
FLAML==2.1.2
flatbuffers==24.3.25
fonttools==4.53.1
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.6.1
future==1.0.0
gast==0.6.0
gdata-python3==3.0.1
gguf==0.9.1
ghp-import==2.1.0
gitdb==4.0.11
GitPython==3.1.43
google-api-core==2.19.1
google-api-python-client==2.138.0
google-auth==2.32.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.1
google-pasta==0.2.0
googleapis-common-protos==1.63.2
gotrue==2.7.0
gradio==4.39.0
gradio_client==1.1.1
greenlet==3.0.3
griffe==1.2.0
grpcio==1.65.1
grpclib==0.4.7
h11==0.14.0
h2==4.1.0
h5py==3.11.0
hidpidaemon==18.4.6
hpack==4.0.0
html2text==2024.2.26
html5lib==1.1
httpcore==1.0.5
httpie==3.2.3
httplib2==0.22.0
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.24.6
humanfriendly==10.0
humanize==4.10.0
hyperframe==6.0.1
icontract==2.6.6
idna==3.8
importlib_metadata==8.0.0
importlib_resources==6.4.0
iniconfig==2.0.0
interegular==0.3.3
ip-associations-python-novaclient-ext==0.2
ipykernel==6.29.5
ipython==8.26.0
ipython-genutils==0.2.0
ipywidgets==8.1.3
iso8601==2.1.0
isodate==0.6.1
isoduration==20.11.0
jaraco.classes==3.4.0
jaraco.context==5.3.0
jaraco.functools==4.0.1
jax==0.4.30
jaxlib==0.4.30
jedi==0.19.1
jeepney==0.8.0
Jinja2==3.1.4
jmespath==1.0.1
joblib==1.4.2
josepy==1.14.0
jottalib==0.5.1
jsbeautifier==1.15.1
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.4
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.11
keras==3.4.1
kernelstub==3.1.4
keyring==25.2.1
keystoneauth1==2.18.0
kiwisolver==1.4.5
kubernetes==30.1.0
lancedb==0.11.0
language-selector==0.1
lark==1.2.2
launchpadlib==2.0.0
lazr.restfulclient==0.14.6
lazr.uri==1.0.6
lib==4.0.0
libclang==18.1.1
lit==18.1.8
livereload==2.7.0
llama-cloud==0.0.11
llama-index==0.10.58
llama-index-agent-openai==0.2.9
llama-index-cli==0.1.13
llama-index-core==0.10.58
llama-index-embeddings-azure-openai==0.1.11
llama-index-embeddings-openai==0.1.11
llama-index-indices-managed-llama-cloud==0.2.7
llama-index-legacy==0.9.48
llama-index-llms-azure-openai==0.1.10
llama-index-llms-openai==0.1.27
llama-index-multi-modal-llms-openai==0.1.8
llama-index-program-openai==0.1.7
llama-index-question-gen-openai==0.1.3
llama-index-readers-file==0.1.31
llama-index-readers-llama-parse==0.1.6
llama-parse==0.4.9
llama_cpp_python==0.2.90
llamaindex-py-client==0.1.19
llvmlite==0.43.0
lm-format-enforcer==0.10.6
lockfile==0.12.2
logfury==1.0.1
louis==3.20.0
lxml==5.2.2
macaroonbakery==1.3.4
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.1
matplotlib-inline==0.1.7
mdurl==0.1.2
mediafire==0.6.1
megatools==0.0.4
memgpt==0.1.0
mergedeep==1.3.4
mistral_common==1.3.4
mistune==3.0.2
mkdocs==1.6.1
mkdocs-autorefs==1.2.0
mkdocs-get-deps==0.2.0
mkdocs-git-committers-plugin-2==2.3.0
mkdocs-git-revision-date-localized-plugin==1.2.8
mkdocs-material==9.5.34
mkdocs-material-extensions==1.3.1
mkdocs-mermaid2-plugin==1.1.1
mkdocs-section-index==0.3.9
mkdocstrings==0.26.1
mkdocstrings-python==1.11.1
ml-dtypes==0.4.0
mmh3==4.1.0
mock==5.1.0
modal==0.64.94
monotonic==1.6
more-itertools==10.3.0
mpmath==1.3.0
msal==1.30.0
msal-extensions==1.2.0
msgpack==1.0.8
msgpack-python==0.5.6
msgspec==0.18.6
multidict==6.1.0
multiprocess==0.70.16
musicbrainzngs==0.7.1
mypy-extensions==1.0.0
namex==0.0.8
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
netaddr==1.3.0
netifaces==0.11.0
networkx==3.3
nltk==3.8.1
notebook==7.2.1
notebook_shim==0.2.4
numba==0.60.0
numpy==1.26.4
nvidia-cublas-cu11==11.11.3.6
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==9.2.1.18
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.3.0.86
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.5.86
nvidia-cusparse-cu12==12.1.0.106
nvidia-ml-py==12.560.30
nvidia-nccl-cu11==2.21.5
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.82
nvidia-nvtx-cu11==11.8.86
nvidia-nvtx-cu12==12.1.105
oauth2client==4.1.3
oauthlib==3.2.2
olefile==0.46
onnxruntime==1.18.1
openai==1.37.1
opentelemetry-api==1.26.0
opentelemetry-exporter-otlp-proto-common==1.26.0
opentelemetry-exporter-otlp-proto-grpc==1.26.0
opentelemetry-instrumentation==0.47b0
opentelemetry-instrumentation-asgi==0.47b0
opentelemetry-instrumentation-fastapi==0.47b0
opentelemetry-proto==1.26.0
opentelemetry-sdk==1.26.0
opentelemetry-semantic-conventions==0.47b0
opentelemetry-util-http==0.47b0
opt-einsum==3.3.0
optree==0.12.1
orjson==3.10.6
os-diskconfig-python-novaclient-ext==0.1.3
os-networksv2-python-novaclient-ext==0.26
os-virtual-interfacesv2-python-novaclient-ext==0.20
oslo.config==4.12.0
oslo.i18n==3.12.0
oslo.serialization==2.16.1
oslo.utils==3.22.3
outlines @ git+https://github.com/lapp0/outlines.git@dc21915
overrides==7.4.0
packaging==24.1
paginate==0.5.7
pandas==2.2.2
pandocfilters==1.5.0
paramiko==2.9.3
parsedatetime==2.6
parso==0.8.3
partial-json-parser==0.2.1.1.post4
pathspec==0.11.2
pathtools==0.1.2
pbr==1.10.0
perscache==0.6.1
pexpect==4.8.0
pickleshare==0.7.5
pillow==10.4.0
pipx==1.0.0
platformdirs==3.10.0
pluggy==1.3.0
ply==3.11
pop-transition==1.1.2
portalocker==2.8.2
positional==1.2.1
postgrest==0.16.10
posthog==3.3.4
prettytable==0.7.2
prometheus-fastapi-instrumentator==7.0.0
prometheus_client==0.20.0
prompt_toolkit==3.0.47
proto-plus==1.24.0
protobuf==4.23.3
psutil==5.9.5
ptyprocess==0.7.0
pulsar-client==3.4.0
pure-eval==0.2.2
py==1.11.0
py-cpuinfo==9.0.0
pyairports==2.1.1
pyarrow==17.0.0
pyarrow-hotfix==0.6
pyasn1==0.5.0
pyasn1-modules==0.3.0
pyautogen==0.2.9
pybind11==2.10.4
pycairo==1.20.1
pycountry==24.6.1
pycparser==2.21
pycups==2.0.1
pydantic==2.9.1
pydantic_core==2.23.3
pydbus==0.6.0
PyDrive2==1.20.0
pydub==0.25.1
Pygments==2.18.0
PyGObject==3.42.1
PyICU==2.8.1
pyinotify==0.9.6
PyJWT==2.3.0
pylance==0.15.0
pyliblo==0.10.0
pymacaroons==0.13.0
pymdown-extensions==10.9
pymemgpt==0.3.5
PyMuPDF==1.23.26
PyMuPDFb==1.23.22
PyNaCl==1.5.0
pyOpenSSL==21.0.0
pyparsing==2.4.7
pypdf==4.3.1
PyPika==0.48.9
pyproject_hooks==1.0.0
PyQt5==5.15.6
PyQt5-sip==12.9.1
pyrax==1.10.0
pyre-extensions==0.0.29
pyRFC3339==1.1
pyrsistent==0.18.1
PySocks==1.7.1
pytest==7.4.0
python-apt==2.4.0+ubuntu3
python-box==7.1.1
python-dateutil==2.9.0.post0
python-debian==0.1.43+ubuntu1.1
python-dotenv==0.19.2
python-gettext==5.0
python-gnupg==0.4.8
python-json-logger==2.0.7
python-keystoneclient==3.10.0
python-multipart==0.0.9
python-novaclient==2.27.0
python-swiftclient==4.6.0
python-xlib==0.29
python3-discogs-client==2.3.5
pytz==2024.1
pyxdg==0.27
PyYAML==6.0.2
pyyaml_env_tag==0.1
pyzmq==25.1.1
qtconsole==5.4.3
QtPy==2.4.0
questionary==2.0.1
rackspace-auth-openstack==1.3
rackspace-novaclient==2.1
ratelimiter==1.2.0.post0
rax-default-network-flags-python-novaclient-ext==0.4.0
rax-scheduled-images-python-novaclient-ext==0.3.1
ray==2.35.0
rdflib==6.1.1
realtime==2.0.1
referencing==0.35.1
regex==2023.6.3
repolib==2.2.1
repoman==1.4.0
requests==2.32.3
requests-oauthlib==1.3.1
requests-toolbelt==0.9.1
retry==0.9.2
rfc3339-validator==0.1.4
rfc3986==2.0.0
rfc3986-validator==0.1.1
rich==13.6.0
rpds-py==0.20.0
rsa==4.9
ruff==0.5.5
s3transfer==0.10.2
safetensors==0.4.5
scipy==1.11.1
screen-resolution-extra==0.0.0
SecretStorage==3.3.1
semantic-version==2.10.0
semver==3.0.2
Send2Trash==1.8.2
sentencepiece==0.2.0
sentry-sdk==1.30.0
sessioninstaller==0.0.0
setproctitle==1.3.2
shellingham==1.5.4
sigtools==4.0.1
simplejson==3.19.2
six==1.16.0
smmap==5.0.0
sniffio==1.3.0
soupsieve==2.4.1
SPARQLWrapper==1.8.5
speedtest-cli==2.1.3
SQLAlchemy==2.0.25
sqlalchemy-json==0.7.0
sqlmodel==0.0.16
ssh-import-id==5.11
stack-data==0.6.2
stanio==0.5.1
starlette==0.37.2
stevedore==1.20.1
stone==3.3.1
storage3==0.7.7
StrEnum==0.4.15
striprtf==0.0.26
supafunc==0.5.1
sympy==1.12
synchronicity==0.7.6
systemd-python==234
tabulate==0.8.9
tenacity==8.2.3
tensorboard==2.12.3
tensorboard-data-server==0.7.1
tensorflow-estimator==2.12.0
tensorflow-io-gcs-filesystem==0.32.0
tensorflow-probability==0.20.1
termcolor==2.3.0
terminado==0.17.1
texttable==1.6.4
tiktoken==0.7.0
tinycss2==1.2.1
tlslite-ng==0.7.6
tokenizers==0.19.1
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
toolz==0.12.0
torch==2.4.0
torchaudio==2.0.2+cu118
torchvision==0.19.0
tornado==6.3.3
tqdm==4.66.5
traitlets==5.14.3
transformers==4.44.2
triton==3.0.0
typer==0.12.3
types-certifi==2021.10.8.3
types-python-dateutil==2.9.0.20240316
types-toml==0.10.8.20240310
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
ubuntu-drivers-common==0.0.0
ubuntu-pro-client==8001
ufw==0.36.1
uri-template==1.3.0
uritemplate==4.1.1
urllib3==2.2.2
userpath==1.8.0
uuid7==0.1.0
uvicorn==0.23.2
uvloop==0.19.0
virtualenv==20.25.0
vllm==0.6.0
vllm-flash-attn==2.6.1
voyageai==0.2.3
wadllib==1.3.6
wandb==0.15.5
watchdog==4.0.1
watchfiles==0.21.0
wcwidth==0.2.6
webcolors==24.6.0
webdavclient3==3.14.6
webencodings==0.5.1
websocket-client==1.8.0
websockets==11.0.3
Werkzeug==2.3.6
widgetsnbextension==4.0.11
wrapt==1.14.1
xdg==5
xformers==0.0.27.post2
xkit==0.0.0
xxhash==3.5.0
yarl==1.11.1
zipp==1.0.0
zope.component==4.3.0
zope.event==4.4
zope.hookable==5.1.0
zope.interface==5.4.0
```
|
@cpfiffer Outlines' llama.cpp integration doesn't support multiple samples at once. Could you try again with samples=1?
If there's a problem with @PierreCarceller Can you please share your schema? The original issue here seems to be resolved by the |
I believe the choice issue is in #1109. Can't believe I missed the multiple samples error. Setting samples=1 (or just not explicitly setting the sampler) works. |
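For anyone who still wants 100 coin flips with llama.cpp, a rough workaround under the single-sample constraint is to call the generator once per flip and tally the results. This is only a sketch: `fake_generator` is a hypothetical stand-in so the snippet runs without a model, and `count_flips` is an illustrative helper, not part of Outlines. In practice you would pass the generator built with `outlines.generate.choice(model, ["H", "T"])` as in the script above, leaving the sampler at its default.

```python
import random
from collections import Counter


def count_flips(generator, prompt: str, n: int) -> Counter:
    """Call a single-sample generator n times and tally its string outputs."""
    return Counter(generator(prompt) for _ in range(n))


# Hypothetical stand-in for a single-sample choice generator.
rng = random.Random(0)


def fake_generator(prompt: str) -> str:
    return rng.choice(["H", "T"])


counts = count_flips(fake_generator, "Flip a coin, respond with H or T: ", 100)
print(f"Heads: {counts['H']}, Tails: {counts['T']}")
```

This trades one batched call for many sequential ones, so it is slower, but it avoids relying on multi-sample support in the llama.cpp integration.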
Describe the issue as clearly as possible:
When using models.llamacpp and creating JSON using a Pydantic model, I get an error when generating the first result (see code to reproduce below). I have run this code using models.transformers with no issue. The model I'm using in this example is taken directly from the Chain of Thought cookbook example, but I have also tried others and had the same issue.
Steps/code to reproduce the bug:
Expected result:
Error message:
Outlines/Python version information:
Version information
Context for the issue:
This issue came up while working on an ODSC workshop covering outlines. I ended up going with transformers instead of llama_cpp.