llama_cpp - JSON fails to generate when using Pydantic model with models.llama_cpp #1110

Open
willkurt opened this issue Aug 21, 2024 · 7 comments · May be fixed by lapp0/outlines#88 or #1154

@willkurt
Contributor

Describe the issue as clearly as possible:

When using models.llamacpp to generate JSON from a Pydantic model, I get an error when generating the first result (see the code to reproduce below). I have run this code using models.transformers with no issue.

The model I'm using in this example is taken directly from the Chain of Thought cookbook example, but I have also tried others and had the same issue.

Steps/code to reproduce the bug:

import llama_cpp
from outlines import generate, models
from textwrap import dedent

llama_tokenizer = llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
            "NousResearch/Hermes-2-Pro-Llama-3-8B"
            )
tokenizer = llama_tokenizer.hf_tokenizer

model = models.llamacpp("NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
            "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
            tokenizer=llama_tokenizer,
            n_gpu_layers=-1,
            flash_attn=True,
            n_ctx=8192,
            verbose=False)

complaint_data = [{'message': 'Hi, my name is Olivia Brown.I recently ordered a knife set from your wellness range, and it arrived earlier this week. Unfortunately, my satisfaction with the product has been less than ideal.My order was A123456',
  'order_number': 'A12-3456',
  'department': 'kitchen'},
 {'message': 'Hi, my name is John Smith.I recently ordered a dress for an upcoming event, which was alleged to meet my expectations both in fit and style. However, upon arrival, it became apparent that the fabric was of subpar quality, leading to a less than satisfactory appearance.The order number is A12-3456',
  'order_number': 'A12-3456',
  'department': 'clothing'},
 {'message': 'Hi, my name is Sarah Johnson.I recently ordered the ultimate ChefMaster 8 Drawer Cooktop. However, upon delivery, I discovered that one of the burners is malfunctioning.My order was A458739',
  'order_number': 'A45-8739',
  'department': 'kitchen'}]

from pydantic import BaseModel, Field, constr
from enum import Enum


class Department(str, Enum):
    clothing = "clothing"
    electronics = "electronics"
    kitchen = "kitchen"
    automotive = "automotive"

class ComplaintData(BaseModel):
    first_name: str
    last_name: str
    order_number: str = Field(pattern=r'[ADZ][0-9]{2}-[0-9]{4}')
    department: Department


def create_prompt(complaint):
    complaint_messages = [
        {
        'role': 'user',
        'content': f"""
        You are a complaint processing assistant, your aim is to process complaints and return the following information in this JSON format:
        {{
            'first_name': <first name>,
            'last_name': <last name>,
            'order number': <order number has the following format (ADZ)XX-XXXXX>,
            'department': <{"|".join([e.value for e in Department])}>,
        }}
        """},
        {'role': 'assistant',
         'content': "I understand and will process the complaints in the JSON format you described"
        },
        {'role': 'user',
        'content': complaint['message']
        }
    ]
    complaint_prompt = tokenizer.apply_chat_template(complaint_messages, tokenize=False)
    return complaint_prompt


if __name__ == "__main__":
    complaint_processor = generate.json(model, ComplaintData)
    results = []
    for complaint in complaint_data[0:10]:
        prompt = create_prompt(complaint)
        result = complaint_processor(prompt)
        print(result)

Expected result:

JSON represented by the Pydantic model.
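
For concreteness, here is a rough sketch of what a valid result for the first complaint could look like, checked with the standard library only. The field names, regex, and department values are copied from the reproduction script; `expected_raw` and `looks_valid` are hypothetical names introduced for illustration, and the use of `re.fullmatch` is an assumption (the real validation is done by Pydantic, which may apply the pattern differently).

```python
import json
import re

# Values copied from the reproduction script above.
ORDER_NUMBER_PATTERN = r"[ADZ][0-9]{2}-[0-9]{4}"
DEPARTMENTS = {"clothing", "electronics", "kitchen", "automotive"}
FIELDS = {"first_name", "last_name", "order_number", "department"}

# Hypothetical output for the first complaint, for illustration only.
expected_raw = (
    '{"first_name": "Olivia", "last_name": "Brown", '
    '"order_number": "A12-3456", "department": "kitchen"}'
)


def looks_valid(raw: str) -> bool:
    """Stdlib approximation of the checks ComplaintData would apply."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict) or set(data) != FIELDS:
        return False
    if not all(isinstance(data[name], str) for name in FIELDS):
        return False
    # Assumption: the whole string must match the pattern.
    if re.fullmatch(ORDER_NUMBER_PATTERN, data["order_number"]) is None:
        return False
    return data["department"] in DEPARTMENTS


print(looks_valid(expected_raw))
```

An incomplete JSON string, or an order number without the hyphen such as `A123456`, would be rejected by this check.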

Error message:

File "/Users/will/.venv/dev/lib/python3.11/site-packages/pydantic/main.py", line 1160, in parse_raw
    obj = parse.load_str_bytes(
          ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.venv/dev/lib/python3.11/site-packages/pydantic/deprecated/parse.py", line 49, in load_str_bytes
    return json_loads(b)  # type: ignore
           ^^^^^^^^^^^^^
  File "/Users/will/.pyenv/versions/3.11.0/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.pyenv/versions/3.11.0/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.pyenv/versions/3.11.0/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 48 (char 47)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/will/code/notes/llama_json_bug.py", line 74, in <module>
    result = complaint_processor(prompt)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.venv/dev/lib/python3.11/site-packages/outlines/generate/api.py", line 511, in __call__
    return format(completions)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.venv/dev/lib/python3.11/site-packages/outlines/generate/api.py", line 497, in format
    return self.format_sequence(sequences)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.venv/dev/lib/python3.11/site-packages/outlines/generate/json.py", line 50, in <lambda>
    generator.format_sequence = lambda x: schema_object.parse_raw(x)
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/will/.venv/dev/lib/python3.11/site-packages/pydantic/main.py", line 1187, in parse_raw
    raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for ComplaintData
__root__
  Unterminated string starting at: line 1 column 48 (char 47) [type=value_error.jsondecode, input_value='{"first_name": "Olivia", "last_name": "Brown", "', input_type=str]
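
The `input_value` in the traceback shows that generation stopped partway through the object, right after the opening quote of the third key. As a minimal sketch, the same decode error can be reproduced with the standard library alone, without loading a model; `truncated` and `decode_error` are names introduced here for illustration.

```python
import json

# The truncated completion, copied from `input_value` in the traceback above.
truncated = '{"first_name": "Olivia", "last_name": "Brown", "'

decode_error = None
try:
    json.loads(truncated)
except json.JSONDecodeError as err:
    decode_error = err
    # err.msg is the message text, err.pos the 0-based character offset.
    print(err.msg)
    print(err.pos)
```

This suggests the failure is in the generated text itself (an incomplete completion) rather than in the Pydantic model definition.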

Outlines/Python version information:

Version information

0.0.46
Python 3.11.0 (main, Jul  6 2024, 12:54:41) [Clang 15.0.0 (clang-1500.3.9.4)]
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
annotated-types==0.7.0
attrs==24.2.0
certifi==2024.7.4
charset-normalizer==3.3.2
cloudpickle==3.0.0
datasets==2.21.0
dill==0.3.8
diskcache==5.6.3
filelock==3.15.4
frozenlist==1.4.1
fsspec==2024.6.1
huggingface-hub==0.24.6
idna==3.7
interegular==0.3.3
Jinja2==3.1.4
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
lark==1.2.2
llama_cpp_python==0.2.89
llvmlite==0.43.0
MarkupSafe==2.1.5
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.3
numba==0.60.0
numpy==1.26.4
outlines==0.0.46
packaging==24.1
pandas==2.2.2
pyairports==2.1.1
pyarrow==17.0.0
pycountry==24.6.1
pydantic==2.8.2
pydantic_core==2.20.1
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.2
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
rpds-py==0.20.0
safetensors==0.4.4
six==1.16.0
sympy==1.13.2
tokenizers==0.19.1
torch==2.4.0
tqdm==4.66.5
transformers==4.44.1
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
xxhash==3.5.0
yarl==1.9.4

Context for the issue:

This issue came up while working on an ODSC workshop covering outlines. I ended up going with transformers instead of llama_cpp.

@PierreCarceller

This fixes the problem (or at least it does in my case).

For those who would like to test in advance:

pip install git+https://github.com/lapp0/outlines.git@fix-json --force-reinstall

@lapp0
Contributor

lapp0 commented Sep 4, 2024

Thanks for directing people to that branch!

It's still a work in progress; only a subset of the failure cases is handled right now. Happy to hear more JSON failure reports to help me ensure I address all problems!

@PierreCarceller

@lapp0 Since yesterday I've noticed other errors despite the fix already committed. Whenever I come across an error, I'll post it here.

@PierreCarceller

Traceback (most recent call last):
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/main.py", line 1160, in parse_raw
    obj = parse.load_str_bytes(
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/deprecated/parse.py", line 49, in load_str_bytes
    return json_loads(b)  # type: ignore
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 3328 (char 3327)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/home/pierre/llmdoc/outlines_server/outlines_server/DocumentClassifier.py", line 131, in <module>
  File "/mnt/home/pierre/llmdoc/outlines_server/outlines_server/DocumentClassifier.py", line 92, in classify
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/api.py", line 511, in __call__
    return self._format(completions)
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/api.py", line 487, in _format
    return self.format_sequence(sequences)
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/json.py", line 50, in <lambda>
    generator.format_sequence = lambda x: schema_object.parse_raw(x)
  File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/main.py", line 1187, in parse_raw
    raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for DocumentClassificationResult
__root__
  Expecting ',' delimiter: line 1 column 3328 (char 3327) [type=value_error.jsondecode, input_value='{ "label": "tax_notice",...99999999999999999999999', input_type=str]

@cpfiffer
Contributor

cpfiffer commented Sep 10, 2024

Anything I can help with here? Currently getting bit by this issue, using the dev branch.

import outlines
import llama_cpp

# Load the model
model = outlines.models.llamacpp(
    "NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
    "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
        "NousResearch/Hermes-2-Pro-Llama-3-8B"
    ),
    n_gpu_layers=-1,
    n_ctx=8192,
    verbose=False
)

# Request 100 samples from the multinomial sampler
sampler = outlines.samplers.multinomial(samples=100)

# Define the coinflip choice
coinflip_regex_pattern = r"[H|T]"
generator = outlines.generate.choice(
    model,
    ["H", "T"],
    sampler=sampler
)

output = generator("Flip a coin, respond with H or T: ")
print(output)

# Count the occurrences of each outcome
heads_count = output.count("H")
tails_count = output.count("T")

print(f"Heads: {heads_count}, Tails: {tails_count}")

Versions:

```python 3-1==1.0.0 absl-py==2.1.0 accelerate==0.34.2 acme==2.11.0 aiofiles==23.2.1 aiohappyeyeballs==2.4.0 aiohttp==3.10.5 aiolimiter==1.1.0 aiosignal==1.3.1 aiostream==0.5.2 altair==5.3.0 annotated-types==0.7.0 anyio==3.7.1 appdirs==1.4.4 argcomplete==3.4.0 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 args==0.1.0 arrow==1.3.0 asgiref==3.8.1 asttokens==2.4.1 astunparse==1.6.3 async-lru==2.0.4 async-timeout==4.0.3 atom==0.10.5 atomicwrites==1.4.0 attrs==24.2.0 azure-core==1.30.2 azure-identity==1.17.1 azure-storage-blob==12.21.0 b2sdk==2.4.1 Babel==2.15.0 backcall==0.2.0 backoff==2.2.1 backports.tarfile==1.2.0 bcc==0.18.0 bcrypt==4.2.0 beartype==0.18.5 beautifulsoup4==4.12.3 bitsandbytes==0.43.3 black==24.4.2 bleach==6.1.0 blinker==1.8.2 boto3==1.34.149 botocore==1.34.149 boxsdk==3.11.0 Brlapi==0.8.3 bs4==0.0.2 build==1.2.1 cachetools==5.4.0 cairocffi==1.7.1 CairoSVG==2.7.1 certbot==2.11.0 certbot-nginx==2.11.0 certifi==2024.8.30 cffi==1.16.0 chardet==3.0.4 charset-normalizer==3.3.2 chroma-hnswlib==0.7.6 chromadb==0.5.5 chrome-gnome-shell==0.0.0 click==8.1.7 clint==0.5.1 cloudpickle==3.0.0 cmake==3.30.1 cmdstanpy==1.2.4 colorama==0.4.6 coloredlogs==15.0.1 comm==0.2.2 command-not-found==0.3 ConfigArgParse==1.7 configobj==5.0.8 contourpy==1.2.1 coverage==6.2 cryptography==3.4.8 cssselect2==0.7.0 cupshelpers==1.0 cycler==0.12.1 dataclasses-json==0.6.7 datasets==2.21.0 dbus-python==1.2.18 debtcollector==3.0.0 debugpy==1.8.2 decorator==5.1.1 defer==1.0.6 defusedxml==0.7.1 demjson3==3.0.6 Deprecated==1.2.14 deprecation==2.1.0 dill==0.3.8 dirtyjson==1.0.8 discodos==1.0rc2 diskcache==5.6.3 distlib==0.3.8 distro==1.9.0 dm-tree==0.1.8 dnspython==2.6.1 docker==7.1.0 docker-compose==1.29.2 docker-pycreds==0.4.0 dockerpty==0.4.1 docopt==0.6.2 docstring_parser==0.16 docx2txt==0.8 dropbox==12.0.2 duplicity==3.0.0 ecdsa==0.19.0 EditorConfig==0.12.4 einops==0.8.0 email_validator==2.2.0 exceptiongroup==1.2.2 executing==2.0.1 fail2ban==0.11.2 fairscale==0.4.13 
faiss-cpu==1.8.0.post1 fastapi==0.111.1 fastapi-cli==0.0.4 fasteners==0.19 fastjsonschema==2.20.0 fasttext==0.9.3 ffmpy==0.3.2 filelock==3.16.0 fire==0.6.0 FLAML==2.1.2 flatbuffers==24.3.25 fonttools==4.53.1 fqdn==1.5.1 frozenlist==1.4.1 fsspec==2024.6.1 future==1.0.0 gast==0.6.0 gdata-python3==3.0.1 gguf==0.9.1 ghp-import==2.1.0 gitdb==4.0.11 GitPython==3.1.43 google-api-core==2.19.1 google-api-python-client==2.138.0 google-auth==2.32.0 google-auth-httplib2==0.2.0 google-auth-oauthlib==1.2.1 google-pasta==0.2.0 googleapis-common-protos==1.63.2 gotrue==2.7.0 gradio==4.39.0 gradio_client==1.1.1 greenlet==3.0.3 griffe==1.2.0 grpcio==1.65.1 grpclib==0.4.7 h11==0.14.0 h2==4.1.0 h5py==3.11.0 hidpidaemon==18.4.6 hpack==4.0.0 html2text==2024.2.26 html5lib==1.1 httpcore==1.0.5 httpie==3.2.3 httplib2==0.22.0 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.24.6 humanfriendly==10.0 humanize==4.10.0 hyperframe==6.0.1 icontract==2.6.6 idna==3.8 importlib_metadata==8.0.0 importlib_resources==6.4.0 iniconfig==2.0.0 interegular==0.3.3 ip-associations-python-novaclient-ext==0.2 ipykernel==6.29.5 ipython==8.26.0 ipython-genutils==0.2.0 ipywidgets==8.1.3 iso8601==2.1.0 isodate==0.6.1 isoduration==20.11.0 jaraco.classes==3.4.0 jaraco.context==5.3.0 jaraco.functools==4.0.1 jax==0.4.30 jaxlib==0.4.30 jedi==0.19.1 jeepney==0.8.0 Jinja2==3.1.4 jmespath==1.0.1 joblib==1.4.2 josepy==1.14.0 jottalib==0.5.1 jsbeautifier==1.15.1 json5==0.9.25 jsonpointer==3.0.0 jsonschema==4.23.0 jsonschema-specifications==2023.12.1 jupyter==1.0.0 jupyter-console==6.6.3 jupyter-events==0.10.0 jupyter-lsp==2.2.5 jupyter_client==8.6.2 jupyter_core==5.7.2 jupyter_server==2.14.2 jupyter_server_terminals==0.5.3 jupyterlab==4.2.4 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.3 jupyterlab_widgets==3.0.11 keras==3.4.1 kernelstub==3.1.4 keyring==25.2.1 keystoneauth1==2.18.0 kiwisolver==1.4.5 kubernetes==30.1.0 lancedb==0.11.0 language-selector==0.1 lark==1.2.2 launchpadlib==2.0.0 lazr.restfulclient==0.14.6 
lazr.uri==1.0.6 lib==4.0.0 libclang==18.1.1 lit==18.1.8 livereload==2.7.0 llama-cloud==0.0.11 llama-index==0.10.58 llama-index-agent-openai==0.2.9 llama-index-cli==0.1.13 llama-index-core==0.10.58 llama-index-embeddings-azure-openai==0.1.11 llama-index-embeddings-openai==0.1.11 llama-index-indices-managed-llama-cloud==0.2.7 llama-index-legacy==0.9.48 llama-index-llms-azure-openai==0.1.10 llama-index-llms-openai==0.1.27 llama-index-multi-modal-llms-openai==0.1.8 llama-index-program-openai==0.1.7 llama-index-question-gen-openai==0.1.3 llama-index-readers-file==0.1.31 llama-index-readers-llama-parse==0.1.6 llama-parse==0.4.9 llama_cpp_python==0.2.90 llamaindex-py-client==0.1.19 llvmlite==0.43.0 lm-format-enforcer==0.10.6 lockfile==0.12.2 logfury==1.0.1 louis==3.20.0 lxml==5.2.2 macaroonbakery==1.3.4 Markdown==3.6 markdown-it-py==3.0.0 MarkupSafe==2.1.5 marshmallow==3.21.3 matplotlib==3.9.1 matplotlib-inline==0.1.7 mdurl==0.1.2 mediafire==0.6.1 megatools==0.0.4 memgpt==0.1.0 mergedeep==1.3.4 mistral_common==1.3.4 mistune==3.0.2 mkdocs==1.6.1 mkdocs-autorefs==1.2.0 mkdocs-get-deps==0.2.0 mkdocs-git-committers-plugin-2==2.3.0 mkdocs-git-revision-date-localized-plugin==1.2.8 mkdocs-material==9.5.34 mkdocs-material-extensions==1.3.1 mkdocs-mermaid2-plugin==1.1.1 mkdocs-section-index==0.3.9 mkdocstrings==0.26.1 mkdocstrings-python==1.11.1 ml-dtypes==0.4.0 mmh3==4.1.0 mock==5.1.0 modal==0.64.94 monotonic==1.6 more-itertools==10.3.0 mpmath==1.3.0 msal==1.30.0 msal-extensions==1.2.0 msgpack==1.0.8 msgpack-python==0.5.6 msgspec==0.18.6 multidict==6.1.0 multiprocess==0.70.16 musicbrainzngs==0.7.1 mypy-extensions==1.0.0 namex==0.0.8 nbclient==0.10.0 nbconvert==7.16.4 nbformat==5.10.4 nest-asyncio==1.6.0 netaddr==1.3.0 netifaces==0.11.0 networkx==3.3 nltk==3.8.1 notebook==7.2.1 notebook_shim==0.2.4 numba==0.60.0 numpy==1.26.4 nvidia-cublas-cu11==11.11.3.6 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu11==11.8.87 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu11==11.8.89 
nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu11==11.8.89 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu11==9.2.1.18 nvidia-cudnn-cu12==9.1.0.70 nvidia-cufft-cu11==10.9.0.58 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu11==10.3.0.86 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu11==11.4.1.48 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu11==11.7.5.86 nvidia-cusparse-cu12==12.1.0.106 nvidia-ml-py==12.560.30 nvidia-nccl-cu11==2.21.5 nvidia-nccl-cu12==2.20.5 nvidia-nvjitlink-cu12==12.5.82 nvidia-nvtx-cu11==11.8.86 nvidia-nvtx-cu12==12.1.105 oauth2client==4.1.3 oauthlib==3.2.2 olefile==0.46 onnxruntime==1.18.1 openai==1.37.1 opentelemetry-api==1.26.0 opentelemetry-exporter-otlp-proto-common==1.26.0 opentelemetry-exporter-otlp-proto-grpc==1.26.0 opentelemetry-instrumentation==0.47b0 opentelemetry-instrumentation-asgi==0.47b0 opentelemetry-instrumentation-fastapi==0.47b0 opentelemetry-proto==1.26.0 opentelemetry-sdk==1.26.0 opentelemetry-semantic-conventions==0.47b0 opentelemetry-util-http==0.47b0 opt-einsum==3.3.0 optree==0.12.1 orjson==3.10.6 os-diskconfig-python-novaclient-ext==0.1.3 os-networksv2-python-novaclient-ext==0.26 os-virtual-interfacesv2-python-novaclient-ext==0.20 oslo.config==4.12.0 oslo.i18n==3.12.0 oslo.serialization==2.16.1 oslo.utils==3.22.3 outlines @ git+https://github.com/lapp0/outlines.git@dc21915 overrides==7.4.0 packaging==24.1 paginate==0.5.7 pandas==2.2.2 pandocfilters==1.5.0 paramiko==2.9.3 parsedatetime==2.6 parso==0.8.3 partial-json-parser==0.2.1.1.post4 pathspec==0.11.2 pathtools==0.1.2 pbr==1.10.0 perscache==0.6.1 pexpect==4.8.0 pickleshare==0.7.5 pillow==10.4.0 pipx==1.0.0 platformdirs==3.10.0 pluggy==1.3.0 ply==3.11 pop-transition==1.1.2 portalocker==2.8.2 positional==1.2.1 postgrest==0.16.10 posthog==3.3.4 prettytable==0.7.2 prometheus-fastapi-instrumentator==7.0.0 prometheus_client==0.20.0 prompt_toolkit==3.0.47 proto-plus==1.24.0 protobuf==4.23.3 psutil==5.9.5 ptyprocess==0.7.0 pulsar-client==3.4.0 
pure-eval==0.2.2 py==1.11.0 py-cpuinfo==9.0.0 pyairports==2.1.1 pyarrow==17.0.0 pyarrow-hotfix==0.6 pyasn1==0.5.0 pyasn1-modules==0.3.0 pyautogen==0.2.9 pybind11==2.10.4 pycairo==1.20.1 pycountry==24.6.1 pycparser==2.21 pycups==2.0.1 pydantic==2.9.1 pydantic_core==2.23.3 pydbus==0.6.0 PyDrive2==1.20.0 pydub==0.25.1 Pygments==2.18.0 PyGObject==3.42.1 PyICU==2.8.1 pyinotify==0.9.6 PyJWT==2.3.0 pylance==0.15.0 pyliblo==0.10.0 pymacaroons==0.13.0 pymdown-extensions==10.9 pymemgpt==0.3.5 PyMuPDF==1.23.26 PyMuPDFb==1.23.22 PyNaCl==1.5.0 pyOpenSSL==21.0.0 pyparsing==2.4.7 pypdf==4.3.1 PyPika==0.48.9 pyproject_hooks==1.0.0 PyQt5==5.15.6 PyQt5-sip==12.9.1 pyrax==1.10.0 pyre-extensions==0.0.29 pyRFC3339==1.1 pyrsistent==0.18.1 PySocks==1.7.1 pytest==7.4.0 python-apt==2.4.0+ubuntu3 python-box==7.1.1 python-dateutil==2.9.0.post0 python-debian==0.1.43+ubuntu1.1 python-dotenv==0.19.2 python-gettext==5.0 python-gnupg==0.4.8 python-json-logger==2.0.7 python-keystoneclient==3.10.0 python-multipart==0.0.9 python-novaclient==2.27.0 python-swiftclient==4.6.0 python-xlib==0.29 python3-discogs-client==2.3.5 pytz==2024.1 pyxdg==0.27 PyYAML==6.0.2 pyyaml_env_tag==0.1 pyzmq==25.1.1 qtconsole==5.4.3 QtPy==2.4.0 questionary==2.0.1 rackspace-auth-openstack==1.3 rackspace-novaclient==2.1 ratelimiter==1.2.0.post0 rax-default-network-flags-python-novaclient-ext==0.4.0 rax-scheduled-images-python-novaclient-ext==0.3.1 ray==2.35.0 rdflib==6.1.1 realtime==2.0.1 referencing==0.35.1 regex==2023.6.3 repolib==2.2.1 repoman==1.4.0 requests==2.32.3 requests-oauthlib==1.3.1 requests-toolbelt==0.9.1 retry==0.9.2 rfc3339-validator==0.1.4 rfc3986==2.0.0 rfc3986-validator==0.1.1 rich==13.6.0 rpds-py==0.20.0 rsa==4.9 ruff==0.5.5 s3transfer==0.10.2 safetensors==0.4.5 scipy==1.11.1 screen-resolution-extra==0.0.0 SecretStorage==3.3.1 semantic-version==2.10.0 semver==3.0.2 Send2Trash==1.8.2 sentencepiece==0.2.0 sentry-sdk==1.30.0 sessioninstaller==0.0.0 setproctitle==1.3.2 shellingham==1.5.4 sigtools==4.0.1 
simplejson==3.19.2 six==1.16.0 smmap==5.0.0 sniffio==1.3.0 soupsieve==2.4.1 SPARQLWrapper==1.8.5 speedtest-cli==2.1.3 SQLAlchemy==2.0.25 sqlalchemy-json==0.7.0 sqlmodel==0.0.16 ssh-import-id==5.11 stack-data==0.6.2 stanio==0.5.1 starlette==0.37.2 stevedore==1.20.1 stone==3.3.1 storage3==0.7.7 StrEnum==0.4.15 striprtf==0.0.26 supafunc==0.5.1 sympy==1.12 synchronicity==0.7.6 systemd-python==234 tabulate==0.8.9 tenacity==8.2.3 tensorboard==2.12.3 tensorboard-data-server==0.7.1 tensorflow-estimator==2.12.0 tensorflow-io-gcs-filesystem==0.32.0 tensorflow-probability==0.20.1 termcolor==2.3.0 terminado==0.17.1 texttable==1.6.4 tiktoken==0.7.0 tinycss2==1.2.1 tlslite-ng==0.7.6 tokenizers==0.19.1 toml==0.10.2 tomli==2.0.1 tomlkit==0.12.0 toolz==0.12.0 torch==2.4.0 torchaudio==2.0.2+cu118 torchvision==0.19.0 tornado==6.3.3 tqdm==4.66.5 traitlets==5.14.3 transformers==4.44.2 triton==3.0.0 typer==0.12.3 types-certifi==2021.10.8.3 types-python-dateutil==2.9.0.20240316 types-toml==0.10.8.20240310 typing-inspect==0.9.0 typing_extensions==4.12.2 tzdata==2024.1 ubuntu-drivers-common==0.0.0 ubuntu-pro-client==8001 ufw==0.36.1 uri-template==1.3.0 uritemplate==4.1.1 urllib3==2.2.2 userpath==1.8.0 uuid7==0.1.0 uvicorn==0.23.2 uvloop==0.19.0 virtualenv==20.25.0 vllm==0.6.0 vllm-flash-attn==2.6.1 voyageai==0.2.3 wadllib==1.3.6 wandb==0.15.5 watchdog==4.0.1 watchfiles==0.21.0 wcwidth==0.2.6 webcolors==24.6.0 webdavclient3==3.14.6 webencodings==0.5.1 websocket-client==1.8.0 websockets==11.0.3 Werkzeug==2.3.6 widgetsnbextension==4.0.11 wrapt==1.14.1 xdg==5 xformers==0.0.27.post2 xkit==0.0.0 xxhash==3.5.0 yarl==1.11.1 zipp==1.0.0 zope.component==4.3.0 zope.event==4.4 zope.hookable==5.1.0 zope.interface==5.4.0 ```

@lapp0
Contributor

lapp0 commented Sep 15, 2024

@cpfiffer Outlines' llama.cpp integration doesn't support multiple samples at once. Could you try again with samples=1 (or just don't explicitly set the sampler)?

generator = outlines.generate.choice(
    model,
    ["H", "T"],
)

If there's a problem with generate.choice could you also open a separate issue?
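
If many samples are still wanted, one possible workaround is to call a single-sample generator repeatedly and tally the results. The sketch below assumes the generator returns one string per call when no multi-sample sampler is set; `collect_samples` and `fake_generator` are hypothetical helpers, and `fake_generator` stands in for `outlines.generate.choice(model, ["H", "T"])` so the example runs without a model.

```python
import random
from collections import Counter
from typing import Callable


def collect_samples(generate_one: Callable[[str], str], prompt: str, n: int) -> Counter:
    """Call a single-sample generator n times and tally the outputs."""
    return Counter(generate_one(prompt) for _ in range(n))


def fake_generator(prompt: str) -> str:
    # Stand-in for the real generator; picks H or T at random.
    return random.choice(["H", "T"])


counts = collect_samples(fake_generator, "Flip a coin, respond with H or T: ", 100)
print(f"Heads: {counts['H']}, Tails: {counts['T']}")
```

With a real model, `fake_generator` would be replaced by the generator object itself, at the cost of one model call per sample.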

@PierreCarceller Can you please share your schema?

The original issue here seems to be resolved by the fix-json branch, except that it reuses logits processors. A new logits processor must be created for each run. This is a separate bug in generate.json / outlines.processors which should be addressed as part of this issue.
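
To illustrate why a stateful processor should not be shared between runs, here is a toy sketch. `ToyProcessor`, `run`, and `make_processor` are hypothetical and are not the Outlines classes; the example only shows the general pattern of leftover state versus a fresh instance per run.

```python
from typing import Optional


class ToyProcessor:
    """Toy stand-in for a stateful logits processor (not the Outlines class)."""

    def __init__(self, target: str) -> None:
        self.target = target
        self.position = 0  # advances with every emitted character

    def next_char(self) -> Optional[str]:
        # Return the next allowed character, or None once the target is done.
        if self.position >= len(self.target):
            return None
        char = self.target[self.position]
        self.position += 1
        return char


def run(processor: ToyProcessor) -> str:
    """Drive the processor until it reports completion."""
    out = []
    while (char := processor.next_char()) is not None:
        out.append(char)
    return "".join(out)


# Reusing one instance: the second run starts from leftover state.
shared = ToyProcessor('{"a": 1}')
first = run(shared)
second = run(shared)

# Creating a new instance per run keeps each run independent.
def make_processor() -> ToyProcessor:
    return ToyProcessor('{"a": 1}')

fresh_results = [run(make_processor()) for _ in range(2)]
print(repr(first), repr(second), fresh_results)
```

In this toy, the second run on the shared instance produces nothing because the state was never reset, while the factory version yields the full string each time.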

@cpfiffer
Contributor

I believe the choice issue is in #1109.

Can't believe I missed the multiple samples error.

> @cpfiffer Outlines' llama.cpp integration doesn't support multiple samples at once. Could you try again with samples=1 (or just don't explicitly set the sampler)

works.
