
Server: various fixes for the prompt field in /completion #5300

Merged · 1 commit · Feb 6, 2024

Conversation

Niall- (Contributor) commented Feb 3, 2024

The two main problems were:

  • split_multiprompt_task() was completely stalling server slots whenever the prompt array contained any numbers (tokens); the git commit calls this a deadlock, but that is probably the wrong term.
  • prompt arrays were always treated as multiple separate prompts, regardless of whether they contained numbers.

Either of these issues prevents tokens from being used in the prompt array.
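For reference, here is a minimal Python sketch of the dispatch behaviour these fixes aim for; the real logic is C++ in the server code, and the helper below is purely illustrative:

```python
from typing import Union

def classify_prompt(prompt: Union[str, list]) -> str:
    """Illustrative sketch of how the "prompt" field should be dispatched.

    A bare string, or an array containing any token IDs, is ONE prompt;
    an array of only strings is split into separate prompts.
    """
    if isinstance(prompt, str):
        return "single prompt (plain string)"
    if isinstance(prompt, list):
        if any(isinstance(el, int) for el in prompt):
            # Mixed strings and token IDs: treat as one prompt, with the
            # strings tokenized and concatenated with the raw token IDs.
            return "single prompt (mixed strings/tokens)"
        return "multiple prompts (array of strings)"
    raise ValueError("unsupported prompt type")

assert classify_prompt("hello") == "single prompt (plain string)"
assert classify_prompt(["hello", "hi"]) == "multiple prompts (array of strings)"
assert classify_prompt([4160, "the following sentence: hello"]) == "single prompt (mixed strings/tokens)"
```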

Commit 48c857a also introduced a bug by removing the return before split_multiprompt_task(), which caused an unused and unnecessary extra generation over all of the prompt array's elements combined into a single prompt.

Finally, an unnecessary assert was removed. The assert wouldn't actually abort unless #include <cassert> was added, and adding it seemed unnecessary, so it was changed to a send_error() instead. The code should never be hit, but if it is, the resulting response will be a 404 (as per the code that follows it).

Tested the following prompts with curl -w "\n" --request POST --url http://localhost:8080/completion -d '{"n_predict":24}', with each "prompt" entry below added to the JSON body (one of them is spelled out as a full request after the list):

"prompt":"hello"
"prompt":["hello"]
"prompt":["hello", "hi"]
"prompt":["complete",272,2296,12271,28747,6312,28709]
"prompt":[4160, "the following sentence: hello"]
"prompt":[4160,272,2296,12271,28747,6312,28709]

Not tested: /infill with no prompt. cf. #4027 - it's segfaulting for me even with no changes. I don't think these changes will reintroduce that bug.

server : fix deadlock when prompt array contains strings and numbers

server : removed an unnecessary generation when generating multi-prompts

server : removed an unnecessary assert
Green-Sky (Collaborator) commented:

Can you also test images as part of the array? (AFAIK it's "[img-NUM]" in the prompt plus the image data in another JSON entry.)

jxy (Contributor) commented Feb 3, 2024

I wanted the old behavior, so I just commented out the lines in my local build. We were discussing how to deal with the conflicting intents of using the same prompt format, with arrays meaning either multiple prompts or a single concatenated array prompt; see #4476.

It feels less intuitive if the treatment of a prompt array depends on the element types of the array. I'm not sure which is better: a two-level nested array, or separate names ("prompt" and "prompts").
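To make the two options concrete, here is a hypothetical sketch of each request shape; neither is implemented, this just illustrates the design question:

```python
# Option A: two-level nested array. The outer array lists the prompts;
# each inner array is one prompt's string/token pieces, concatenated.
nested = {
    "prompt": [
        ["complete the following sentence: ", 6312, 28709],
        ["another, separate prompt"],
    ],
    "n_predict": 24,
}

# Option B: separate field names. "prompt" always means one prompt
# (string, tokens, or a mix); "prompts" always means several.
separate = {
    "prompts": [
        "complete the following sentence: hello",
        "another, separate prompt",
    ],
    "n_predict": 24,
}
```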

@ggerganov ggerganov added the need feedback Testing and feedback with results are needed label Feb 5, 2024
Niall- (Contributor, Author) commented Feb 5, 2024

After spending some time testing llava (-m llava-v1.5-7b/ggml-model-q8_0.gguf --mmproj llava-v1.5-7b/mmproj-model-f16.gguf), I don't think this PR changes any behaviour with regard to images, but I can't be sure. I tested everything mentioned in this post both with and without the patch, and the issues mentioned exist in both cases. As for what happens when there are images in the prompt array: if the array contains only strings, the prompts get processed by split_multiprompt_task(), which feeds them back into request_completion() to be processed as single prompts via queue_tasks.post(task).
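A rough Python sketch of that fan-out, with stand-ins for the server's task queue (the names are borrowed from the C++ code, the behaviour is heavily simplified):

```python
import queue

queue_tasks = queue.Queue()  # stand-in for the server's task queue

def request_completion(prompt):
    # In the real server this schedules generation on a slot; here we
    # just enqueue the prompt to show the control flow.
    if isinstance(prompt, list) and all(isinstance(p, str) for p in prompt):
        split_multiprompt_task(prompt)
        # Early return, analogous to the one this PR restores: no extra
        # generation over the combined array elements.
        return
    queue_tasks.put(prompt)

def split_multiprompt_task(prompts):
    # Each string is fed back through request_completion() as its own
    # single-prompt task.
    for p in prompts:
        request_completion(p)

request_completion(["hello", "hi"])
while not queue_tasks.empty():
    print("task:", queue_tasks.get())  # prints "hello", then "hi"
```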

What works: "prompt": "[img-12] Describe the following image", "image_data": [{"data": enc(dog), "id": 12}]
A single image prompt, along with all the tests mentioned previously.
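As a concrete version of the working case, a sketch in Python, assuming enc() in the examples above means base64-encoding the raw image bytes:

```python
import base64
import requests

# Base64-encode the image, matching the enc(...) shorthand above
# (assumed here to be base64 of the raw file bytes).
with open("dog.jpg", "rb") as f:  # hypothetical local file
    dog_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "[img-12] Describe the following image",
        "image_data": [{"data": dog_b64, "id": 12}],
        "n_predict": 24,
    },
)
print(resp.json()["content"])
```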

What doesn't work: "prompt": [12345, 31232, "complete the following sentence: hello ", "[img-12] Describe the following image", "[img-13] Describe the following image"], "image_data": [{"data": enc(dog), "id": 12}, {"data": enc(cat), "id": 13}]
A prompt with tokens and an "image_data" key will stall the current slot, and you won't get a response. This patch didn't introduce this behaviour; it fails with or without the [img-12] tag, and with just a single object in the image_data key.

Where things get murky: "prompt": ["[img-12] Describe the following image", "[img-13] Describe the following image"], "image_data": [{"data": enc(dog), "id": 12}, {"data": enc(cat), "id": 13}]
After spending an inordinate amount of time testing this, it appears that an array of prompts describing separate images tends to merge the images and their "content" responses together, for lack of a better word. This patch also didn't introduce this behaviour.

Here's an example of what I mean, with llama.cpp before this PR:

"content": ": A cat is sitting on a white staircase, looking out over the city.in this urban setting, [...]",
"content": ":\n\nThe image depicts a city scene with a large cat laying on the edge of a white staircase. [...]",

After this PR:

"content": ":\nThe picture features a city landscape with tall buildings and skyscrapers surrounding the area. The central focus of this scene is a cat [...]",
"content": ". Cats and buildings are the main subjects of this picture. [...]",

The two images used were a cat and a city skyline. There may still be some similarity between the images, as the cat is on white stairs; however, after dozens of prompts on both builds, with and without the patch, I don't think it matters. Initially I was testing with a picture of a dog rather than a city skyline, which was incredibly frustrating, as the build without the patch would occasionally distinguish between the two. If anybody else is going to repeat this test, I strongly suggest using two unambiguously different images.

To make matters even worse, I added a regular prompt without an [img-id] tag into the mix: "prompt": ["complete the following sentence: hello", "[img-12] Describe the following image", "[img-13] Describe the following image"], "image_data": [{"data": enc(dog), "id": 12}, {"data": enc(cat), "id": 13}]
The result before this patch:

"content": ",\n\nThe cat is lying on the white wall of a building in an urban environment.",
"prompt": ""

"content": " in the most descriptive manner you can: A cat is laying on a white staircase, with buildings and an orange sky [...]",
"prompt": ""

"content": ". \n\nA cat is lounging on a white concrete step, [...] The setting suggests that it's an urban environment with buildings in the background. [...]",
"prompt": ""

After patch:

"content": " world.Љ",
"prompt": ""

"content": ".Ћ\n\nThe city is filled with skyscrapers and towers [...] A large cat is sitting at the bottom edge of a white wall [...]",
"prompt": ""

"content": ":\n\nIn the image, a large cat is lounging [...] \n\nAdditionally, there are multiple skyscrapers visible [...]",
"prompt": ""

The empty "prompt" response here is indicating that the initial prompt is using image data. This becomes clear if you test just a single prompt with and without image_data.

Some things that may have complicated my testing are that I'm using ROCm, that I was using the same prompt (with separate image ids) for both of the prompts asking it to describe images, and that I was using q8 for the language model and f16 for the image model. I only noticed the latter two issues after the fact. I did briefly test different prompts ("what does this image describe?" and "describe this image"), along with using f16 for the language model too, but neither made a difference.

jxy (Contributor) commented Feb 5, 2024

I don't think this PR changed any image-related behavior either. The parsing and processing of [img-ID] happens at a different stage from where the multi-prompt/array-prompt processing happens, and [img-ID] processing was not compatible with prompt arrays to begin with. It would be nice to be able to fit all these pieces together nicely, though.

@ggerganov ggerganov merged commit 4ffc7a1 into ggerganov:master Feb 6, 2024
52 of 53 checks passed
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024