include total "num_slots" in default_generation_settings_for_props #5349

jparkerweb · 2024-02-05T18:46:13Z

z80maniac · 2024-02-06T17:25:03Z

I think it would've been better to add this field to the root object, not to the default_generation_settings.

IMHO num_slots doesn't have anything to do with "generation settings", and, more importantly, the docs say:

> default_generation_settings - [...] has the same fields as the generation_settings response object from the /completion endpoint.

But this PR breaks this promise. Now this num_slots param is basically undocumented.

The other way to fix this is to add num_slots to the generation_settings response object of the /completion endpoint. And, in any case, this param needs to be mentioned somewhere in the docs.

jparkerweb · 2024-02-06T19:47:35Z

@z80maniac My goal was to make the total number of configured slots available from the /props endpoint. My application uses a Redis queue to cap the number of simultaneous requests at what the target llama.cpp server can accept.

how about removing the slot total from default_generation_settings, but rather add it directly to the /props json response like this?:

    json data = {
        { "user_name",      llama.name_user.c_str() },
        { "assistant_name", llama.name_assistant.c_str() },
        { "default_generation_settings", llama.default_generation_settings_for_props },
        { "total_slots",    llama.params.n_parallel } // Add this line to include the total number of slots
    };
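
For context on the use case above, here is a minimal client-side sketch of how a caller might use the reported slot total to cap in-flight requests. `SlotLimiter` is a hypothetical helper, not part of llama.cpp, and the sketch assumes the `total_slots` value has already been read from the /props response; the Redis queue itself is not shown.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical client-side helper (not part of llama.cpp): caps the number
// of in-flight requests at the slot total reported by the server.
class SlotLimiter {
public:
    // total_slots is assumed to come from the "total_slots" field of /props.
    explicit SlotLimiter(std::size_t total_slots) : total_(total_slots) {}

    // Claims a slot if one is free. Returns false when every slot is busy,
    // in which case the caller should leave the request in its queue.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mu_);
        if (in_flight_ >= total_) return false;
        ++in_flight_;
        return true;
    }

    // Frees a slot once a request has completed.
    void release() {
        std::lock_guard<std::mutex> lock(mu_);
        if (in_flight_ > 0) --in_flight_;
    }

    // Number of requests currently holding a slot.
    std::size_t in_flight() const {
        std::lock_guard<std::mutex> lock(mu_);
        return in_flight_;
    }

private:
    const std::size_t total_;
    std::size_t in_flight_ = 0;
    mutable std::mutex mu_;
};
```

A worker would call `try_acquire()` before forwarding a queued request to the server and `release()` when the response arrives; a failed `try_acquire()` means the request stays queued.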

jparkerweb · 2024-02-06T20:03:16Z

@z80maniac updated in a new PR: #5373

z80maniac · 2024-02-06T21:39:49Z

> how about removing the slot total from default_generation_settings, but rather add it directly to the /props json response

Yes, that's what I meant by:

> add this field to the root object, not to the default_generation_settings

include total "num_slots" in default_generation_settings_for_props

034403d

ggerganov approved these changes Feb 6, 2024


ggerganov merged commit 8a79c59 into ggerganov:master Feb 6, 2024
53 checks passed

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

server : include total "num_slots" in props endpoint (ggerganov#5349)

d7ca5d5

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

server : include total "num_slots" in props endpoint (ggerganov#5349)

98bd95f
