
[Feature request] low level cpp & python API for generation #51

Open
UranusSeven opened this issue Jul 13, 2023 · 3 comments

Comments

@UranusSeven

Currently, chatglm.cpp provides a Pipeline class for users. Pipeline provides a method called chat, which handles the system prompt, chat history, output formatting, and more.

This is awesome. But for better flexibility, a low-level generate API is also needed to integrate chatglm.cpp with other systems such as text-generation-webui, LangChain, and Xinference.

Here are some detailed needs for the method generate (in Python):

from typing import List, Optional, TypedDict, Union


class Completion(TypedDict):
    text: str
    index: int
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    finish_reason: Optional[str]  # "stop" when eos is met, "length" when max_tokens is reached


def generate(
    self,
    prompt: str,  # the full prompt passed directly to the model
    stop: Optional[Union[str, List[str]]] = None,  # stop words
    max_tokens: int = 128,
    temperature: float = 0.8,
    top_p: float = 0.95,
    top_k: int = 40,
    **other_generate_kwargs,
) -> Completion:
    pass
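
For illustration, a hypothetical call against this proposed API could look like the sketch below. The generate method does not exist in chatglm.cpp yet, the model path is a placeholder, and the prompt format is only illustrative:

import chatglm_cpp

# Hypothetical usage of the proposed low-level API; "chatglm-ggml.bin" is a placeholder path.
pipeline = chatglm_cpp.Pipeline("chatglm-ggml.bin")
completion = pipeline.generate(
    prompt="[Round 0]\n问：你好\n答：",
    stop=["[Round"],
    max_tokens=64,
    temperature=0.8,
)
print(completion["text"])
print(completion["finish_reason"])  # "stop" if eos was met, "length" if max_tokens was reached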
@UranusSeven
Author

@li-plus Hi, do you have any comments on this issue? I'd be glad to implement it if you don't mind :)

@li-plus
Owner

li-plus commented Jul 17, 2023

@UranusSeven Hi, sorry for the late reply. Integration into popular frontends is indeed one of the TODOs. I just read the code of llama-cpp-python: https://github.com/abetlen/llama-cpp-python/blob/6d8892fe64ca7eadd503ae01f93fbcd9ff3806dd/llama_cpp/llama.py#L1229-L1251. How about we call it create_completion instead of generate to align with them? It'll be awesome if you could help implement it. If you don't have time, I'll handle it. Anyway, thanks for your advice!
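
Whichever name is chosen, an implementation will need the stop-word and finish_reason handling described in the proposal above. A minimal, self-contained sketch of that piece; the helper name and its structure are assumptions, not part of chatglm.cpp or llama-cpp-python:

from typing import List, Optional, Tuple, Union


def resolve_finish(
    text: str,
    completion_tokens: int,
    max_tokens: int,
    stop: Optional[Union[str, List[str]]] = None,
) -> Tuple[str, str]:
    """Trim the output at the earliest stop word and report why generation ended."""
    stop_words = [stop] if isinstance(stop, str) else list(stop or [])
    hits = [pos for pos in (text.find(word) for word in stop_words) if pos != -1]
    if hits:
        return text[: min(hits)], "stop"  # a stop word ended generation
    if completion_tokens >= max_tokens:
        return text, "length"  # token budget exhausted
    return text, "stop"  # the model emitted its eos token

For example, resolve_finish("你好！[Round 1]", 12, 128, stop=["[Round"]) would return ("你好！", "stop").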

@UranusSeven
Author

@li-plus Totally agree! Having a similar interface to llama-cpp-python is awesome!
I'll try to resolve this issue this weekend :)
