Oobabooga support for local generations? #2

Open
hexive opened this issue Aug 13, 2024 · 5 comments

hexive commented Aug 13, 2024

Any chance you'd consider adding Oobabooga support as a provider?
https://github.com/oobabooga/text-generation-webui

They have an API that is "drop in compatible" with OpenAI so maybe it wouldn't be too much work?
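
For reference, a minimal sketch of what that compatibility looks like, assuming the WebUI is running with its API enabled on the default port 5000 (the payload just follows the standard OpenAI chat-completions shape; the prompt text and max_tokens value are placeholders, not anything from Retrochat):

import requests  # assumed dependency, not part of Retrochat

# Hypothetical standalone call to the OpenAI-compatible endpoint.
resp = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Hello from Retrochat!"}],
        "max_tokens": 200,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])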

Retrochat is a super fun tool. Thanks for putting it out there. I'm using it mostly with ollama, but there are some models that run better and faster locally with ooba.

@DefamationStation DefamationStation self-assigned this Aug 14, 2024

DefamationStation commented Aug 14, 2024

Can you please test if it works? I don't have that provider installed myself.

Make sure you try a few features, like /load and @ to test RAG, and check whether the markdown syntax renders properly, then let me know so I can push it to the main script!

https://github.com/DefamationStation/Retrochat-v2/blob/main/retrochat_oogabooga.py

it looks at http://127.0.0.1:5000


hexive commented Aug 15, 2024

ooh this is awesome! thank you!

On a quick test just now, I was able to connect through the API on localhost:5000 just fine, and it looks like output is being generated by ooba, but nothing is being displayed on the Retrochat side.

I can dig deeper into the oobaboogaChatSession class tonight and see if I can figure out what's going on with that.


hexive commented Aug 15, 2024

I'm not a programmer, but adding these lines to the ChatSession class got the output to print in Retrochat

async for line in response.content:
    if line:
        try:
            decoded_line = line.decode('utf-8').strip()
            if decoded_line.startswith('data: '):
                json_str = decoded_line[6:]  # Remove 'data: ' prefix
                if json_str != '[DONE]':
                    response_json = json.loads(json_str)
                    message_content = response_json['choices'][0]['delta'].get('content', '')
                    if message_content:
                        complete_message += message_content
                        yield message_content  # Yield each chunk for streaming
        except json.JSONDecodeError:
            continue

That's from Claude, haha. Does that make any sense to you?

Although, for a reason I don't understand, this prints asynchronously and then prints a duplicate of the whole text again at the end.
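
For what it's worth, here is a self-contained sketch of consuming the same kind of OpenAI-style SSE stream end to end (aiohttp assumed; the endpoint, payload fields, and function name are assumptions based on the snippet above, not the actual Retrochat code). The duplicate text might come from the caller printing the streamed chunks and then also printing the accumulated complete_message afterwards:

import asyncio
import json

import aiohttp

API_URL = "http://127.0.0.1:5000/v1/chat/completions"  # default --api port

async def stream_chat(prompt: str) -> str:
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    complete_message = ""
    async with aiohttp.ClientSession() as session:
        async with session.post(API_URL, json=payload) as response:
            async for line in response.content:
                decoded_line = line.decode("utf-8").strip()
                if not decoded_line.startswith("data: "):
                    continue
                json_str = decoded_line[6:]  # drop the 'data: ' prefix
                if json_str == "[DONE]":
                    break
                try:
                    chunk = json.loads(json_str)
                except json.JSONDecodeError:
                    continue
                content = chunk["choices"][0]["delta"].get("content", "")
                if content:
                    # Print each chunk exactly once as it arrives; printing the
                    # accumulated message again afterwards would duplicate it.
                    print(content, end="", flush=True)
                    complete_message += content
    print()
    return complete_message

if __name__ == "__main__":
    asyncio.run(stream_chat("Hello!"))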


DefamationStation commented Aug 17, 2024

I'm trying to implement it properly. I am running the Text Generation WebUI, but I'm not sure how to make calls to it or even run it as a server. Could you please provide some additional information so I can get it up and running and test it?
[screenshot of the Text Generation WebUI]
I can open the WebUI and run models just fine on it.


hexive commented Aug 17, 2024

Oh sure, you probably just need to start the program with the --api flag. That opens the API on the default port :5000, and that's how I was able to successfully send and receive with Retrochat in my tests.

So however you are launching the WebUI, just add --api at the end of the command.

Edit to add: make sure you have a model loaded in the WebUI, or you'll get some strange errors when you start the chat in Retrochat.
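
As a quick sanity check (an assumption on my part, not something built into Retrochat), you can hit the OpenAI-style model listing route to confirm the API is reachable before starting a chat; whether it reports the loaded model or every model in the models folder may vary by WebUI version:

import requests  # assumed dependency

try:
    r = requests.get("http://127.0.0.1:5000/v1/models", timeout=5)
    r.raise_for_status()
    models = [m["id"] for m in r.json().get("data", [])]
    print("API reachable; models:", models or "none reported")
except requests.RequestException as exc:
    print("API not reachable - did you launch the WebUI with --api?", exc)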
