
Add support for users to specify custom request settings, model and optionally provider specific #14535

Merged (12 commits into master, Nov 28, 2024)

Conversation

@JonasHelming (Contributor) commented on Nov 26, 2024

What it does

Add support for users to specify custom request settings, per model and optionally per provider.
The reason for making them provider-specific is that providers have different options and sometimes even different names for the same setting (see below).

How to test

Add the settings below and adapt them to the models you have:

  • Qwen/Qwen2.5-Coder-32B-Instruct is always "warm" on serverless Hugging Face
  • StarCoder2.3B can be downloaded here: https://huggingface.co/Mozilla/starcoder2-llamafile/tree/main
  • gemma2 can be downloaded directly with Ollama (ollama serve and ollama run gemma2)

Two good test cases:

  • set the maximum output length (max_new_tokens, num_predict, n_predict, or max_tokens, depending on the provider) to something small and observe that the model stops early
  • set a stop token and ask the model to say it :-)
"ai-features.modelSettings.requestSettings": [
        {
            "modelId": "Qwen/Qwen2.5-Coder-32B-Instruct",
            "requestSettings": {
                "max_new_tokens": 2048,
                "stop": [
                    "<|im_end|>"
                ]
            },
            "providerId": "huggingface"
        },
        {
            "modelId": "gemma2",
            "requestSettings": {
                "num_predict": 1024,
                "stop": [
                    "<|endoftext|>"
                ]
            },
            "providerId": "ollama"
        },
        {
            "modelId": "StarCoder2.3B",
            "requestSettings": {
                "n_predict": 200,
                "stream": true,
                "stop": [
                    "<file_sep>",
                    "<|endoftext|>"
                ],
                "cache_prompt": true
            },
            "providerId": "llamafile"
        },
        {
            "modelId": "gpt-4o-2024-05-13",
            "requestSettings": {
                "max_tokens": 10,
                "stop": [
                    "<|im_end|>"
                ]
            },
            "providerId": "openai"
        }
    ]
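
For illustration, here is a minimal TypeScript sketch of how an integration could pick the matching entry for a model from such a configuration; the interface and function names are assumptions made for this example, not the actual Theia API:

// Hypothetical shape of one entry of "ai-features.modelSettings.requestSettings".
interface RequestSettingEntry {
    modelId: string;
    requestSettings: Record<string, unknown>;
    providerId?: string;
}

// Resolve the request settings for a model: an entry that also names the
// provider wins over a generic entry that only matches the model id.
function resolveRequestSettings(
    entries: RequestSettingEntry[],
    modelId: string,
    providerId: string
): Record<string, unknown> {
    const providerSpecific = entries.find(e => e.modelId === modelId && e.providerId === providerId);
    const generic = entries.find(e => e.modelId === modelId && e.providerId === undefined);
    return (providerSpecific ?? generic)?.requestSettings ?? {};
}

For example, resolveRequestSettings(entries, 'gemma2', 'ollama') would return the num_predict and stop values from the settings above.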

Follow-ups

This should be the last thing we add to the provider layer before we refactor it all together:

  • Make the settings more consistent
  • Remove a lot of code duplication


@JonasHelming requested a review from @sdirix on November 26, 2024 at 20:44
@JonasHelming (Contributor, Author) commented:

@dhuebner Could you check the Ollama adaptations please?

@JonasHelming changed the title from "Gh 14526" to "Add support for users to specify custom request settings, model and optionally provider specific" on Nov 26, 2024
@JonasHelming (Contributor, Author) commented:

See this for the documentation: eclipse-theia/theia-website#662

@dhuebner (Member) commented:

@JonasHelming
Sure, will do.

@dhuebner (Member) commented on Nov 27, 2024

@JonasHelming
Ollama works as expected!

@sdirix
I have a question on another topic: would it be possible to mark a LanguageModelRequest with an explicit flag that indicates whether it is a chat or a completion request? With this information, model implementors could decide whether to use a streaming or a text response.

@JonasHelming (Contributor, Author) commented:

> @sdirix
> I have a question on another topic: would it be possible to mark a LanguageModelRequest with an explicit flag that indicates whether it is a chat or a completion request? With this information, model implementors could decide whether to use a streaming or a text response.

@dhuebner I have also already thought about this. Would you mind creating a new ticket and mentioning me there?
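
For illustration only, a hypothetical TypeScript sketch of what such a flag could look like; this is not part of this PR, and the type and property names are assumptions:

// Hypothetical: an optional request kind so model integrations can decide
// between a streaming response (chat) and a plain text response (completion).
type LanguageModelRequestKind = 'chat' | 'completion';

interface LanguageModelRequestWithKind {
    // ...the existing LanguageModelRequest properties would stay unchanged
    kind?: LanguageModelRequestKind;
}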

@sdirix (Member) left a review comment:

Thanks for the work ❤️! I found some inconsistencies which should be fixed before we merge.

@JonasHelming (Contributor, Author) commented:

Thank you for the great review. I tried to address all comments (in individual commits) and tested all providers again in the final state.

@JonasHelming requested a review from @sdirix on November 28, 2024 at 00:24
@sdirix (Member) left a review comment:

Works for me. I tested with all model integrations we offer.

Thanks for the great work ❤️

@JonasHelming merged commit 1301b45 into master on Nov 28, 2024
11 checks passed
The github-actions bot added this to the 1.56.0 milestone on Nov 28, 2024