
Add support for users to specify custom request settings, model and optionally provider specific #14535

Merged (12 commits into master, Nov 28, 2024)

Conversation

@JonasHelming (Contributor) commented on Nov 26, 2024

What it does

Add support for users to specify custom request settings, per model and optionally per provider.
The reason for making them provider-specific is that providers have different options and sometimes even different names for the same setting (see below).

How to test

Add the settings below and adapt them to the models you have:

  • Qwen/Qwen2.5-Coder-32B-Instruct is always "warm" on serverless Hugging Face
  • StarCoder2.3B can be downloaded here: https://huggingface.co/Mozilla/starcoder2-llamafile/tree/main
  • gemma2 can be downloaded directly with Ollama (ollama serve and ollama run gemma2)

Two good test cases:

  • set the maximum output length (max_new_tokens, num_predict, n_predict, or max_tokens, depending on the provider) to something small and observe that the model stops early
  • set a stop token and ask the model to say it :-)
"ai-features.modelSettings.requestSettings": [
        {
            "modelId": "Qwen/Qwen2.5-Coder-32B-Instruct",
            "requestSettings": {
                "max_new_tokens": 2048,
                "stop": [
                    "<|im_end|>"
                ]
            },
            "providerId": "huggingface"
        },
        {
            "modelId": "gemma2",
            "requestSettings": {
                "num_predict": 1024,
                "stop": [
                    "<|endoftext|>"
                ]
            },
            "providerId": "ollama"
        },
        {
            "modelId": "StarCoder2.3B",
            "requestSettings": {
                "n_predict": 200,
                "stream": true,
                "stop": [
                    "<file_sep>",
                    "<|endoftext|>"
                ],
                "cache_prompt": true
            },
            "providerId": "llamafile"
        },
        {
            "modelId": "gpt-4o-2024-05-13",
            "requestSettings": {
                "max_tokens": 10,
                "stop": [
                    "<|im_end|>"
                ]
            },
            "providerId": "openai"
        }
    ]
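
For illustration, here is a minimal TypeScript sketch of how an integration could pick the matching entry for a model from such a configuration; the interface and function names are assumptions made for this example, not the actual Theia API:

// Hypothetical shape of one entry of "ai-features.modelSettings.requestSettings".
interface RequestSettingEntry {
    modelId: string;
    requestSettings: Record<string, unknown>;
    providerId?: string;
}

// Resolve the request settings for a model: an entry that also names the
// provider wins over a generic entry that only matches the model id.
function resolveRequestSettings(
    entries: RequestSettingEntry[],
    modelId: string,
    providerId: string
): Record<string, unknown> {
    const providerSpecific = entries.find(e => e.modelId === modelId && e.providerId === providerId);
    const generic = entries.find(e => e.modelId === modelId && e.providerId === undefined);
    return (providerSpecific ?? generic)?.requestSettings ?? {};
}

For example, resolveRequestSettings(entries, 'gemma2', 'ollama') would return the num_predict and stop values from the settings above.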

Follow-ups

This should be the last thing we add to the provider layer before we refactor it all together:

  • Make the settings more consistent
  • Remove a lot of code duplication


@JonasHelming requested a review from @sdirix on November 26, 2024 at 20:44
@JonasHelming (Contributor, Author) commented:

@dhuebner Could you check the Ollama adaptations please?

@JonasHelming changed the title from "Gh 14526" to "Add support for users to specify custom request settings, model and optionally provider specific" on Nov 26, 2024
@JonasHelming (Contributor, Author) commented:

See this for the documentation: eclipse-theia/theia-website#662

@dhuebner (Member) commented:

@JonasHelming
Sure, will do.

@dhuebner (Member) commented on Nov 27, 2024

@JonasHelming
Ollama works as expected!

@sdirix
I have a question on another topic: would it be possible to mark a LanguageModelRequest with an explicit flag that indicates whether it is a chat or a completion request? With this information, model implementors could decide whether to use a streaming or a text response.

@JonasHelming (Contributor, Author) commented:

> @sdirix
> I have a question on another topic: would it be possible to mark a LanguageModelRequest with an explicit flag that indicates whether it is a chat or a completion request? With this information, model implementors could decide whether to use a streaming or a text response.

@dhuebner I have also already thought about this. Would you mind creating a new ticket and mentioning me there?
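
For illustration only, a hypothetical TypeScript sketch of what such a flag could look like; this is not part of this PR, and the type and property names are assumptions:

// Hypothetical: an optional request kind so model integrations can decide
// between a streaming response (chat) and a plain text response (completion).
type LanguageModelRequestKind = 'chat' | 'completion';

interface LanguageModelRequestWithKind {
    // ...the existing LanguageModelRequest properties would stay unchanged
    kind?: LanguageModelRequestKind;
}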

@sdirix (Member) left a review comment:

Thanks for the work ❤️! I found some inconsistencies which should be fixed before we merge.

@JonasHelming (Contributor, Author) commented:

Thank you for the great review. I tried to address all comments (in individual commits) and tested all providers again in the final state.

@JonasHelming requested a review from @sdirix on November 28, 2024 at 00:24
@sdirix (Member) left a review comment:

Works for me. I tested with all model integrations we offer.

Thanks for the great work ❤️

@JonasHelming merged commit 1301b45 into master on Nov 28, 2024
11 checks passed
The github-actions bot added this to the 1.56.0 milestone on Nov 28, 2024