merge upstream #622
Closed
Commits
46 commits
3989c17 add timeout feature (overcuriousity)
c34372c implement first draft of new feature (overcuriousity)
29ef364 proxy/config: fix RPC endpoint parsing on Windows (overcuriousity)
ac074d1 fix unit test (overcuriousity)
c8f2761 rework web interface (overcuriousity)
6f023c7 fix error assumption healthy (overcuriousity)
c17df42 proxy: make RPC health checks independent of process state (overcuriousity)
4987daf WIP: web config changes (overcuriousity)
e6f9f9a proxy: fix requestTimeout feature to actually terminate requests (overcuriousity)
0e86bbc docs: add requestTimeout to README features list (overcuriousity)
97976a6 Merge pull request #10 from overcuriousity/feat--web-config (overcuriousity)
fc33fdf Merge pull request #11 from overcuriousity/feat--timeout (overcuriousity)
88f02d7 Merge branch 'new-features' into feat--conditional-rpc-healthcheck (overcuriousity)
7187493 Merge pull request #12 from overcuriousity/feat--conditional-rpc-heal… (overcuriousity)
fe96ae4 proxy: improve RPC health check reliability and fix security issues
79332e3 ui-svelte: improve Config editor dark mode styling
26d7c89 proxy: fix stopCommand hang on startup timeout
5f31c89 proxy/config: fix RPC endpoint parsing for Windows quoted args
7ca1977 remove test config file
9ab8bd8 Merge branch 'mostlygeek:main' into feat--web-config (overcuriousity)
e762485 Merge branch 'mostlygeek:main' into feat--conditional-rpc-healthcheck (overcuriousity)
15a6aa7 Merge branch 'mostlygeek:main' into new-features (overcuriousity)
6c14013 proxy: fix data race and startup interrupt hang
59db9f0 ui-svelte: fix Config editor compartment collision and error handling
4e14a0d proxy: fix race conditions in Stop and test assertions
a502ebd proxy: fix Windows timeout command conflict
b733ee4 proxy: fix request timeout context handling
60f599b Merge branch 'new-features' into feat--web-config (overcuriousity)
04a8886 Merge pull request #15 from overcuriousity/feat--web-config (overcuriousity)
a7aa251 Merge pull request #14 from overcuriousity/feat--timeout (overcuriousity)
f4fd37f Merge pull request #13 from overcuriousity/feat--conditional-rpc-heal… (overcuriousity)
960e78d Merge branch 'mostlygeek:main' into feat--timeout (overcuriousity)
febbe97 proxy: ignore I/O timeout in RPC health checks
79cf3df Merge pull request #16 from overcuriousity/feat--conditional-rpc-heal… (overcuriousity)
8e62ce1 ui-svelte: fix Config editor cursor jumping on input
ceeebbc Merge pull request #17 from overcuriousity/feat--web-config (overcuriousity)
7d68a64 Merge branch 'mostlygeek:main' into feat--timeout (overcuriousity)
256e576 Merge pull request #21 from mostlygeek/main (overcuriousity)
5c9069d Merge pull request #24 from overcuriousity/feat--timeout (overcuriousity)
cbaa55d Merge PR #23: feat--conditional-rpc-healthcheck
f5e13d2 test: fix NewProcess calls to include context parameter
f7b80b4 Merge remote-tracking branch 'origin/feat--web-config' (overcuriousity)
842f966 fix: correct node_modules path in Makefile (ui-svelte not ui) (overcuriousity)
65761bd fix: make node_modules depend on package.json to trigger npm install … (overcuriousity)
427fe4e Merge upstream/main into new-features (overcuriousity)
e295d15 Merge new-features into main (overcuriousity)
| Diff line change |
|---|
| @@ -280,6 +280,16 @@ models: |
| # - recommended to be omitted and the default used |
| concurrencyLimit: 0 |
| |
| # requestTimeout: maximum time in seconds for a single request to complete |
| # - optional, default: 0 (no timeout) |
| # - useful for preventing runaway inference processes that never complete |
| # - when exceeded, the model process is forcefully stopped |
| # - protects against GPU overheating and blocking from stuck processes |
| # - the process must be restarted for the next request |
| # - set to 0 to disable timeout |
| # - recommended for models that may have infinite loops or excessive generation |
| requestTimeout: 0 # disabled by default, set to e.g., 300 for 5 minutes |
| |
| # sendLoadingState: overrides the global sendLoadingState setting for this model |
| # - optional, default: undefined (use global setting) |
| sendLoadingState: false |
| @@ -293,6 +303,24 @@ models: |
| unlisted: true |
| cmd: llama-server --port ${PORT} -m Llama-3.2-1B-Instruct-Q4_K_M.gguf -ngl 0 |
| |
| # RPC health check example for distributed inference: |
| "qwen-distributed": |
| # rpcHealthCheck: enable TCP health checks for RPC endpoints |
| # - optional, default: false |
| # - when enabled, parses --rpc host:port[,host:port,...] from cmd |
| # - performs TCP connectivity checks every 30 seconds |

Suggested change:

| Diff line change |
|---|
| - # - performs TCP connectivity checks every 30 seconds |
| + # - performs TCP connectivity checks every 10 seconds |
| Diff line change |
|---|
| @@ -0,0 +1,14 @@ |
| package main |
| |
| import ( |
| "bytes" |
| _ "embed" |
| ) |
| |
| //go:embed config.example.yaml |
| var configExampleYAML []byte |
| |
| // GetConfigExampleYAML returns the embedded example config file |
| func GetConfigExampleYAML() []byte { |
| return bytes.Clone(configExampleYAML) |
| } |
Schema description says RPC health checks run "every 30 seconds", but the current implementation introduced in `proxy/process.go` uses a 10-second ticker. Please update the schema description (or the code) so they match.