fix: llama.cpp integration model load and chat experience #5823
Conversation
Important
Looks good to me! 👍
Reviewed everything up to 4c482ad in 1 minute and 59 seconds.
- Reviewed 454 lines of code in 12 files.
- Skipped 0 files when reviewing.
- Skipped posting 6 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. web-app/src/containers/ChatInput.tsx:161
- Draft comment: Stop streaming callback no longer calls stopAllModels(). Ensure this removal is intentional and that aborting streaming doesn’t require unloading all models.
- Reason this comment was not posted: Comment looked like it was already resolved.
2. web-app/src/containers/DownloadManegement.tsx:183
- Draft comment: The toast for download completion now uses the parameter '{{item}}' instead of '{{modelId}}'. Verify that the event payloads and translation keys are updated consistently throughout the code.
- Reason this comment was not posted: Comment did not seem useful. Usefulness confidence = 0% <= threshold = 50%. The comment is asking the author to verify that event payloads and translation keys are updated consistently, which is against the rules. It doesn't provide a specific suggestion or point out a clear issue.
3. web-app/src/hooks/useChat.ts:402
- Draft comment: Good use of the abortController signal check before setting the model load error. This prevents error messages when a user aborts requests.
- Reason this comment was not posted: Comment did not seem useful. Usefulness confidence = 0% <= threshold = 50%. This comment is purely informative, as it praises the use of the abortController signal check without suggesting any changes or improvements. It doesn't align with the rules for useful comments.
4. web-app/src/lib/completion.ts:185
- Draft comment: Passing the abortController to engine.chat and to the fetch (via tokenJS) is a proper integration for cancellation. Consider additional error handling in case the abort event is triggered.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%. The comment has two parts: 1) confirming the abort integration is good (which is not useful as a comment), and 2) suggesting additional error handling. While error handling for aborts could be valuable, we don't know what error handling already exists in the engine.chat() implementation; the comment is speculative without seeing that code. I might be too harsh: error handling for aborts is generally good practice, and the comment is specifically about the changed code. However, without seeing the engine implementation, we can't be certain that additional error handling is needed. The comment asks the author to "consider" something rather than pointing out a definite issue, so it should be deleted: it is partly congratulatory (which isn't useful) and partly speculative about potential improvements without clear evidence they are needed.
5. web-app/src/locales/de-DE/common.json:251
- Draft comment: Translation update: Changed the description key for downloadComplete to use '{{item}}' instead of '{{modelId}}'. Confirm that the backend and frontend are updated to pass 'item'.
- Reason this comment was not posted: Comment did not seem useful. Usefulness confidence = 0% <= threshold = 50%. The comment is asking the author to confirm that both the backend and frontend are updated to pass 'item'. This is a request for confirmation, which violates the rule against asking the PR author to confirm their intention or ensure behavior is intended.
6. core/src/browser/extensions/engines/AIEngine.ts:171
- Draft comment: Typo: The comment on this line is missing a space after the slashes. It would improve readability and consistency with the comment on line 172 to have a space, e.g., // name of the model.
- Reason this comment was not posted: Comment was on unchanged code.
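
Draft comments 3 and 4 above both touch the same cancellation pattern: thread a single AbortController through the completion path, and check its signal before surfacing errors. Here is a minimal sketch of that pattern in TypeScript; the function name streamCompletion and the engine.chat signature are illustrative assumptions, not the PR's actual code:

```typescript
// Hypothetical sketch of the abort pattern from draft comments 3 and 4;
// names and signatures are assumptions, not the extension's real API.
interface ChatEngine {
  chat(request: object, signal: AbortSignal): Promise<void>
}

async function streamCompletion(
  engine: ChatEngine,
  request: object,
  abortController: AbortController
): Promise<void> {
  try {
    // The signal is forwarded down to the underlying fetch, so calling
    // abortController.abort() cancels the in-flight streaming request.
    await engine.chat(request, abortController.signal)
  } catch (err) {
    // As in the useChat.ts:402 check: if the user aborted, return
    // quietly instead of reporting a model load/generation error.
    if (abortController.signal.aborted) return
    throw err
  }
}
```

The key design point is that an abort is treated as a deliberate user action rather than a failure, so no error toast is raised for it.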
Workflow ID: wflow_dtuqC3k5yA9uiQgx
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
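
Draft comments 2 and 5 both concern the renamed interpolation placeholder. For context, i18next-style interpolation substitutes a placeholder only when the call site passes a parameter of the same name, which is why the translation keys and the event payloads must change together. A minimal, self-contained sketch (the key path and model id are illustrative):

```typescript
import i18next from 'i18next'

// Minimal init; the key mirrors the shape of the change in common.json,
// but the exact key path here is an illustrative assumption.
await i18next.init({
  lng: 'en',
  resources: {
    en: {
      common: {
        downloadComplete: { description: 'Download of {{item}} is complete.' },
      },
    },
  },
})

// The placeholder is filled only when the parameter name matches:
// passing { modelId: ... } against {{item}} would render the literal
// string "{{item}}" in the toast.
const description = i18next.t('common:downloadComplete.description', {
  item: 'llama-3.2-1b-instruct', // illustrative model id
})
console.log(description) // -> "Download of llama-3.2-1b-instruct is complete."
```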
Lgtm
Describe Your Changes
This PR addresses a few issues in the llama.cpp extension related to backend downloading, model loading, and generation management.
Self Checklist
Important
This PR improves llama.cpp integration by adding an abort mechanism for streaming requests, ensuring backend readiness before model loading, and preventing backend download from blocking the UI.
- Adds an abort mechanism in AIEngine.ts and index.ts to stop message generation immediately upon user cancellation.
- Ensures the backend is ready before model load in index.ts by implementing ensureBackendReady() (see the sketch after this list).
- Prevents backend download from blocking the UI in index.ts by configuring backends asynchronously.
- Removes the stopAllModels() call from ChatInput.tsx to avoid unnecessary model unloading.
- Updates the download-complete toast to use {{item}} instead of {{modelId}}.
- Minor code style cleanup in AIEngine.ts (removal of semicolons).
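
As referenced in the list above, here is a minimal sketch of the ensureBackendReady() gating idea: a single shared promise lets the backend download run asynchronously (never blocking the UI) while model loads await it. All helper names besides ensureBackendReady() are hypothetical; this illustrates the pattern, not the extension's actual index.ts:

```typescript
// Hypothetical sketch of the ensureBackendReady() gating pattern.
let backendReady: Promise<void> | undefined

function ensureBackendReady(): Promise<void> {
  // Reuse one in-flight download/configure step across all callers.
  backendReady ??= downloadAndConfigureBackend()
  return backendReady
}

async function loadModel(modelId: string): Promise<void> {
  await ensureBackendReady() // gate model load on backend readiness
  await startLlamaCppProcess(modelId)
}

// Placeholder helpers; the real extension's equivalents live in the
// llama.cpp extension's index.ts.
async function downloadAndConfigureBackend(): Promise<void> {
  /* fetch the llama.cpp backend binaries, verify, configure */
}
async function startLlamaCppProcess(modelId: string): Promise<void> {
  /* spawn the server / load weights for modelId */
}
```

Because every caller awaits the same promise, a second loadModel() issued while the download is still running does not trigger a duplicate backend download.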
This description was created by Ellipsis for 4c482ad. You can customize this summary. It will automatically update as commits are pushed.