-
Notifications
You must be signed in to change notification settings - Fork 2.2k
feat: Add model compatibility check and memory estimation #6243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to 10e85de in 2 minutes and 30 seconds. Click for details.
- Reviewed
70
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
6
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. extensions/llamacpp-extension/src/index.ts:1594
- Draft comment:
The revised error message is more user‐friendly but hides the original backend info. Consider logging cfg.version_backend for debugging. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
2. extensions/llamacpp-extension/src/index.ts:1699
- Draft comment:
Consider improving the doc comment wording: e.g. 'Estimate KVCache size from given metadata' instead of 'estimate KVCache size of from a given metadata'. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
3. extensions/llamacpp-extension/src/index.ts:1597
- Draft comment:
Typo: The error message text says "Please restart the app in a stable internet connection." This phrasing is slightly awkward. Consider replacing it with "Please restart the app with a stable internet connection." for improved clarity. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This is a very minor grammatical suggestion. While "with" may be slightly more natural than "in", both versions are understandable and convey the same meaning. The comment is about UI text/copy rather than actual code functionality. The change would have minimal impact on user experience. The comment does point out a real, if minor, grammatical improvement. English style guides generally prefer "with a connection" over "in a connection". While technically correct, this level of nitpicking on error message wording is not valuable enough to warrant a PR comment. The existing message is perfectly understandable. Delete this comment. While the suggested change is technically correct, it's too minor to be worth a PR comment. The existing error message is clear enough.
4. extensions/llamacpp-extension/src/index.ts:1696
- Draft comment:
Typo in the comment: "estimate KVCache size of from a given metadata". Consider rephrasing to "Estimate KVCache size from given metadata" or similar. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment is technically correct about the typo, fixing comment typos is a very minor issue that doesn't affect functionality. The comment is clear enough as is. We should focus on more substantive issues. The rules say not to make purely informative comments or comments about obvious/unimportant things. The typo does make the comment slightly less professional looking. Clear documentation is important for maintainability. While clear documentation is good, this is an extremely minor typo that doesn't significantly impact readability or understanding. The meaning is still clear. Delete this comment as it's too minor of an issue to be worth addressing. The comment is still perfectly understandable despite the small typo.
5. extensions/llamacpp-extension/src/index.ts:1717
- Draft comment:
The comment "can extension by checking cache-type-v and cache-type-k" seems grammatically incorrect. Perhaps it should be "can be extended by checking cache-type-v and cache-type-k". - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the grammar correction is technically accurate, code comments are informal documentation and minor grammatical issues don't impact code functionality or understanding. The meaning of the comment is still clear despite the minor grammatical error. This kind of nitpicking about grammar in comments creates noise in the PR review process. The comment is technically correct - there is a grammatical error. And some might argue that maintaining high quality even in comments is important for professionalism. While professionalism is important, focusing on minor grammatical issues in comments distracts from more substantive code review. The original comment's meaning is clear enough for its purpose. Delete this comment as it focuses on a minor grammatical issue that doesn't meaningfully impact code quality or understanding.
6. extensions/llamacpp-extension/src/index.ts:1734
- Draft comment:
Typographical errors in the comment: "model weighst + kvcache" should be corrected to "model weights + KVCache". - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment correctly identifies typos, fixing typos in comments is a very minor issue that doesn't affect functionality. The comment is technically about changed code since it's in the new section, but it's an extremely low priority change that doesn't meaningfully improve code quality or readability. The typos are real and the comment is technically correct. Consistent capitalization and spelling can make code more professional. While technically correct, this level of nitpicking about comment typos creates noise in the PR review process and distracts from more important issues. The meaning is still clear despite the typos. Delete this comment. While it points out real typos, commenting on minor spelling issues in comments is too trivial and creates unnecessary noise in the PR review process.
Workflow ID: wflow_6kJxH89xcwRwPrSC
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed b6b8f4b in 1 minute and 2 seconds. Click for details.
- Reviewed
40
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. extensions/llamacpp-extension/src/index.ts:1697
- Draft comment:
Improved parameter formatting in estimateKVCache enhances readability. Consider using a more specific type for 'meta' if possible. - Reason this comment was not posted:
Confidence changes required:20%
<= threshold50%
None
2. extensions/llamacpp-extension/src/index.ts:1742
- Draft comment:
The isModelSupported method now gracefully falls back to checking system memory via getSystemUsage when no devices are found. Verify that the memory units (MB) are consistent with assumed conversions. - Reason this comment was not posted:
Confidence changes required:30%
<= threshold50%
None
Workflow ID: wflow_1VPPx8TyqfuCqfQh
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed 84b9420 in 1 minute and 25 seconds. Click for details.
- Reviewed
40
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
4
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. extensions/llamacpp-extension/src/index.ts:35
- Draft comment:
New import 'getSystemUsage' added. Ensure its returned memory values are in MB as assumed for correct byte conversion. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
2. extensions/llamacpp-extension/src/index.ts:1696
- Draft comment:
Reformatted 'estimateKVCache' parameters improves readability. No issues found. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
3. extensions/llamacpp-extension/src/index.ts:1741
- Draft comment:
Enhanced isModelSupported fallback now uses system memory via getSystemUsage when no devices are found. Confirm that 'sys.total_memory' and 'sys.used_memory' are in MB (as expected) to ensure accurate conversion. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
4. extensions/llamacpp-extension/src/index.ts:1696
- Draft comment:
Typographical error: The comment 'estimate KVCache size of from a given metadata' appears to have extraneous words. Consider revising it to either 'estimate KVCache size from a given metadata' or an alternative phrasing. - Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_kAyKg6jmXy35hmA7
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed 8a2924e in 3 minutes and 22 seconds. Click for details.
- Reviewed
241
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
3
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src-tauri/plugins/tauri-plugin-llamacpp/Cargo.toml:26
- Draft comment:
NOTE: The reqwest dependency is added with features ["json", "blocking", "stream"]. Verify that the 'blocking' feature is necessary, as the plugin code uses async requests. - Reason this comment was not posted:
Confidence changes required:50%
<= threshold50%
None
2. src-tauri/plugins/tauri-plugin-llamacpp/src/gguf/commands.rs:10
- Draft comment:
INFO: The remote GGUF metadata retrieval using 2MB range requests is implemented correctly. Consider confirming that target servers support the Range header for reliable parsing. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
3. src-tauri/plugins/tauri-plugin-llamacpp/src/gguf/helpers.rs:7
- Draft comment:
INFO: The GGUF metadata parsing helpers enforce reasonable limits (e.g. string length max 1MB, array length limit) which is good. Verify these limits match expected inputs and security considerations. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
Workflow ID: wflow_vNUYBneaTK6WdxOI
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed ca2d9fc in 3 minutes and 20 seconds. Click for details.
- Reviewed
63
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
0
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
Workflow ID: wflow_KwQCRvcpzkQS9NtN
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
This commit introduces a new feature to check if a given model is supported based on available device memory. The change includes: - A new `estimateKVCache` method that calculates the required memory for the model's KV cache. It uses GGUF metadata such as `block_count`, `head_count`, `key_length`, and `value_length` to perform the calculation. - An `isModelSupported` method that combines the model file size and the estimated KV cache size to determine the total memory required. It then checks if any available device has sufficient free memory to load the model. - An updated error message for the `version_backend` check to be more user-friendly, suggesting a stable internet connection as a potential solution for backend setup failures. This functionality helps prevent the application from attempting to load models that would exceed the device's memory capacity, leading to more stable and predictable behavior. fixes: #5505
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
This commit refactors the logic for checking if a model is supported by a system's available memory. **Key changes:** - **Remote model support**: The `read_gguf_metadata` function can now fetch metadata from a remote URL by reading the file in chunks. - **Improved KV cache size calculation**: The KV cache size is now estimated more accurately by using `attention.key_length` and `attention.value_length` from the GGUF metadata, with a fallback to `embedding_length`. - **Granular memory check statuses**: The `isModelSupported` function now returns a more specific status (`'RED'`, `'YELLOW'`, `'GREEN'`) to indicate whether the model weights or the KV cache are too large for the available memory. - **Consolidated logic**: The logic for checking local and remote models has been consolidated into a single `isModelSupported` function, improving code clarity and maintainability. These changes provide more robust and informative model compatibility checks, especially for models hosted on remote servers.
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed b3ea19c in 2 minutes and 26 seconds. Click for details.
- Reviewed
22
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
5
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. web-app/src/containers/ModelSupportStatus.tsx:80
- Draft comment:
GREEN tooltip message updated to 'Works Well on your device'. This clarifies the message well. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
2. web-app/src/containers/ModelSupportStatus.tsx:82
- Draft comment:
YELLOW tooltip message now says 'Might work on your device', which softens the limitation warning effectively. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
3. web-app/src/containers/ModelSupportStatus.tsx:84
- Draft comment:
RED tooltip text contains an extra space before the parenthesis; consider removing it for consistency. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
4. web-app/src/containers/ModelSupportStatus.tsx:86
- Draft comment:
LOADING tooltip now specifies 'Checking device compatibility...', which improves clarity for the user. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
5. web-app/src/containers/ModelSupportStatus.tsx:84
- Draft comment:
Typographical note: There is an extra space in "Doesn't work on your device (ctx: ${contextSize})". Consider removing one extra space before the opening parenthesis to maintain consistency. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment is technically correct about the inconsistency, it's an extremely minor formatting issue that doesn't affect functionality. The change being reviewed is about updating tooltip messages to include "on your device", and this extra space is just a tiny typographical issue. Such minor formatting issues often don't warrant a PR comment and can be seen as noise in the review process. The extra space could theoretically cause visual inconsistency in the UI tooltip. Also, maintaining consistent formatting is generally good practice. However, this is such a minor issue that it creates more overhead in the review process than the value it provides. The visual impact of an extra space in a tooltip is negligible. This comment should be deleted as it focuses on an extremely minor formatting issue that doesn't meaningfully impact the code or user experience.
Workflow ID: wflow_lq1Cpv8J9TKhJFvK
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed 5ea5ca5 in 2 minutes and 24 seconds. Click for details.
- Reviewed
48
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
5
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. web-app/src/containers/ModelInfoHoverCard.tsx:51
- Draft comment:
Good: Adding 'shrink-0' prevents the spinner from shrinking in the flex container, ensuring consistent size. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
2. web-app/src/containers/ModelInfoHoverCard.tsx:58
- Draft comment:
Added 'shrink-0' on green status indicator to enforce fixed size within flex layout. Looks good. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
3. web-app/src/containers/ModelInfoHoverCard.tsx:67
- Draft comment:
The yellow indicator now includes 'shrink-0', ensuring it doesn’t collapse in flex containers. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
4. web-app/src/containers/ModelInfoHoverCard.tsx:76
- Draft comment:
Red indicator now has 'shrink-0' to maintain consistent sizing in its flex container. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
5. web-app/src/containers/ModelInfoHoverCard.tsx:85
- Draft comment:
The Unknown status indicator now includes 'shrink-0', matching other states for fixed sizing. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
Workflow ID: wflow_bZTgzo4KcYenFR4K
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Describe Your Changes
This commit introduces a new feature to check if a given model is supported based on available device memory.
The change includes:
estimateKVCache
method that calculates the required memory for the model's KV cache. It uses GGUF metadata such asblock_count
,head_count
,key_length
, andvalue_length
to perform the calculation.isModelSupported
method that combines the model file size and the estimated KV cache size to determine the total memory required. It then checks if any available device has sufficient free memory to load the model.version_backend
check to be more user-friendly, suggesting a stable internet connection as a potential solution for backend setup failures.This functionality helps prevent the application from attempting to load models that would exceed the device's memory capacity, leading to more stable and predictable behavior.
Fixes Issues
Self Checklist
Important
Add model compatibility check and memory estimation to ensure models fit within device memory constraints, with UI feedback on compatibility status.
estimateKVCache
andisModelSupported
methods inindex.ts
to calculate memory requirements and check model compatibility.version_backend
check inindex.ts
.ModelSupportStatus
andModelInfoHoverCard
components to display model compatibility status.DropdownModelProvider
andChatInput
to integrate new components.read_gguf_metadata
incommands.rs
to support remote files.Cargo.toml
to includereqwest
dependency.isModelSupported
inmodels.test.ts
.This description was created by
for 5ea5ca5. You can customize this summary. It will automatically update as commits are pushed.