This repository was archived by the owner on Sep 4, 2025. It is now read-only.

add triton CustomCacheManager #55

Merged
dtrifiro merged 2 commits into opendatahub-io:main from dtrifiro:fix-triton-cache-issues on Jun 18, 2024

@dtrifiro dtrifiro commented Jun 18, 2024

fixes RHOAIENG-8043

Co-authored-by: Chih-Chieh-Yang <chih.chieh.yang@ibm.com>
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>

@openshift-ci openshift-ci Bot requested review from rpancham and terrytangyuan June 18, 2024 11:20

openshift-ci Bot commented Jun 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtrifiro

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

dtrifiro (author) commented

cherry-pick of IBM/vllm#35

@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from fa8a6a2 to 097f576 Compare June 18, 2024 11:39
@dtrifiro dtrifiro changed the title from "add triton CustomCacheManger" to "add triton CustomCacheManager" Jun 18, 2024
@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from 097f576 to c935d57 Compare June 18, 2024 12:42

dtrifiro commented Jun 18, 2024

Merge after #56

@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from c935d57 to 26b004e Compare June 18, 2024 13:47
@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from 26b004e to 3aef43e Compare June 18, 2024 15:28
@dtrifiro dtrifiro merged commit c127b61 into opendatahub-io:main Jun 18, 2024
dtrifiro and others added 2 commits June 18, 2024 17:29
fixes RHOAIENG-8043

Co-authored-by: Chih-Chieh-Yang <chih.chieh.yang@ibm.com>
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Xaenalt pushed a commit that referenced this pull request Sep 18, 2024
* Add hpu syncs in model loader to prevent memory peak after loading weights

* Remove spaces

* Fix typo
prarit pushed a commit to prarit/vllm that referenced this pull request Oct 18, 2024