mit-han-lab / deepcompressor Public

Notifications You must be signed in to change notification settings
Fork 23
Star 316

Code
Issues 24
Pull requests 1
Actions
Projects
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Actions
Projects
Security
Insights

Issues: mit-han-lab/deepcompressor

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

24 Open 15 Closed

Author

Filter by author

Label

Filter by label

Use alt + click/return to exclude labels

or ⇧ + click/return for logical OR

Projects

Filter by project

Milestones

Filter by milestone

Assignee

Filter by who’s assigned

Assigned to nobody

Sort

Sort by

Newest Oldest Most commented Least commented Recently updated Least recently updated Best match

Most reactions

Issues list

Rationale behind converting proj_out of FluxSingleTransformerBlock to ConcatLinear question

Further information is requested

svdquant

#43 opened Jan 29, 2025 by vinovo

Doesn't work with llama3.1 8b

#38 opened Dec 21, 2024 by ShuaiShao93

Can SVDQuant apply to LLM?

#36 opened Dec 11, 2024 by lixcli

'NoneType' object has no attribute 'name' bug

Something isn't working

svdquant

#34 opened Dec 6, 2024 by jhss

OOM when use deepcompressor quantize llama2 w4a8 per-group with H100 80G

#33 opened Dec 3, 2024 by Andy0422

error in trt_llm conversion

#32 opened Dec 3, 2024 by Andy0422

use qserve with tensorrt-llm raise an error

#31 opened Nov 27, 2024 by anaivebird

2 of 4 tasks

torch.OutOfMemoryError: CUDA out of memory bug

Something isn't working

svdquant

#30 opened Nov 26, 2024 by Lenan22

AttributeError: 'tuple' object has no attribute 'shape' bug

Something isn't working

svdquant

#29 opened Nov 26, 2024 by Lenan22

Quantized custom flux model was still bfloat16 enhancement

New feature or request

svdquant

#27 opened Nov 20, 2024 by samedii

How to apply SVDQuant for SD3 model? enhancement

New feature or request

svdquant

#26 opened Nov 14, 2024 by wxsms

Quantization of Other Models

#25 opened Nov 12, 2024 by Oguzhanercan

What is the function 'patch_attention' for?

#20 opened Oct 11, 2024 by mxjmtxrm

support for W4A8KV8/16 and other models

#19 opened Sep 30, 2024 by KKwanhee

will you support quantize the embedding layer and lm_head layer?

#18 opened Sep 6, 2024 by geqian-9192

[Bug] RuntimeError: Boolean value of Tensor with more than one value is ambiguous

#17 opened Aug 10, 2024 by ChenMnZ

GPTQ LLAMA 2 7B Question

#16 opened Aug 8, 2024 by XiaohanFei

Why Rotate Again in main.py

#15 opened Jul 30, 2024 by RanchiZhao

MLA supported

#14 opened Jul 17, 2024 by RanchiZhao

Group shape error

#13 opened Jul 5, 2024 by LuckyLYM

Questions about rotation

#12 opened Jun 28, 2024 by Kyeong-Joong

Unable to reproduce RTN results in paper

#8 opened Jun 7, 2024 by Golden-Wang

QoQ-g128 Llama3-70B-Instruct Results

#7 opened Jun 3, 2024 by ethxnp

AssertionError: The smooth scale contains NaN.

#5 opened Jun 3, 2024 by ethxnp

ProTip! Add no:assignee to see everything that’s not assigned.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly