
OOM when using InternVL2_5-1B-MPO #3143

Open
BobHo5474 opened this issue Feb 14, 2025 · 5 comments
@BobHo5474

I followed the installation guide to build lmdeploy (0.7.0.post3) from source.
Inference with the PyTorch engine works fine.
However, after quantizing the model to 4-bit with AWQ, I hit an OOM error when loading the model with the TurboMind engine.
I tried setting `session_len=2048` in TurbomindEngineConfig.
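For reference, besides capping `session_len`, a common way to shrink TurboMind's memory footprint is to reduce its KV-cache budget. The sketch below assumes lmdeploy's `cache_max_entry_count` parameter on `TurbomindEngineConfig` (the fraction of free GPU memory reserved for the KV cache); treat it as a configuration sketch, not a verified fix for this issue:

```python
from lmdeploy import pipeline, TurbomindEngineConfig

# Sketch: shrink the KV-cache budget in addition to capping session_len.
# cache_max_entry_count (assumed parameter) is the fraction of free GPU
# memory that TurboMind reserves for the KV cache; the default is large
# (around 0.8), which can trigger OOM on memory-constrained devices.
backend_config = TurbomindEngineConfig(
    model_format="awq",
    session_len=2048,
    cache_max_entry_count=0.2,  # reserve only ~20% of free memory for KV cache
)
pipe = pipeline("./InternVL2_5-1B-MPO-4bit/", backend_config=backend_config)
```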

@lvhan028
Collaborator

Can you share the following information?

  • the output of `lmdeploy check_env`
  • a minimal reproducible code snippet

@BobHo5474
Author

I get an error when I run `lmdeploy check_env`, because I built lmdeploy on a Jetson Orin.

Below is the code:

```python
from lmdeploy import pipeline, TurbomindEngineConfig, PytorchEngineConfig
pipe = pipeline("./InternVL2_5-1B-MPO-4bit/", backend_config=TurbomindEngineConfig(model_format="awq", session_len=2048))
```

I ran `lmdeploy lite auto_awq OpenGVLab/InternVL2_5-1B-MPO --work-dir InternVL2_5-1B-MPO-4bit` to quantize the model.

@lvhan028
Collaborator

Could you enable the INFO log level? Let's check what the log indicates:

```python
from lmdeploy import pipeline, TurbomindEngineConfig, PytorchEngineConfig
pipe = pipeline("./InternVL2_5-1B-MPO-4bit/", backend_config=TurbomindEngineConfig(model_format="awq", session_len=2048), log_level='INFO')
```

@lvhan028
Collaborator

What's the memory size of the Jetson Orin?

@BobHo5474
Author

The GPU memory size is 16 GB, and I have uploaded the log file: log.txt
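As a sanity check, a back-of-envelope estimate suggests a ~1B-parameter model quantized to 4-bit plus a 2048-token KV cache is far smaller than 16 GB. The layer/head figures below are illustrative assumptions, not InternVL2_5-1B-MPO's actual configuration; if the real numbers are in the same ballpark, the OOM more plausibly comes from an up-front memory reservation (e.g. the KV-cache pool) than from the model weights themselves, which the INFO log should confirm:

```python
# Back-of-envelope memory estimate for a ~1B-parameter model in 4-bit.
# All figures are assumptions for illustration only.

params = 1_000_000_000          # ~1B parameters (assumed)
bits_per_weight = 4             # AWQ 4-bit quantization
weight_bytes = params * bits_per_weight // 8

# KV cache for one 2048-token session (assumed config: 24 layers,
# 2 KV heads of head_dim 128, as in small GQA models; fp16 entries).
layers, kv_heads, head_dim = 24, 2, 128
seq_len, bytes_per_elem = 2048, 2
kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem  # K and V

gib = 1024 ** 3
print(f"weights : {weight_bytes / gib:.2f} GiB")   # ~0.47 GiB
print(f"kv cache: {kv_bytes / gib:.3f} GiB")       # ~0.047 GiB

# Both together are well under 16 GiB, so a single session should fit.
assert (weight_bytes + kv_bytes) < 16 * gib
```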
