
bugfix: llava-hf/llava-interleave-qwen-7b-hf (#2497) #2657

Merged 1 commit into InternLM:main on Oct 28, 2024

Conversation

deepindeed2022 (Contributor)

Motivation

Fixes issue #2497. Running the following command raises an AttributeError:

python3 -m lmdeploy.serve.openai.api_server path/to/llava_hf/llava-interleave-qwen-7b-hf

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/willow/Repo/lmdeploy/lmdeploy/serve/openai/api_server.py", line 1376, in <module>
    fire.Fire(serve)
  File "/opt/py38/lib/python3.8/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/py38/lib/python3.8/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/py38/lib/python3.8/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/data/willow/Repo/lmdeploy/lmdeploy/serve/openai/api_server.py", line 1333, in serve
    VariableInterface.async_engine = pipeline_class(
  File "/data/willow/Repo/lmdeploy/lmdeploy/serve/vl_async_engine.py", line 21, in __init__
    self.vl_encoder = ImageEncoder(model_path,
  File "/data/willow/Repo/lmdeploy/lmdeploy/vl/engine.py", line 85, in __init__
    self.model = load_vl_model(model_path, backend_config=backend_config)
  File "/data/willow/Repo/lmdeploy/lmdeploy/vl/model/builder.py", line 56, in load_vl_model
    return module(**kwargs)
  File "/data/willow/Repo/lmdeploy/lmdeploy/vl/model/base.py", line 31, in __init__
    self.build_model()
  File "/data/willow/Repo/lmdeploy/lmdeploy/vl/model/llava_hf.py", line 37, in build_model
    load_checkpoint_and_dispatch(
  File "/opt/py38/lib/python3.8/site-packages/accelerate/big_modeling.py", line 604, in load_checkpoint_and_dispatch
    device_map = infer_auto_device_map(
  File "/opt/py38/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1240, in infer_auto_device_map
    if check_tied_parameters_in_config(model) and len(tied_parameters) == 0:
  File "/opt/py38/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 574, in check_tied_parameters_in_config
    and model.get_output_embeddings()
  File "/opt/py38/lib/python3.8/site-packages/transformers/models/llava/modeling_llava.py", line 260, in get_output_embeddings
    return self.language_model.get_output_embeddings()
AttributeError: 'NoneType' object has no attribute 'get_output_embeddings'

Modification

  • Fix the AttributeError raised during model initialization
  • Add a --vision-max-batch-size option to the openai/api_server start config, following lmdeploy/cli/serve.py

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

We have tested llava_hf/llava-interleave-qwen-7b-hf, and the vision encoder can now be configured from the start command, for example:
python3 -m lmdeploy.serve.openai.api_server path/to/llava_hf/llava-interleave-qwen-7b-hf --vision-max-batch-size 16
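
Once the server is up, it can be queried through its OpenAI-compatible chat completions endpoint. A minimal client sketch, assuming the default port 23333 and that the served model name equals the model path (the /v1/models endpoint lists the actual name); the image URL is only a placeholder:

from openai import OpenAI

# Point the client at the local api_server instance (port assumed to be the default).
client = OpenAI(base_url="http://0.0.0.0:23333/v1", api_key="none")

response = client.chat.completions.create(
    model="path/to/llava_hf/llava-interleave-qwen-7b-hf",  # served model name, assumed
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/demo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)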

@@ -1054,13 +1054,15 @@ def serve(model_path: str,

_, pipeline_class = get_task(model_path)

vision_config = VisionConfig(kwargs.get("vision_max_batch_size", 1))
deepindeed2022 (Contributor, Author) replied:
Thanks, I will use the option to set max_batch_size.
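
For reference, a minimal sketch of passing the option through to the vision encoder config. VisionConfig is lmdeploy's existing config class (the diff above constructs it from kwargs); the explicit vision_max_batch_size variable shown here is illustrative rather than the merged code:

from lmdeploy import VisionConfig

# Value that would arrive from the --vision-max-batch-size start option.
vision_max_batch_size = 16
# Build the vision encoder config from the explicit option instead of kwargs.get(...).
vision_config = VisionConfig(max_batch_size=vision_max_batch_size)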

irexyc (Collaborator) commented on Oct 25, 2024:

I suggest just setting model.config.tie_word_embeddings to False before using load_checkpoint_and_dispatch.

- fix init raise exception because tie_word_embeddings config
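
A minimal self-contained sketch of that suggestion and of why it works. It mimics the stripped vision-only model from the traceback (setting language_model to None is an emulation, not necessarily how lmdeploy drops it) and assumes a transformers version contemporary with this PR, where language_model is a direct submodule:

from accelerate import init_empty_weights
from accelerate.utils.modeling import check_tied_parameters_in_config
from transformers import LlavaConfig, LlavaForConditionalGeneration

# A small default config stands in for
# AutoConfig.from_pretrained("llava-hf/llava-interleave-qwen-7b-hf").
config = LlavaConfig()

with init_empty_weights():
    model = LlavaForConditionalGeneration(config)
    # Emulate the vision-only encoder: the language model is gone, which is why
    # model.get_output_embeddings() hits 'NoneType' in the traceback above.
    model.language_model = None

# The suggested fix: with tie_word_embeddings disabled, accelerate's
# check_tied_parameters_in_config (reached from load_checkpoint_and_dispatch via
# infer_auto_device_map) short-circuits before touching get_output_embeddings().
model.config.tie_word_embeddings = False
check_tied_parameters_in_config(model)  # no longer raises the AttributeError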
AllentDan (Collaborator) left a comment:

LGTM

lvhan028 (Collaborator) commented:
@deepindeed2022 please resolve the linting error

pip install pre-commit
cd lmdeploy # the root directory of the repo
pre-commit install
pre-commit run --all-files

lvhan028 merged commit 39de575 into InternLM:main on Oct 28, 2024
4 of 5 checks passed
AllentDan pushed a commit to AllentDan/lmdeploy that referenced this pull request Nov 13, 2024
- fix init raise exception because tie_word_embeddings config