
models: lazy-load OpenCV in Nemotron Parse to avoid import-time dependency on X11 libs#33851

Closed

Gregory-Pereira wants to merge 2 commits into vllm-project:main from Gregory-Pereira:fix/lazy-cv2-import-nemotron-parse

Conversation

Gregory-Pereira (Contributor) commented Feb 5, 2026

Fix: Make OpenCV optional dependency for Nemotron Parse

Problem

The Nemotron Parse model imports cv2 (OpenCV) during processor initialization, which happens during vLLM's startup. This forces all vLLM users to have OpenCV and its system-level dependencies (e.g., libxcb.so.1 on Linux) installed, even when running text-only or non-Nemotron workloads.

In minimal container environments without X11 libraries, this causes import-time failures when starting vLLM.

Solution

This PR makes OpenCV a truly optional dependency by implementing lazy loading:

  1. Modified NemotronParseImageProcessor.__init__() to defer transform creation by setting attributes to None
  2. Renamed _create_transforms() to _ensure_transforms_initialized() with lazy cv2 import inside try-except blocks
  3. preprocess() now calls _ensure_transforms_initialized(), importing cv2 only when actually processing images
  4. Added clear error messages when cv2 is missing, explaining installation requirements
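The four steps above can be sketched as follows. This is a minimal illustration of the lazy-import pattern, not the actual vLLM source: the class and method names mirror the PR, but the transform placeholder and the `preprocess` body are invented for the example.

```python
class NemotronParseImageProcessorSketch:
    """Sketch of deferring the cv2 import out of __init__()."""

    def __init__(self):
        # Step 1: defer transform creation; no cv2 import at
        # construction (i.e. vLLM startup) time.
        self.transform = None

    def _ensure_transforms_initialized(self):
        # Step 2: lazy cv2 import, guarded by try-except.
        if self.transform is not None:
            return
        try:
            import cv2  # imported only on first image-processing call
        except ImportError as err:
            # Step 4: clear error message explaining what is needed.
            raise ImportError(
                "opencv-python (cv2) is required for Nemotron Parse "
                "image processing; install it with "
                "`pip install opencv-python`."
            ) from err
        # Placeholder for the real transform pipeline, which uses
        # cv2 constants such as cv2.BORDER_CONSTANT.
        self.transform = ("pad", cv2.BORDER_CONSTANT)

    def preprocess(self, image):
        # Step 3: the import happens here, not in __init__().
        self._ensure_transforms_initialized()
        return image  # actual resizing/padding omitted in this sketch
```

Constructing the processor now succeeds even when OpenCV is absent; only calling `preprocess()` raises the descriptive `ImportError`.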

Impact

Before this change, cv2 was imported whenever NemotronParseImageProcessor was instantiated. Since vLLM loads model processors during startup, this meant every vLLM deployment needed OpenCV and its system dependencies (like libxcb on Linux), even if you were just running text models.

After this change, cv2 is only imported when you actually call preprocess() to process images. This means:

  • You can run vLLM in minimal containers without OpenCV for text-only or non-Nemotron workloads
  • If you do try to use Nemotron Parse without OpenCV installed, you get a clear error message explaining what's needed
  • Existing Nemotron Parse functionality works exactly the same when OpenCV is available

Verification

I ran a test to show when cv2 gets imported. The test analyzes the NemotronParseImageProcessor code to see if cv2 is imported during processor creation or deferred until actual use.

BEFORE (main branch)

Analyzing NemotronParseImageProcessor.__init__()...

BEFORE (main branch):
   __init__() calls _create_transforms()
   → cv2 will be imported when processor is created
   → ALL vLLM users need OpenCV, even for text workloads

Impact:
  • ALL vLLM users must have OpenCV installed
  • Minimal containers need X11 libraries (libxcb, etc.)
  • Fails at startup without OpenCV, even for text workloads

AFTER (this PR)

Analyzing NemotronParseImageProcessor.__init__()...

AFTER (fix branch):
   __init__() sets self.transform = None
   → cv2 is NOT imported during processor creation
   → cv2 only imported when preprocess() is called
   → vLLM can start WITHOUT OpenCV for text workloads

   Lazy initialization via _ensure_transforms_initialized():
   - Imports cv2 only when called
   - Called by preprocess() when actually processing images
   - Not called during __init__()

Impact:
  • vLLM can start in minimal containers without OpenCV
  • Text-only workloads don't need X11 libraries
  • Nemotron Parse still works when OpenCV is installed

Credit to @maugustosilva for finding this. He hit it on our llm-d v0.5 release image, which used vLLM v0.14.1 with LMCache 0.3.13 and produced the following error:

k logs -f facebook-28bb2887-opt-125m-decode-65f9c58c46-kqqz5
Defaulted container "vllm" out of: vllm, routing-proxy (init)
(APIServer pid=1) INFO 02-04 21:34:43 [api_server.py:1272] vLLM API server version 0.14.1
(APIServer pid=1) INFO 02-04 21:34:43 [utils.py:263] non-default args: {'model_tag': '/model-cache/models/facebook/opt-125m', 'port': 8200, 'model': '/model-cache/models/facebook/opt-125m', 'max_model_len': 16384,
 'served_model_name': ['facebook/opt-125m']}
(APIServer pid=1) INFO 02-04 21:34:53 [model.py:530] Resolved architecture: OPTForCausalLM
(APIServer pid=1) WARNING 02-04 21:34:53 [model.py:2020] User-specified max_model_len (16384) is greater than the derived max_model_len (max_position_embeddings=2048.0 or model_max_length=None in model's config.json). VLLM_ALLOW_LONG_MAX_MODEL_LEN must be used with extreme caution. If the model uses relative position encoding (RoPE), positions exceeding derived_max_model_len lead to nan. If the model uses absolute position encoding, positions exceeding derived_max_model_len will cause a CUDA array out-of-bounds error.
(APIServer pid=1) INFO 02-04 21:34:53 [model.py:1545] Using max model len 16384
(APIServer pid=1) INFO 02-04 21:34:54 [scheduler.py:229] Chunked prefill is enabled with max_num_batched_tokens=8192.
(APIServer pid=1) INFO 02-04 21:34:54 [vllm.py:630] Asynchronous scheduling is enabled.
(APIServer pid=1) INFO 02-04 21:34:54 [vllm.py:637] Disabling NCCL for DP synchronization when using async scheduling.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python3.12/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/usr/lib64/python3.12/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib64/python3.12/multiprocessing/spawn.py", line 297, in _fixup_main_from_path  
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 280, in run_path
  File "/usr/lib64/python3.12/pkgutil.py", line 170, in <module>
    iter_importer_modules.register(
  File "/usr/lib64/python3.12/functools.py", line 895, in register
    if _is_union_type(cls):
       ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/functools.py", line 845, in _is_union_type
    from typing import get_origin, Union
  File "/opt/vllm/lib/python3.12/site-packages/cv2/typing/__init__.py", line 62, in <module>
    import cv2.mat_wrapper
ImportError: libxcb.so.1: cannot open shared object file: No such file or directory
(APIServer pid=1) Traceback (most recent call last):
(APIServer pid=1)   File "/opt/vllm/bin/vllm", line 10, in <module>
(APIServer pid=1)     sys.exit(main())
(APIServer pid=1)              ^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/entrypoints/cli/main.py", line 73, in main
(APIServer pid=1)     args.dispatch_function(args)
(APIServer pid=1)   File "/opt/vllm-source/vllm/entrypoints/cli/serve.py", line 60, in cmd
(APIServer pid=1)     uvloop.run(run_server(args))
(APIServer pid=1)   File "/opt/vllm/lib64/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=1)     return __asyncio.run(
(APIServer pid=1)            ^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib64/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=1)     return runner.run(main)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib64/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=1)     return self._loop.run_until_complete(task)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1)   File "/opt/vllm/lib64/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=1)     return await main
(APIServer pid=1)            ^^^^^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/entrypoints/openai/api_server.py", line 1319, in run_server
(APIServer pid=1)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1)   File "/opt/vllm-source/vllm/entrypoints/openai/api_server.py", line 1338, in run_server_worker
(APIServer pid=1)     async with build_async_engine_client(
(APIServer pid=1)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1)     return await anext(self.gen)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/entrypoints/openai/api_server.py", line 173, in build_async_engine_client
(APIServer pid=1)     async with build_async_engine_client_from_engine_args(
(APIServer pid=1)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1)     return await anext(self.gen)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/entrypoints/openai/api_server.py", line 214, in build_async_engine_client_from_engine_args
(APIServer pid=1)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=1)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/async_llm.py", line 205, in from_vllm_config
(APIServer pid=1)     return cls(
(APIServer pid=1)            ^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/async_llm.py", line 132, in __init__
(APIServer pid=1)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=1)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/core_client.py", line 122, in make_async_mp_client
(APIServer pid=1)     return AsyncMPClient(*client_args)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/core_client.py", line 824, in __init__
(APIServer pid=1)     super().__init__(
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/core_client.py", line 479, in __init__
(APIServer pid=1)     with launch_core_engines(vllm_config, executor_class, log_stats) as (
(APIServer pid=1)          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib64/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=1)     next(self.gen)
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/utils.py", line 921, in launch_core_engines
(APIServer pid=1)     wait_for_engine_startup(
(APIServer pid=1)   File "/opt/vllm-source/vllm/v1/engine/utils.py", line 980, in wait_for_engine_startup
(APIServer pid=1)     raise RuntimeError(
(APIServer pid=1) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {'EngineCore_DP0': 1}

…dency on X11 libs

Signed-off-by: greg pereira <grpereir@redhat.com>
gemini-code-assist bot left a comment

Code Review

This pull request correctly implements lazy loading for OpenCV in the Nemotron Parse model, which is a valuable improvement for users in minimal container environments. By deferring the import of cv2 until it's actually needed for image processing, you've removed a hard dependency on X11 libraries at startup. My review includes one suggestion to refactor the code to avoid duplication of the cv2 import logic, which will improve the code's maintainability.

Comment on lines +455 to +463
try:
    import cv2
except ImportError as err:
    raise ImportError(
        "The package `opencv-python` (cv2) is required to use "
        "NemotronParse model. Please install it with `pip install "
        "opencv-python`. Note that OpenCV may also require system-level "
        "dependencies such as libxcb.so.1 on Linux systems."
    ) from err
high

This try-except block for importing cv2 is a duplicate of the one in _ensure_transforms_initialized. This code duplication can lead to maintenance issues where one block is updated but the other is not.

To avoid this, you can cache the imported cv2 module as an instance attribute. Here's a suggested refactoring:

  1. In NemotronParseImageProcessor.__init__, add self._cv2 = None.

  2. In _ensure_transforms_initialized, modify the cv2 import block to cache the module:

    # In _ensure_transforms_initialized()
    try:
        import cv2
        self._cv2 = cv2
    except ImportError as err:
        # ... error handling
    
    # ... later in the same method, use self._cv2
    self.transform = A.Compose(
        [
            A.PadIfNeeded(
                # ...
                border_mode=self._cv2.BORDER_CONSTANT,
                # ...
            ),
        ]
    )
  3. Then, in this method (_resize_with_aspect_ratio), you can remove this duplicated block and just use the cached module:

    # In _resize_with_aspect_ratio()
    # The _ensure_transforms_initialized method is called in preprocess()
    # before this method, so self._cv2 should be available.
    cv2 = self._cv2
    
    # ...
    
    return cv2.resize(...)

This approach centralizes the import logic and makes the code cleaner and easier to maintain.
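As a runnable illustration of this caching suggestion (a hypothetical sketch, not the vLLM code: `json` stands in for `cv2` so the example executes without OpenCV installed, and `dumps` stands in for `resize`):

```python
import importlib


class ProcessorSketch:
    """Caches the lazily imported module so the try-except
    import logic lives in exactly one place."""

    _HEAVY_MODULE = "json"  # stand-in for "cv2" in this sketch

    def __init__(self):
        # Reviewer's step 1: module cache, populated lazily.
        self._cv2 = None

    def _ensure_transforms_initialized(self):
        # Reviewer's step 2: the single import site, caching the module.
        if self._cv2 is None:
            try:
                self._cv2 = importlib.import_module(self._HEAVY_MODULE)
            except ImportError as err:
                raise ImportError(
                    f"{self._HEAVY_MODULE} is required for image "
                    "processing but is not installed."
                ) from err

    def _resize_with_aspect_ratio(self, data):
        # Reviewer's step 3: reuse the cached module instead of
        # duplicating the try-except import block here.
        cv2 = self._cv2
        return cv2.dumps(data)  # stand-in for cv2.resize(...)
```

With this shape, updating the error message or swapping the backend touches only `_ensure_transforms_initialized`.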

mergify bot commented Feb 5, 2026

Hi @Gregory-Pereira, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: greg pereira <grpereir@redhat.com>
Gregory-Pereira (Contributor, Author) commented
Apologies, dupe of #33189
