models: lazy-load OpenCV in Nemotron Parse to avoid import-time dependency on X11 libs #33851
Conversation
…dency on X11 libs Signed-off-by: greg pereira <grpereir@redhat.com>
Code Review
This pull request correctly implements lazy loading for OpenCV in the Nemotron Parse model, which is a valuable improvement for users in minimal container environments. By deferring the import of cv2 until it's actually needed for image processing, you've removed a hard dependency on X11 libraries at startup. My review includes one suggestion to refactor the code to avoid duplication of the cv2 import logic, which will improve the code's maintainability.
    try:
        import cv2
    except ImportError as err:
        raise ImportError(
            "The package `opencv-python` (cv2) is required to use "
            "NemotronParse model. Please install it with `pip install "
            "opencv-python`. Note that OpenCV may also require system-level "
            "dependencies such as libxcb.so.1 on Linux systems."
        ) from err
This try-except block for importing cv2 is a duplicate of the one in _ensure_transforms_initialized. This code duplication can lead to maintenance issues where one block is updated but the other is not.
To avoid this, you can cache the imported cv2 module as an instance attribute. Here's a suggested refactoring:
- In NemotronParseImageProcessor.__init__, add self._cv2 = None.
- In _ensure_transforms_initialized, modify the cv2 import block to cache the module:
      # In _ensure_transforms_initialized()
      try:
          import cv2
          self._cv2 = cv2
      except ImportError as err:
          # ... error handling
      # ... later in the same method, use self._cv2
      self.transform = A.Compose(
          [
              A.PadIfNeeded(
                  # ...
                  border_mode=self._cv2.BORDER_CONSTANT,
                  # ...
              ),
          ]
      )
- Then, in this method (_resize_with_aspect_ratio), you can remove this duplicated block and just use the cached module:
      # In _resize_with_aspect_ratio()
      # The _ensure_transforms_initialized method is called in preprocess()
      # before this method, so self._cv2 should be available.
      cv2 = self._cv2
      # ...
      return cv2.resize(...)
This approach centralizes the import logic and makes the code cleaner and easier to maintain.
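The caching idea in this suggestion could look roughly like the sketch below. It is a minimal illustration, not the code in this PR: LazyImageProcessor, _ensure_backend, and the backend_name parameter are hypothetical names introduced here, and the backend is imported by name only so the pattern can be tried without OpenCV installed.

```python
import importlib


class LazyImageProcessor:
    """Hypothetical stand-in for NemotronParseImageProcessor (illustration only)."""

    def __init__(self, backend_name: str = "cv2") -> None:
        # Nothing is imported here, so construction works without OpenCV.
        self._backend_name = backend_name
        self._cv2 = None  # filled in on first use

    def _ensure_backend(self):
        """Import the backend once and cache the module on the instance."""
        if self._cv2 is None:
            try:
                self._cv2 = importlib.import_module(self._backend_name)
            except ImportError as err:
                raise ImportError(
                    f"The module `{self._backend_name}` is required for image "
                    "preprocessing but could not be imported."
                ) from err
        return self._cv2

    def resize(self, image, size):
        # Every method that needs OpenCV goes through the same helper,
        # so the try/except lives in exactly one place.
        cv2 = self._ensure_backend()
        return cv2.resize(image, size)


# Construction stays cheap: cv2 has not been imported at this point.
processor = LazyImageProcessor()
```

Routing every use through one helper also avoids relying on preprocess() having been called first, which the cached-attribute version in the suggestion assumes.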
Hi @Gregory-Pereira, the pre-commit checks have failed. Please run:
    uv pip install pre-commit
    pre-commit install
    pre-commit run --all-files
Then, commit the changes and push to your branch. For future commits, …
Signed-off-by: greg pereira <grpereir@redhat.com>
Apologies, dupe of #33189
Fix: Make OpenCV optional dependency for Nemotron Parse
Problem
The Nemotron Parse model imports cv2 (OpenCV) during processor initialization, which happens during vLLM's startup. This forces all vLLM users to have OpenCV and its system-level dependencies (e.g., libxcb.so.1 on Linux) installed, even when running text-only or non-Nemotron workloads.
In minimal container environments without X11 libraries, this causes import-time failures when starting vLLM.
Solution
This PR makes OpenCV a truly optional dependency by implementing lazy loading:
• NemotronParseImageProcessor.__init__() now defers transform creation by setting attributes to None
• _create_transforms() becomes _ensure_transforms_initialized(), with a lazy cv2 import inside try-except blocks
• preprocess() now calls _ensure_transforms_initialized(), importing cv2 only when actually processing images
Impact
Before this change, cv2 was imported whenever NemotronParseImageProcessor was instantiated. Since vLLM loads model processors during startup, this meant every vLLM deployment needed OpenCV and its system dependencies (like libxcb on Linux), even if you were just running text models.
After this change, cv2 is only imported when you actually call preprocess() to process images. This means:
Verification
I ran a test to show when cv2 gets imported. The test analyzes the NemotronParseImageProcessor code to see if cv2 is imported during processor creation or deferred until actual use.
BEFORE (main branch)
AFTER (this PR)
Impact:
• vLLM can start in minimal containers without OpenCV
• Text-only workloads don't need X11 libraries
• Nemotron Parse still works when OpenCV is installed
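The verification test itself is not included here. As a rough sketch of what a static check of this kind could look like, the snippet below uses Python's ast module to report which function (if any) encloses each cv2 import. The helper name import_sites and the two toy class sources are assumptions made for illustration; they are not the author's test or the real vLLM code.

```python
import ast


def import_sites(source: str, module: str = "cv2") -> list[str]:
    """Return the scope of each import of `module`.

    "<module>" means the import runs when the file is imported;
    otherwise the entry is the name of the enclosing function.
    """
    sites: list[str] = []

    def visit(node: ast.AST, scope: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.Import):
                names = [alias.name for alias in child.names]
            elif isinstance(child, ast.ImportFrom):
                names = [child.module or ""]
            else:
                names = []
            if any(n == module or n.startswith(module + ".") for n in names):
                sites.append(scope)
            # Functions open a new scope; classes, try blocks, etc. do not.
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                visit(child, child.name)
            else:
                visit(child, scope)

    visit(ast.parse(source), "<module>")
    return sites


# Toy "before": cv2 is imported as soon as the processor is constructed.
EAGER = '''
class Processor:
    def __init__(self):
        import cv2
        self.border_mode = cv2.BORDER_CONSTANT
'''

# Toy "after": construction only sets placeholders; cv2 is imported on demand.
LAZY = '''
class Processor:
    def __init__(self):
        self.transform = None

    def _ensure_transforms_initialized(self):
        try:
            import cv2
        except ImportError as err:
            raise ImportError("opencv-python is required") from err
'''

print(import_sites(EAGER))  # ['__init__']
print(import_sites(LAZY))   # ['_ensure_transforms_initialized']
```

A static scan like this only shows where the import statement sits. Confirming that nothing else pulls in cv2 at startup would still need a runtime check in an environment without OpenCV.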
Credit to @maugustosilva for finding this. He hit it on our llm-d v0.5 release image, which used vLLM v0.14.1 with LMCache 0.3.13 and produced the following error: