[Misc] Replace urllib's urlparse with urllib3's parse_url#32746
[Misc] Replace urllib's urlparse with urllib3's parse_url#32746DarkLight1337 merged 4 commits intovllm-project:mainfrom
urlparse with urllib3's parse_url#32746Conversation
There was a problem hiding this comment.
Code Review
This pull request correctly replaces urllib.parse.urlparse with urllib3.util.parse_url across the codebase. The changes account for the differences in the return values, particularly how None is handled by urllib3. My review includes one suggestion to improve error handling for malformed data URLs to provide clearer error messages.
| url_spec_path = url_spec.path or "" | ||
| data_spec, data = url_spec_path.split(",", 1) |
There was a problem hiding this comment.
While adding or "" correctly prevents an AttributeError when url_spec.path is None, it can lead to a generic and less informative ValueError if an invalid data URL is provided (e.g., one missing the data part after the comma). It would be more robust to explicitly check for a valid data URL path and raise an error with a more specific message, including the problematic URL.
| url_spec_path = url_spec.path or "" | |
| data_spec, data = url_spec_path.split(",", 1) | |
| if not url_spec.path or "," not in url_spec.path: | |
| raise ValueError(f"Invalid data URL, cannot find data part: {url_spec.unparse()}") | |
| data_spec, data = url_spec.path.split(",", 1) |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
Comment @cursor review or bugbot run to trigger another review on this PR
|
|
||
| if url_spec.scheme.startswith("http"): | ||
| if url_spec.scheme and url_spec.scheme.startswith("http"): |
There was a problem hiding this comment.
Data URLs fail due to urllib3 parse_url scheme detection
High Severity
The switch from urllib.parse.urlparse to urllib3.util.parse_url breaks data URL handling. Data URLs use the data:content format without ://, but parse_url is designed for HTTP URLs and expects scheme://authority/path format. For a data URL like data:image/png;base64,abc, parse_url returns None for the scheme, causing the check if url_spec.scheme == "data" to fail. This results in valid data URLs being rejected with "The URL must be either a HTTP, data or file URL" error, breaking multimodal image/audio/video inputs that use base64 data URLs.
Additional Locations (1)
|
|
||
| parsed = urlparse(port) | ||
| parsed = parse_url(port) |
There was a problem hiding this comment.
Missing exception handling for LocationParseError in port validation
Medium Severity
The get_vllm_port function catches ValueError from int(port) and then calls parse_url(port) to detect URI-like values and provide helpful error messages. However, urllib3.util.parse_url can raise LocationParseError for certain malformed inputs (e.g., invalid IPv6 addresses, overly long labels), while the original urlparse never raised exceptions. This uncaught exception would bypass the helpful error messages and cause an unexpected crash during application startup when VLLM_PORT is misconfigured with certain malformed values.
|
I think the failing test is related |
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
c32ec4a to
2d7e335
Compare
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> (cherry picked from commit 8ebf271)
…roject#32746) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: mohammad najafi <mohammad.najafi@amd.com>
…roject#32746) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: 陈建华 <1647430658@qq.com>
…roject#32746) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
* Replace urllib's `urlparse` with urllib3's `parse_url` (vllm-project#32746) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> (cherry picked from commit 8ebf271) * Bump opencv-python dependecy version to 4.13 (vllm-project#32668) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> (cherry picked from commit 444e2e7) * Fix Whisper/encoder-decoder GPU memory leak (vllm-project#32789) Signed-off-by: NickLucche <nlucches@redhat.com> (cherry picked from commit ea6102b) --------- Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: NickLucche <nlucches@redhat.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
…roject#32746) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
- [build] fix cu130 related release pipeline steps and publish as nightly image (vllm-project#32522) - [Misc] Replace urllib's `urlparse` with urllib3's `parse_url` (vllm-project#32746) - [Misc] Bump opencv-python dependency version to 4.13 (vllm-project#32668) - [Bugfix] Fix Whisper/encoder-decoder GPU memory leak (vllm-project#32789) - [CI] fix version comparsion and exclusion patterns in upload-release-wheels.sh (vllm-project#32971) - tokenizers: mistral: fix merge conflict - `Dockerfile.tpu.ubi`: add `git` to allow `pip install git+https`
Purpose
urlparsewith urllib3'sparse_urlfor safer and more standard url parsing.Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.