Add option to restrict media domains#25783
Conversation
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
This pull request has merge conflicts that must be resolved before it can be |
There was a problem hiding this comment.
Code Review
This pull request introduces a security feature to restrict the domains from which multimodal media can be fetched, aiming to mitigate Server-Side Request Forgery (SSRF) vulnerabilities. The implementation adds an --allowed-media-domains argument and checks it before fetching media from HTTP(S) URLs. While this is a good step, I've identified a critical security flaw where HTTP redirects can bypass this check. I've also included a suggestion to improve the performance of the domain checking logic. The documentation and test changes look good, but the tests should also cover the redirect scenario.
| if url_spec.scheme.startswith("http"): | ||
| self._assert_url_in_allowed_media_domains(url_spec) |
There was a problem hiding this comment.
The current implementation checks the domain of the initial URL, but it does not prevent SSRF vulnerabilities that arise from HTTP redirects. If an allowed URL redirects to a URL on a disallowed domain (including internal network addresses), connection.get_bytes might still fetch it if it follows redirects by default, which is mentioned as the root cause in the security advisory GHSA-3f6c-7fw2-ppm4j.
To properly mitigate this, you should either disable redirects or verify the domain of the final URL after all redirects have been followed. Disabling redirects might be the safest option if they are not a required feature. If they are required, the HTTPConnection class should be modified to not follow redirects automatically, and instead, redirects should be handled manually within MediaConnector to ensure every URL in the redirect chain is validated against the allowed domains.
There was a problem hiding this comment.
@huachenheli @DarkLight1337 this seems like a good point. I wouldn't block merging the current change over this, but it seems worth a follow-up change.
There was a problem hiding this comment.
Yeah. This PR should be fine for immediate needs. We already have media_io_kwargs that we can use to control MediaConnector behavior, so we just need to pass that to the HttpConnection to disallow redirects.
| if allowed_media_domains is None: | ||
| allowed_media_domains = [] | ||
| self.allowed_media_domains = allowed_media_domains |
There was a problem hiding this comment.
For performance, it's better to convert allowed_media_domains to a set during initialization. Checking for an item's existence is O(1) on average for a set, while it's O(n) for a list. This can be significant if the list of allowed domains is large.
| if allowed_media_domains is None: | |
| allowed_media_domains = [] | |
| self.allowed_media_domains = allowed_media_domains | |
| if allowed_media_domains is None: | |
| self.allowed_media_domains = set() | |
| else: | |
| self.allowed_media_domains = set(allowed_media_domains) |
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Chenheli Hua <huachenheli@outlook.com> (cherry picked from commit 3958b96)
…dling Port security fixes from upstream PRs vllm-project#25783 and vllm-project#26035 to address SSRF vulnerability in vLLM's multimodal media handling. Security improvements: - Add domain allowlist via --allowed-media-domains CLI parameter - Add VLLM_MEDIA_URL_ALLOW_REDIRECTS environment variable (defaults to disabled) - Implement domain validation before fetching HTTP(S) URLs - Add redirect control to prevent bypassing domain restrictions The fix provides defense-in-depth protection against SSRF attacks while maintaining backward compatibility (empty allowlist permits all domains). Changes: - vllm/envs.py: Add VLLM_MEDIA_URL_ALLOW_REDIRECTS env var - vllm/connections.py: Add allow_redirects parameter to HTTP methods - vllm/multimodal/utils.py: Add domain validation logic - vllm/config/__init__.py: Add allowed_media_domains configuration field - vllm/engine/arg_utils.py: Add --allowed-media-domains CLI argument - vllm/entrypoints/*.py: Wire configuration through entry points - tests/multimodal/test_utils.py: Add security validation tests - test_security_fixes.py: Add standalone security test suite - CVE-2025-6242-IMPLEMENTATION-SUMMARY.md: Implementation documentation References: - CVE-2025-6242 - Upstream PR vllm-project#25783 (domain restriction) - Upstream PR vllm-project#26035 (redirect control)
https://github.com/vllm-project/vllm/security/advisories/GHSA-3f6c-7fw2-ppm4j