feat: add security improvements and reliability enhancements by ersintarhan · Pull Request #4 · waybarrios/vllm-mlx

ersintarhan · 2026-01-15T16:10:03Z

Summary

This PR adds several security and reliability improvements:

Security Fixes

Timing attack prevention with secrets.compare_digest()
Rate limiting (--rate-limit flag)
Request timeout (--timeout flag)
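A minimal sketch of the constant-time comparison idea behind the first item, not the PR's actual code. The helper name `keys_match` is hypothetical; only `secrets.compare_digest()` comes from the description above.

```python
import secrets


def keys_match(provided: str, expected: str) -> bool:
    """Compare two API keys without leaking how many leading characters match.

    `keys_match` is a hypothetical helper name used for illustration only.
    """
    # Encode to bytes first: compare_digest() only accepts str arguments
    # when they are ASCII-only, while bytes are always accepted.
    return secrets.compare_digest(
        provided.encode("utf-8"),
        expected.encode("utf-8"),
    )
```

A plain `provided == expected` can return as soon as the first differing character is found, which is the timing signal this comparison is meant to reduce.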

Reliability

TempFileManager for auto-cleanup of temp files
Thread-safe _waiting_consumers counter
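To illustrate the auto-cleanup idea, here is a rough sketch of a temp file registry with an `atexit` hook. The class name `SketchTempFileManager`, its method signatures, and the `make_tracked_temp` helper are assumptions for illustration; the PR's real `TempFileManager` may be structured differently.

```python
import atexit
import os
import tempfile
import threading


class SketchTempFileManager:
    """Hypothetical registry that deletes tracked temp files on demand or at exit."""

    def __init__(self) -> None:
        self._paths = set()
        self._lock = threading.Lock()
        # Run cleanup_all() when the interpreter shuts down normally.
        atexit.register(self.cleanup_all)

    def register(self, path: str) -> str:
        """Start tracking a path and return it unchanged."""
        with self._lock:
            self._paths.add(path)
        return path

    def cleanup(self, path: str) -> bool:
        """Stop tracking a path and delete it; return False if it was already gone."""
        with self._lock:
            self._paths.discard(path)
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            return False

    def cleanup_all(self) -> int:
        """Delete every tracked file and return how many were actually removed."""
        with self._lock:
            paths = list(self._paths)
            self._paths.clear()
        return sum(1 for path in paths if self.cleanup(path))


def make_tracked_temp(manager: SketchTempFileManager, suffix: str = ".bin") -> str:
    """Create an empty temp file and register it with the manager."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    return manager.register(path)
```

Note that `atexit` handlers do not run if the process is killed by an unhandled signal or exits via `os._exit()`, so cleanup in this sketch is best-effort.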

API Changes

timeout parameter in requests
--timeout and --rate-limit CLI args

Security fixes:
- Fix timing attack vulnerability in API key verification using secrets.compare_digest()
- Add rate limiting support with sliding window algorithm (--rate-limit flag)
- Add request timeout support to prevent resource exhaustion (--timeout flag)

Reliability improvements:
- Add TempFileManager for automatic cleanup of temporary files (images/videos)
- Register temp files with atexit handler for guaranteed cleanup
- Fix race condition in RequestOutputCollector with thread-safe locking

API changes:
- Add `timeout` parameter to ChatCompletionRequest and CompletionRequest
- Add --timeout and --rate-limit CLI arguments
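The sliding window algorithm mentioned in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the PR's `RateLimiter`: the class name `SlidingWindowLimiter`, the `check()` signature, and the 60-second default window are all hypothetical.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Optional, Tuple


class SlidingWindowLimiter:
    """Hypothetical per-client limiter: at most max_requests per window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests
        self._lock = threading.Lock()

    def check(self, client_id: str, now: Optional[float] = None) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for one incoming request."""
        if now is None:
            now = time.monotonic()
        with self._lock:
            hits = self._hits[client_id]
            # Slide the window: drop timestamps that are too old to count.
            while hits and now - hits[0] >= self.window_seconds:
                hits.popleft()
            if len(hits) >= self.max_requests:
                # The oldest accepted request leaves the window after this long.
                return False, self.window_seconds - (now - hits[0])
            hits.append(now)
            return True, 0.0
```

The second element of the returned tuple is the kind of value a server could place in a `Retry-After` header when rejecting a request.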

The asyncio.wait_for() timeout was not working because the underlying model.generate() and model.chat() calls are synchronous and block the event loop. This change wraps them in asyncio.to_thread() so that the timeout can properly interrupt long-running generation requests. Also adds mise.toml for Python version management (3.12).
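The pattern described in this commit can be sketched as below. `blocking_generate` is a hypothetical stand-in for the synchronous `model.generate()` call, and `generate_with_timeout` is an illustrative name, not a function from the PR.

```python
import asyncio
import time


def blocking_generate(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous, blocking generation call."""
    time.sleep(seconds)
    return "done"


async def generate_with_timeout(seconds: float, timeout: float) -> str:
    """Run the blocking call in a worker thread so wait_for() can time out."""
    try:
        # to_thread() keeps the event loop free; without it, the blocking
        # call would stall the loop and the timeout could never fire.
        return await asyncio.wait_for(
            asyncio.to_thread(blocking_generate, seconds),
            timeout,
        )
    except asyncio.TimeoutError:
        return "timed out"
```

One caveat with this approach: when the timeout fires, the awaiting coroutine is released, but the worker thread itself is not stopped and keeps running until the blocking call returns.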

waybarrios · 2026-01-16T00:12:18Z

Test Results

All features have been verified and tested:

| Feature | Status |
| --- | --- |
| Timing attack prevention (`secrets.compare_digest`) | ✅ Verified |
| Rate limiting (`RateLimiter` class) | ✅ Verified |
| Request timeout (`--timeout` flag) | ✅ Verified |
| TempFileManager (auto-cleanup) | ✅ Verified |
| Thread-safe `_waiting_consumers` | ✅ Verified |

Unit Tests Added

13 new tests covering:

TestRateLimiter: disabled mode, limit enforcement, per-client tracking, thread safety
TestTempFileManager: register/cleanup, cleanup_all, nonexistent files, thread safety
TestRequestOutputCollectorThreadSafety: counter manipulation, has_waiting_consumers
TestRequestTimeoutField: ChatCompletionRequest and CompletionRequest timeout fields
TestAPIKeyVerification: secrets.compare_digest usage
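As an illustration of what a thread-safety check like the ones listed above might exercise, here is a small sketch of a lock-protected counter hammered from several threads. The `WaitingConsumers` class and the `hammer` helper are hypothetical and are not taken from `RequestOutputCollector`.

```python
import threading


class WaitingConsumers:
    """Hypothetical lock-protected counter, loosely modelled on _waiting_consumers."""

    def __init__(self) -> None:
        self._count = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:
            self._count += 1

    def decrement(self) -> None:
        with self._lock:
            self._count -= 1

    def has_waiting(self) -> bool:
        with self._lock:
            return self._count > 0


def hammer(counter: WaitingConsumers, threads: int, rounds: int) -> None:
    """Have each thread increment `rounds` times, then decrement all but one."""

    def work() -> None:
        for _ in range(rounds):
            counter.increment()
        for _ in range(rounds - 1):
            counter.decrement()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

After `hammer(counter, 8, 1000)` the counter should hold exactly 8; a lost update from an unsynchronized read-modify-write would show up as a different value.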

Test Run Output

30 passed, 3 deselected in 1.88s

Ready for merge.

waybarrios · 2026-01-16T00:13:38Z

I've written these unit tests so far: 8131452. Let me know if I'm missing anything, @ersintarhan. By the way, great job! This is useful!

ersintarhan · 2026-01-16T00:37:56Z

Great tests! I've added a few more to improve coverage:

test_verify_api_key_rejects_invalid - 401 response for invalid keys
test_verify_api_key_accepts_valid - Valid key acceptance
test_rate_limiter_returns_retry_after - Retry-After header when limit exceeded
test_rate_limiter_window_cleanup - Sliding window cleanup behavior

All 34 tests passing now. LGTM!

waybarrios

Great PR, approved.

…ching (#4) Gemma 3's model __call__() requires pixel_values as a positional argument, unlike Qwen2-VL which makes it optional. This caused "missing required positional argument: 'pixel_values'" errors when using continuous batching with text-only requests. The MLLMModelWrapper now injects pixel_values=None for text-only requests, enabling Gemma 3 to work with continuous batching and prefix caching. Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

…barrios#4) Wrap json.dumps() in build_json_system_prompt() and parse_json_output() calls with try/except to return HTTP 400 instead of crashing the server when clients send invalid JSON schemas in response_format. Co-authored-by: Raullen <raullenstudio@raullenacstudio.lan> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Refactor streaming to use tested granular event builders instead of inline dict construction, fixing the gap where tested code wasn't production code (waybarrios#13). Fix text omission in completed events (waybarrios#6), add [DONE] sentinel (waybarrios#8), use typed output models to prevent cross-type field leakage (waybarrios#4, waybarrios#5), fix content join separator (waybarrios#10), remove dead code branches (waybarrios#9, waybarrios#11), and warn on unrecognized content types (waybarrios#7). Add Codex CLI setup guide. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Critical:
- Wire promote_from_ssd into _schedule_waiting() via _try_promote_ssd_pending() so SSD fetch path is actually functional (was defined but never called)
- Add shield-and-await-on-cancel pattern to async_promote() per Golden Rule waybarrios#4 to prevent RAM budget leaks on task cancellation
- Add threading.Lock to SSDIndex for all public methods (writer thread and main thread were sharing connection without synchronization)

Important:
- Fix check_ssd() called twice in scheduler fetch path (wasted SQLite query)
- Wire close_ssd_tier() into scheduler reset() for clean shutdown
- Make reserve_budget() actually reserve (increment _current_memory) instead of just checking, to prevent budget overcommit during concurrent promotions

ersintarhan added 2 commits January 15, 2026 19:06

add unit tests for security and reliability features

86c6bf8

waybarrios approved these changes Jan 16, 2026


waybarrios merged commit 03c60b4 into waybarrios:main Jan 16, 2026

janhilgard mentioned this pull request Mar 21, 2026

Security hardening: fix auth bypass, SSRF, MCP vulnerabilities (issue #68) #70

Closed

7 tasks

Thump604 mentioned this pull request Apr 14, 2026

Security audit: authentication bypass, SSRF, and other vulnerabilities #68

Open

Thump604 mentioned this pull request Apr 14, 2026

security: path traversal via local filesystem paths in multimodal input #320

Closed

waybarrios merged 3 commits into waybarrios:main from ersintarhan:feat/security-and-reliability

2 participants
