@tonyluj tonyluj commented Oct 31, 2025

Motivation

This PR implements WebAssembly-based extensibility for sgl-router, enabling dynamic, safe, and portable middleware execution without requiring router restarts or recompilation. This addresses the feature request in #10902.

Modifications

Core Infrastructure

  • WASM Runtime Integration: Integrated wasmtime as the primary runtime, with async support and the WebAssembly Component Model
  • WasmModuleManager: Module lifecycle management (add/remove/list) with SHA256 deduplication
  • WasmRuntime: Execution engine with thread pool for isolated WASM component execution
  • WIT Interface: Type-safe communication using WebAssembly Interface Types (wit/spec.wit)
  • HTTP API Endpoints: RESTful API at /wasm for dynamic module management:
    • POST /wasm - Deploy modules
    • GET /wasm - List all modules and metrics
    • DELETE /wasm/{uuid} - Remove a module
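
The add/remove/list lifecycle with content-based deduplication can be sketched as follows. This is a minimal illustration, not the PR's WasmModuleManager: ModuleRegistry, its integer ids, and the choice to hand back the existing id for a duplicate are all assumptions. The PR deduplicates with SHA256 and identifies modules by UUID; the sketch swaps in the standard library's DefaultHasher and a counter only so that it needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical registry used for illustration only.
#[derive(Default)]
pub struct ModuleRegistry {
    next_id: u64,
    /// content digest -> module id, used to detect duplicate uploads
    by_digest: HashMap<u64, u64>,
    /// module id -> (name, content digest)
    modules: HashMap<u64, (String, u64)>,
}

/// Stand-in digest. The PR uses SHA256; DefaultHasher is NOT collision
/// resistant and is used here only to keep the sketch dependency-free.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

impl ModuleRegistry {
    /// Returns (id, true) for a new module, or (existing id, false) when
    /// identical bytes were already registered.
    pub fn add(&mut self, name: &str, bytes: &[u8]) -> (u64, bool) {
        let d = digest(bytes);
        if let Some(&id) = self.by_digest.get(&d) {
            return (id, false);
        }
        let id = self.next_id;
        self.next_id += 1;
        self.by_digest.insert(d, id);
        self.modules.insert(id, (name.to_string(), d));
        (id, true)
    }

    /// Removes a module and its digest entry; returns false if the id is unknown.
    pub fn remove(&mut self, id: u64) -> bool {
        match self.modules.remove(&id) {
            Some((_, d)) => {
                self.by_digest.remove(&d);
                true
            }
            None => false,
        }
    }

    /// Lists (id, name) pairs sorted by id.
    pub fn list(&self) -> Vec<(u64, String)> {
        let mut out: Vec<(u64, String)> = self
            .modules
            .iter()
            .map(|(id, (name, _))| (*id, name.clone()))
            .collect();
        out.sort();
        out
    }
}
```

Keying the duplicate check on the bytes rather than the name means the same component uploaded under two names is stored once.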

Middleware Support

  • Attach Points: Support for OnRequest and OnResponse lifecycle hooks
  • Actions: Three action types:
    • Continue - Proceed to next middleware/upstream
    • Reject(status) - Return error response immediately
    • Modify - Modify headers, body, or status code
  • Integration: Seamlessly integrated into existing middleware chain
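
To make the three actions concrete, here is a rough sketch of how a host could apply an Action to a response. The Response, Changes, and Outcome types and the apply function are hypothetical stand-ins; the PR's real types come from wit/spec.wit and may be shaped differently.

```rust
/// Hypothetical HTTP response representation.
#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub status: u16,
    pub headers: Vec<(String, String)>,
    pub body: Vec<u8>,
}

/// Hypothetical payload for Modify: each field is optional.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Changes {
    pub status: Option<u16>,
    pub set_headers: Vec<(String, String)>,
    pub body: Option<Vec<u8>>,
}

/// The three action types named in the PR description.
pub enum Action {
    Continue,
    Reject(u16),
    Modify(Changes),
}

/// What the host does next (assumed naming).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Hand the (possibly modified) response to the next stage.
    Forward(Response),
    /// Stop the chain and return this response immediately.
    ShortCircuit(Response),
}

pub fn apply(action: Action, mut resp: Response) -> Outcome {
    match action {
        Action::Continue => Outcome::Forward(resp),
        Action::Reject(status) => Outcome::ShortCircuit(Response {
            status,
            headers: Vec::new(),
            body: Vec::new(),
        }),
        Action::Modify(changes) => {
            if let Some(status) = changes.status {
                resp.status = status;
            }
            for (name, value) in changes.set_headers {
                // Header names are case-insensitive: replace any existing entry.
                resp.headers.retain(|(n, _)| !n.eq_ignore_ascii_case(&name));
                resp.headers.push((name, value));
            }
            if let Some(body) = changes.body {
                resp.body = body;
            }
            Outcome::Forward(resp)
        }
    }
}
```

For instance, a module that converts 500 to 503 would return Modify with only the status field set, leaving headers and body untouched.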

Security & Resource Management

  • Sandboxing: WASM modules run in isolated environments via wasmtime
  • Resource Limits: Configurable limits for:
    • Memory (max_memory_pages)
    • Execution time (max_execution_time_ms)
    • Stack size (max_stack_size)
  • Validation: WASM component validation at load time
  • Error Handling: Graceful degradation - failed executions don't crash the router
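
A configuration struct holding these limits might look like the sketch below. Only the three field names come from the description above; the field types, the helper methods, and the validation rule are assumptions. The one fixed fact is that a WebAssembly linear-memory page is 64 KiB, which is what turns max_memory_pages into a byte budget.

```rust
use std::time::Duration;

/// Size of one WebAssembly linear-memory page (64 KiB).
pub const WASM_PAGE_SIZE: u64 = 64 * 1024;

/// Hypothetical limits struct; field names follow the PR description,
/// field types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct ResourceLimits {
    pub max_memory_pages: u32,
    pub max_execution_time_ms: u64,
    pub max_stack_size: usize,
}

impl ResourceLimits {
    /// Memory budget in bytes implied by the page limit.
    pub fn max_memory_bytes(&self) -> u64 {
        u64::from(self.max_memory_pages) * WASM_PAGE_SIZE
    }

    /// Execution time limit as a Duration.
    pub fn execution_timeout(&self) -> Duration {
        Duration::from_millis(self.max_execution_time_ms)
    }

    /// Rejects configurations in which any limit is zero (assumed rule).
    pub fn validate(&self) -> Result<(), String> {
        if self.max_memory_pages == 0 {
            return Err("max_memory_pages must be greater than zero".to_string());
        }
        if self.max_execution_time_ms == 0 {
            return Err("max_execution_time_ms must be greater than zero".to_string());
        }
        if self.max_stack_size == 0 {
            return Err("max_stack_size must be greater than zero".to_string());
        }
        Ok(())
    }
}
```

With max_memory_pages set to 1024, for example, a module may grow its memory to 64 MiB.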

Configuration

  • Added enable_wasm flag to RouterConfig
  • Added RouterConfigBuilder::enable_wasm(bool) method
  • WASM manager initialized conditionally based on config
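
The flag and the conditional initialization can be pictured roughly as below. RouterConfig and RouterConfigBuilder are reduced to the single field relevant here (the real types carry many more settings), and WasmManager, build, and init_wasm_manager are placeholders assumed for the sketch.

```rust
/// Simplified stand-in for the router configuration.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct RouterConfig {
    pub enable_wasm: bool,
}

/// Simplified stand-in for the builder; WASM support is off by default here.
#[derive(Default)]
pub struct RouterConfigBuilder {
    config: RouterConfig,
}

impl RouterConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Mirrors the method named in the PR; the by-value builder style is assumed.
    pub fn enable_wasm(mut self, enabled: bool) -> Self {
        self.config.enable_wasm = enabled;
        self
    }

    pub fn build(self) -> RouterConfig {
        self.config
    }
}

/// Placeholder for the PR's WASM manager.
pub struct WasmManager;

/// Creates the manager only when the flag is set, so a router started
/// without WASM support pays no runtime cost for it.
pub fn init_wasm_manager(config: &RouterConfig) -> Option<WasmManager> {
    if config.enable_wasm {
        Some(WasmManager)
    } else {
        None
    }
}
```

Holding the manager as an Option lets the middleware layer skip the WASM path entirely when it is None.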

Examples

Three complete example implementations:

  1. wasm-guest-auth: API key authentication middleware

    • Validates API keys from Authorization header or x-api-key
    • Returns 401 Unauthorized for invalid/missing keys
  2. wasm-guest-logging: Request tracking and status conversion

    • Adds tracking headers (x-request-id, x-wasm-processed, etc.)
    • Converts 500 errors to 503
  3. wasm-guest-ratelimit: Rate limiting middleware

    • Configurable per-identifier rate limits
    • Returns 429 Too Many Requests when exceeded
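
As an illustration of the first example, the core check of an API key middleware could be written as below. This is not the code in wasm-guest-auth: the function names, the header representation, and the valid_keys parameter are assumptions, and the /api and /v1 path prefixes are taken from the example's README as quoted in the review thread.

```rust
/// Case-insensitive header lookup over (name, value) pairs.
fn header<'a>(headers: &'a [(String, String)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(n, _)| n.eq_ignore_ascii_case(name))
        .map(|(_, v)| v.as_str())
}

/// Reads the key from `Authorization: Bearer <key>`, falling back to `x-api-key`.
pub fn extract_api_key(headers: &[(String, String)]) -> Option<&str> {
    if let Some(value) = header(headers, "authorization") {
        if let Some(key) = value.strip_prefix("Bearer ") {
            return Some(key.trim());
        }
    }
    header(headers, "x-api-key").map(str::trim)
}

/// Returns Ok(()) to continue, or Err(401) to reject the request.
/// Paths outside /api and /v1 are not protected in this sketch.
pub fn check(path: &str, headers: &[(String, String)], valid_keys: &[&str]) -> Result<(), u16> {
    let protected = path.starts_with("/api") || path.starts_with("/v1");
    if !protected {
        return Ok(());
    }
    match extract_api_key(headers) {
        Some(key) if valid_keys.contains(&key) => Ok(()),
        _ => Err(401),
    }
}
```

In a guest component, Ok(()) would map to the Continue action and Err(401) to Reject(401).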

Metrics

  • Execution metrics exposed via /wasm endpoint:
    • Total executions
    • Successful/failed executions
    • Execution time statistics
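
One plausible way to collect these counters without locking is a set of atomics updated after each execution. The sketch below assumes average and maximum execution time are the statistics of interest; ExecMetrics, Snapshot, and their fields are illustrative names, not the PR's types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical lock-free execution counters.
#[derive(Default)]
pub struct ExecMetrics {
    total: AtomicU64,
    succeeded: AtomicU64,
    failed: AtomicU64,
    total_micros: AtomicU64,
    max_micros: AtomicU64,
}

/// Point-in-time copy suitable for serializing into an API response.
#[derive(Debug, PartialEq)]
pub struct Snapshot {
    pub total: u64,
    pub succeeded: u64,
    pub failed: u64,
    pub avg_micros: u64,
    pub max_micros: u64,
}

impl ExecMetrics {
    /// Records one execution and whether it succeeded.
    pub fn record(&self, elapsed: Duration, ok: bool) {
        let micros = elapsed.as_micros() as u64;
        self.total.fetch_add(1, Ordering::Relaxed);
        if ok {
            self.succeeded.fetch_add(1, Ordering::Relaxed);
        } else {
            self.failed.fetch_add(1, Ordering::Relaxed);
        }
        self.total_micros.fetch_add(micros, Ordering::Relaxed);
        self.max_micros.fetch_max(micros, Ordering::Relaxed);
    }

    /// Reads all counters; the average is 0 when nothing has run yet.
    pub fn snapshot(&self) -> Snapshot {
        let total = self.total.load(Ordering::Relaxed);
        let sum = self.total_micros.load(Ordering::Relaxed);
        Snapshot {
            total,
            succeeded: self.succeeded.load(Ordering::Relaxed),
            failed: self.failed.load(Ordering::Relaxed),
            avg_micros: if total == 0 { 0 } else { sum / total },
            max_micros: self.max_micros.load(Ordering::Relaxed),
        }
    }
}
```

Relaxed ordering is enough here because each counter is independent; a snapshot taken under concurrent updates may be slightly inconsistent across fields, which is usually acceptable for monitoring.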

Implementation Details

Architecture

src/wasm/
├── module.rs           # Data structures (metadata, types, attach points)
├── module_manager.rs   # Module lifecycle management
├── runtime.rs          # WASM execution engine and thread pool
├── route.rs            # HTTP API endpoints
├── spec.rs             # WIT bindings and type conversions
├── types.rs            # Generic input/output types
├── errors.rs           # Error definitions
├── config.rs           # Runtime configuration
└── wit/
    └── spec.wit         # WebAssembly Interface Types definitions

WIT Interface

Uses WebAssembly Component Model with WIT for type-safe communication:

  • middleware-on-request::on-request(req: Request) -> Action
  • middleware-on-response::on-response(resp: Response) -> Action

Execution Flow

  1. HTTP request arrives at router
  2. Middleware chain checks for WASM modules attached to OnRequest
  3. For each module:
    • Module manager retrieves pre-loaded WASM bytes
    • Runtime executes component in isolated worker thread
    • Component processes request via WIT interface
    • Returns Action (Continue/Reject/Modify)
  4. Apply action and proceed/reject/modify as needed
  5. After upstream response: modules attached to OnResponse process response
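
Steps 2 to 4 of the flow above can be sketched as a loop over the modules attached to OnRequest. Everything here is a simplified stand-in: modules are modeled as boxed closures rather than WASM components, Modify only adds headers, and skipping a module whose execution failed is one assumed reading of "graceful degradation" (a security-sensitive module such as authentication may warrant rejecting the request instead).

```rust
/// Hypothetical request representation.
pub struct Request {
    pub path: String,
    pub headers: Vec<(String, String)>,
}

/// Simplified action: Modify carries headers to append.
pub enum Action {
    Continue,
    Reject(u16),
    Modify(Vec<(String, String)>),
}

/// Stand-in for executing a WASM component; Err models a trap or timeout.
pub type Hook = Box<dyn Fn(&Request) -> Result<Action, String>>;

/// Runs each hook in order. Returns the (possibly modified) request to
/// forward upstream, or the status code of the first rejection.
pub fn run_on_request(hooks: &[Hook], mut req: Request) -> Result<Request, u16> {
    for hook in hooks {
        match hook(&req) {
            Ok(Action::Continue) => {}
            Ok(Action::Reject(status)) => return Err(status),
            Ok(Action::Modify(headers)) => req.headers.extend(headers),
            // Assumed fail-open behavior: a failing module is skipped and
            // the router keeps serving the request.
            Err(_reason) => {}
        }
    }
    Ok(req)
}
```

A Reject from any module short-circuits the chain, so later modules and the upstream worker never see the request.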

Usage

Enable WASM Support

./sgl-router --enable-wasm --worker-urls=http://0.0.0.0:30000 --port=3000

Deploy a Module

curl -X POST http://localhost:3000/wasm \
  -H "Content-Type: application/json" \
  -d '{
    "modules": [{
      "name": "my-middleware",
      "file_path": "/path/to/my-component.component.wasm",
      "module_type": "Middleware",
      "attach_points": [{"Middleware": "OnRequest"}]
    }]
  }'

List Modules

curl http://localhost:3000/wasm

Testing

  • All existing tests pass
  • Added integration tests for WASM module management
  • Examples tested with real WASM components

@gemini-code-assist

Summary of Changes

Hello @tonyluj, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the sgl-router by introducing WebAssembly (WASM) support for custom middleware. This new capability allows users to extend the router's functionality with custom logic that can be deployed and managed dynamically, without needing to recompile or restart the router. The integration provides a secure, sandboxed environment for these modules, ensuring stability and performance while offering flexible control over request and response processing.

Highlights

  • WASM Middleware Support: Introduced WebAssembly (WASM) based extensibility for the sgl-router, enabling dynamic, safe, and portable middleware execution without requiring router restarts or recompilation.
  • WASM Runtime Integration: Integrated wasmtime as the primary runtime, featuring asynchronous support and leveraging the WebAssembly Component Model for isolated and efficient execution.
  • Dynamic Module Management API: Implemented RESTful HTTP API endpoints (/wasm) for dynamic management of WASM modules, including deployment (POST /wasm), listing (GET /wasm), and removal (DELETE /wasm/{uuid}).
  • Middleware Lifecycle Hooks: Added support for OnRequest and OnResponse lifecycle hooks, allowing WASM modules to intercept and modify HTTP requests and responses with actions like Continue, Reject, or Modify.
  • Security and Resource Management: Ensured WASM modules run in sandboxed environments with configurable resource limits for memory, execution time, and stack size, preventing router crashes from module failures.
  • Configuration and Examples: Added an enable_wasm flag to RouterConfig and provided three comprehensive example WASM middlewares: API key authentication, request logging, and rate limiting.

@tonyluj tonyluj force-pushed the tonyluj/sgl-router-wasm-latest branch from 2911de1 to f06aade Compare October 31, 2025 16:49

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a comprehensive and well-structured pull request that adds significant new functionality with WASM-based middleware. The implementation covers runtime integration, a management API, security sandboxing, and helpful examples. My feedback focuses on improving robustness by handling potential panics during startup, enhancing maintainability by suggesting refactoring to reduce code duplication, and promoting safer coding practices in the example modules by avoiding unsafe code. I have also identified some potentially unused code that could be removed.


This middleware validates API keys for requests to `/api` and `/v1` paths:

- Supports `Authorization: Bearer <key>` header

Collaborator

API key authentication is already supported.
Perhaps we can leave the examples for a different PR?
For example, write an example using other languages.

types::{WasmComponentInput, WasmComponentOutput},
};

pub struct WasmModuleManager {

Collaborator

Should we use a workflow for this?
I had a similar manager for workers,
and it just kept on growing; eventually I had to write a simple generic workflow.
Perhaps WASM can leverage the same workflow.
