@lanej lanej commented Nov 8, 2025

Summary

Adds RFC 2387 multipart/related support for uploading files with metadata in a single request.

Fixes #1240

Background

Per RFC 2387, multipart/related provides a mechanism for representing compound objects that are aggregates of related MIME body parts. This is commonly used by APIs like Google Drive to upload a file and its metadata in a single HTTP request.

Key RFC 2387 Requirements

Part Ordering: Section 3.2 specifies that "If not present the 'root' is the first body part in the Multipart/Related entity." This implementation preserves the OpenAPI schema property order to ensure the root part (typically JSON metadata) appears first.

Content-Type: Each part should carry a Content-Type header that reflects its actual media type. Many APIs reject the generic application/octet-stream and require the real MIME type (e.g., text/csv, image/png).

Type Parameter: The type parameter of the multipart/related Content-Type header must specify the MIME type of the root part (Section 3.1).

Changes

Runtime Support (progenitor-client)

  • New MultipartRelatedBody trait for types that can be serialized as multipart/related
  • MultipartPart struct with dynamic content_type: String field
  • RFC 2387-compliant multipart/related body construction with proper boundaries
  • Dynamic type parameter derived from first part's content-type
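For reviewers less familiar with the wire format, here is a minimal sketch of the body layout these bullets describe. It is not the progenitor-client code: the MultipartPart fields mirror the example later in this description, while build_related_body, the caller-supplied boundary, and the angle-bracketed Content-ID form are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the runtime part type; the real struct may differ.
pub struct MultipartPart {
    pub content_type: String,
    pub content_id: &'static str,
    pub bytes: Vec<u8>,
}

/// Serialize parts into a multipart/related body using CRLF line endings.
/// Parts are written in the order given, so the first one is the root.
pub fn build_related_body(parts: &[MultipartPart], boundary: &str) -> Vec<u8> {
    let mut body = Vec::new();
    for part in parts {
        // Each part starts with a dash-boundary line followed by its headers.
        body.extend_from_slice(format!("--{boundary}\r\n").as_bytes());
        body.extend_from_slice(format!("Content-Type: {}\r\n", part.content_type).as_bytes());
        body.extend_from_slice(format!("Content-ID: <{}>\r\n", part.content_id).as_bytes());
        // An empty line separates the headers from the payload.
        body.extend_from_slice(b"\r\n");
        body.extend_from_slice(&part.bytes);
        body.extend_from_slice(b"\r\n");
    }
    // The closing delimiter carries two trailing dashes.
    body.extend_from_slice(format!("--{boundary}--\r\n").as_bytes());
    body
}
```

Passing a JSON metadata part followed by a CSV part yields two delimited sections and one closing delimiter. A real implementation also has to pick a boundary that cannot occur inside any payload.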

Code Generation (progenitor-impl)

  • Expanded builder API: multipart/related schemas generate individual field methods
  • Dynamic Content-Type: Binary field methods require MIME type as second parameter
  • Property Order Preservation: Parts are serialized in OpenAPI schema order (not alphabetical)
  • Optional Field Handling: Optional binary fields generate Option<String> for content_type and are excluded from the body when not provided
  • Automatic content_type field generation for binary parts

Related Improvements

  • Default parameter values: Optional parameters with OpenAPI default values now initialize correctly
  • Proper Vec<u8> handling for binary fields

API Example

Before (not possible):

// multipart/related was not supported

After:

client.upload_file()
    .file(csv_bytes, "text/csv")  // ← Content-type required at compile time
    .metadata(FileMetadata {
        name: "data.csv",
        mime_type: "text/csv"
    })
    .send()
    .await

Optional fields (automatically excluded when not provided):

// attachment_content_type is Option<String> - only required if attachment is provided
client.upload_multiple()
    .document(doc_bytes, "application/pdf")
    .thumbnail(thumb_bytes, "image/png")
    .metadata(metadata)
    // .attachment() - omit optional field
    .send()
    .await

Generated Code

Struct with content_type field:

pub struct UploadFileMultipartParts {
    pub file: Vec<u8>,
    pub file_content_type: String,  // ← Required field
    pub metadata: FileMetadata,
}

// Optional fields use Option<String>
pub struct UploadMultipleFilesMultipartParts {
    pub document: Vec<u8>,
    pub document_content_type: String,
    pub thumbnail: Vec<u8>,
    pub thumbnail_content_type: String,
    #[serde(default, skip_serializing_if = "Vec::is_empty")]
    pub attachment: Vec<u8>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub attachment_content_type: Option<String>,  // ← Optional!
    pub metadata: FileMetadata,
}

MultipartRelatedBody implementation respects schema order and skips optional fields:

impl MultipartRelatedBody for UploadMultipleFilesMultipartParts {
    fn as_multipart_parts(&self) -> Vec<MultipartPart> {
        vec![
            // metadata first (matches schema order)
            Some(MultipartPart {
                content_type: "application/json".to_string(),
                content_id: "metadata",
                bytes: ::serde_json::to_vec(&self.metadata)
                    .expect("failed to serialize field"),
            }),
            // Required fields always included
            Some(MultipartPart {
                content_type: self.document_content_type.clone(),
                content_id: "document",
                bytes: self.document.clone(),
            }),
            Some(MultipartPart {
                content_type: self.thumbnail_content_type.clone(),
                content_id: "thumbnail",
                bytes: self.thumbnail.clone(),
            }),
            // Optional field: only included if content_type is Some and bytes non-empty
            if let Some(ref content_type) = self.attachment_content_type {
                if !self.attachment.is_empty() {
                    Some(MultipartPart {
                        content_type: content_type.clone(),
                        content_id: "attachment",
                        bytes: self.attachment.clone(),
                    })
                } else {
                    None
                }
            } else {
                None
            },
        ]
        .into_iter()
        .flatten()  // Remove None values
        .collect()
    }
}

Design Decisions

Why require content_type at compile time?

Making content_type a required parameter (not defaulting to application/octet-stream) prevents silent bugs. APIs like Google Drive reject application/octet-stream with a 400 error, so forcing users to specify the MIME type makes the API safer.

Why preserve schema order?

RFC 2387 Section 3.2 specifies that the first part is the "root" by default. Many APIs expect metadata (JSON) before file content. Alphabetical sorting would reverse this (file < metadata), breaking these APIs.
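The ordering problem is easy to reproduce in isolation. The sketch below assumes properties end up in a sorted map such as BTreeMap (one way alphabetical order can arise; the actual data structures in progenitor-impl may differ) and compares that with a plain slice that keeps declaration order. Both function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Property names as they would come out of a sorted map (alphabetical).
pub fn alphabetical_order(props: &[(&str, &str)]) -> Vec<String> {
    let sorted: BTreeMap<&str, &str> = props.iter().copied().collect();
    sorted.keys().map(|name| name.to_string()).collect()
}

/// Property names in the order the schema declared them.
pub fn schema_order(props: &[(&str, &str)]) -> Vec<String> {
    props.iter().map(|(name, _)| name.to_string()).collect()
}
```

For a schema that declares metadata and then file, alphabetical_order puts file first, which would make the binary payload the root part, while schema_order keeps metadata first.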

Why Option<String> for optional content_type?

Optional binary fields in the OpenAPI schema should not require a content-type if the field isn't being used. Typify generates Vec<u8> with #[serde(default)] for optional binary fields, so we check both content_type presence and non-empty bytes before including the part.
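The two-condition check can be expressed as a small helper. This is a sketch of the rule only, not the generated code: optional_part is a hypothetical name, and MultipartPart here is a simplified stand-in that owns its bytes.

```rust
/// Hypothetical stand-in for the runtime part type described in this PR.
#[derive(Debug, PartialEq)]
pub struct MultipartPart {
    pub content_type: String,
    pub content_id: &'static str,
    pub bytes: Vec<u8>,
}

/// Return a part only when the caller supplied a content type and the
/// payload is non-empty; otherwise the part is left out of the body.
pub fn optional_part(
    content_id: &'static str,
    content_type: Option<&str>,
    bytes: &[u8],
) -> Option<MultipartPart> {
    // A missing content type means the optional field was never set.
    let content_type = content_type?;
    // An empty Vec<u8> is what #[serde(default)] produces for an unset field.
    if bytes.is_empty() {
        return None;
    }
    Some(MultipartPart {
        content_type: content_type.to_string(),
        content_id,
        bytes: bytes.to_vec(),
    })
}
```

One consequence of this rule is that a deliberately empty attachment cannot be sent, because an empty payload is treated the same as an omitted field.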

Why derive type parameter dynamically?

RFC 2387 Section 3.1 requires the type parameter to specify the media type of the root part, which by default is the first part. Hardcoding application/json would be incorrect for binary-only multipart bodies.
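A minimal sketch of deriving the header value from the first part, assuming the part content types are available as a slice in body order. The names quote_param and related_content_type are hypothetical. The values are always emitted as quoted-strings because "/" is a tspecial in RFC 2045, so a media type cannot appear as a bare token.

```rust
/// Quote a parameter value as an RFC 2045 quoted-string, escaping `"` and `\`.
fn quote_param(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        if ch == '"' || ch == '\\' {
            out.push('\\');
        }
        out.push(ch);
    }
    out.push('"');
    out
}

/// Build the request Content-Type from the root (first) part's media type.
/// Returns None when there are no parts, since there is no root to describe.
pub fn related_content_type(part_types: &[&str], boundary: &str) -> Option<String> {
    let root = part_types.first()?;
    Some(format!(
        "multipart/related; type={}; boundary={}",
        quote_param(root),
        quote_param(boundary)
    ))
}
```

With an image/png part first, the type parameter becomes "image/png" rather than a hardcoded JSON type.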

Testing

  • Three test scenarios: single file, multiple files with optional attachment, and raw body
  • Generated code verified for correct part ordering
  • Optional field handling verified (excluded when not provided)
  • All existing tests pass with no breaking changes
  • New test: query-param-defaults validates default value handling

Breaking Changes

None - this feature is new in this PR.

@lanej lanej force-pushed the issue-1240-multipart-related branch from 6f04ec1 to ab478f1 Compare November 8, 2025 17:17
@lanej lanej marked this pull request as draft November 8, 2025 17:22
@lanej lanej force-pushed the issue-1240-multipart-related branch from ab478f1 to af4e9af Compare November 8, 2025 17:32
@lanej lanej marked this pull request as ready for review November 8, 2025 18:22
@lanej lanej marked this pull request as draft November 8, 2025 19:53
lanej added 3 commits November 9, 2025 14:12
Add runtime support for RFC 2387 multipart/related requests to enable
Google Workspace API uploads that combine JSON metadata with binary
content in a single request.

- Add MultipartPart struct with zero-copy semantics using Cow<'a, [u8]>
  to avoid doubling memory usage for large files
- Add MultipartRelatedBody trait for types that can be serialized as
  RFC 2387 bodies
- Add RequestBuilderExt::multipart_related() to apply multipart/related
  body to requests

Security features:
- RFC 2045 parameter quoting for content-type values with special chars
- Content-ID validation to prevent header injection attacks
- Empty parts validation (RFC 2387 requires at least one part)

Performance optimizations:
- Buffer preallocation based on estimated message size
- Unique boundary generation using timestamp (nanos) + PID + counter
- Zero-copy for binary fields via Cow::Borrowed

RFC 2387 compliance:
- Type parameter derived from first part's content-type (root part)
- Proper Content-Type and Content-ID headers for each part
- CRLF line endings throughout
- Proper boundary markers

Fixes oxidecomputer#1240

Add code generation for multipart/related request bodies from OpenAPI
schemas. Supports both query parameter defaults (separate feature) and
multipart/related (issue oxidecomputer#1240).

Multipart/related support:
- Detect x-rust-type: multipart/related in schemas
- Generate {field}_content_type fields for binary fields
- Preserve property order from schema (RFC 2387 Section 3.2)
- Handle required vs optional fields correctly
- Generate builder methods with (value, content_type) parameters
- Use crate::progenitor_client:: for module resolution
- Add serde_json dependency when needed
- Blanket impl for &T allows .multipart_related(&body)

Query parameter defaults support (separate feature):
- Respect OpenAPI default values for optional parameters
- Extract default from schema and pass to builder
- Update builder methods to use defaults

Generated MultipartRelatedBody impls:
- JSON fields serialized via serde_json::to_vec as Cow::Owned
- Binary fields as Cow::Borrowed (zero-copy)
- Optional fields checked for Some and non-empty
- Parts ordered according to schema property order

For oxidecomputer#1240

Add test coverage for both multipart/related and query param defaults.

Multipart/related tests:
- OpenAPI spec with multipart/related endpoints
- Test scenarios: builder, positional, tagged
- Single file upload (required fields)
- Multiple file upload (mixed required/optional)
- Simple upload (raw body, no schema)
- Verify generated code compiles

Query param defaults tests:
- OpenAPI spec with default values
- Test scenarios across all generation styles
- Verify defaults are respected in builder

All generated code successfully compiles and passes tests.

For oxidecomputer#1240
@lanej lanej force-pushed the issue-1240-multipart-related branch from 4faa1e0 to 12e18f1 Compare November 9, 2025 22:27
@lanej lanej marked this pull request as ready for review November 9, 2025 23:05

Development

Successfully merging this pull request may close these issues.

Support for multipart/related content type (RFC 2387)
