@yayashuxue yayashuxue commented Sep 9, 2025

MLX-Swift-Examples v2.25.7 Compatibility Changes

Overview

This document summarizes the changes that make FastVLM compatible with MLX-Swift-Examples v2.25.7 (the latest release of the package).

Files Modified

  • .gitignore (+4 lines)
  • app/FastVLM/FastVLM.swift (+27 lines, -9 lines)

Detailed Changes

1. .gitignore Updates

Added additional Xcode-related entries to prevent tracking of user-specific files:

xcuserdata/
*.xcuserstate
*.xcuserdatad/
project.pbxproj.orig

2. FastVLM.swift - Attention Mask API Changes

Before (v2.25.6 and earlier):

let mask = createAttentionMask(h: h, cache: cache)
h = layer(h, mask: mask, cache: cache?[i])

After (v2.25.7 compatible):

let mask = createAttentionMask(h: h, cache: cache, returnArray: true)
let maskArray: MLXArray? = {
    switch mask {
    case .array(let array):
        return array
    case .causal, .none:
        return nil
    case .arrays(_):
        return nil
    }
}()

for (i, layer) in layers.enumerated() {
    h = layer(h, mask: maskArray, cache: cache?[i])
}

Key Changes:

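  • Added the returnArray: true parameter to createAttentionMask()
  • Extracted the MLXArray from the returned mask enum (ScaledDotProductAttentionMaskMode) with an exhaustive switch
  • Passed MLXArray? instead of the mask enum to the layer functions; the .causal, .none, and .arrays cases all map to nil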
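
The extraction pattern above can be tried in isolation without the MLX packages. The following is a minimal sketch, not the library's API: MaskMode is a hypothetical stand-in whose case names mirror the switch in this section, [Float] stands in for MLXArray, and explicitMask(from:) is an illustrative helper name.

```swift
// Stand-in for the library's mask enum. The case names mirror the switch
// shown above; the type itself and the [Float] payload are assumptions.
enum MaskMode {
    case none
    case causal
    case array([Float])
    case arrays([[Float]])
}

/// Returns the mask only when the mode carries a single explicit array.
/// Every other mode yields nil, matching the behavior described above.
func explicitMask(from mode: MaskMode) -> [Float]? {
    switch mode {
    case .array(let mask):
        return mask
    case .none, .causal, .arrays:
        return nil
    }
}

let single = explicitMask(from: .array([0, 1]))
let causal = explicitMask(from: .causal)
```

Because the switch is exhaustive with no default branch, the compiler will flag this site if a later release adds another case to the enum.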

3. FastVLMProcessor - Message Handling Improvements

Before:

var messages = prompt.asMessages()
if messages[0]["role"] != "system" {
    // ...
}
var lastMessage = messages[lastIndex]["content"] ?? ""

After:

var messages: [Message]
switch prompt {
case .text(let text):
    messages = [["role": "user", "content": text]]
case .messages(let msgs):
    messages = msgs
case .chat(let chatMsgs):
    messages = chatMsgs.map { ["role": $0.role.rawValue, "content": $0.content] }
}
if (messages[0]["role"] as? String) != "system" {
    // ...
}
var lastMessage: String = (messages[lastIndex]["content"] as? String) ?? ""

Key Changes:

  • Replaced prompt.asMessages() with an exhaustive switch over the prompt cases (.text, .messages, .chat)
  • Dictionary values are now read through explicit as? String casts, which improves type safety
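
As an illustration of the prompt normalization described above, here is a self-contained sketch. Message, Role, ChatMessage, Prompt, normalizedMessages(from:), and startsWithSystemMessage(_:) are all hypothetical stand-ins defined here; only the case names (.text, .messages, .chat) come from the snippet in this section. The sketch reads the first message through first, which is an optional hardening that tolerates an empty array and is not part of this PR.

```swift
// Hypothetical stand-ins for the library types used in the snippet above.
typealias Message = [String: Any]

enum Role: String {
    case system, user, assistant
}

struct ChatMessage {
    let role: Role
    let content: String
}

enum Prompt {
    case text(String)
    case messages([Message])
    case chat([ChatMessage])
}

/// Converts any prompt form into a flat list of role/content dictionaries.
func normalizedMessages(from prompt: Prompt) -> [Message] {
    switch prompt {
    case .text(let text):
        return [["role": "user", "content": text]]
    case .messages(let messages):
        return messages
    case .chat(let chat):
        return chat.map { ["role": $0.role.rawValue, "content": $0.content] }
    }
}

/// True when a first message exists and its role is "system".
/// Using `first` avoids a crash on an empty array, unlike indexing with [0].
func startsWithSystemMessage(_ messages: [Message]) -> Bool {
    (messages.first?["role"] as? String) == "system"
}

let chatMessages = normalizedMessages(from: .chat([
    ChatMessage(role: .system, content: "Be brief"),
    ChatMessage(role: .user, content: "Describe the image"),
]))
```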

4. Image Token Processing

Before:

lastMessage += Array(repeating: config.imageToken, count: numImageTokens)
    .joined()

After:

let imageTokens = String(repeating: config.imageToken, count: numImageTokens)
lastMessage += imageTokens

Key Changes:

  • Simplified string repetition using String(repeating:count:)
  • Avoids building an intermediate array and then joining it
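
A quick standalone check that the two forms produce the same string. The token text and count below are placeholder values chosen for illustration, not FastVLM's actual configuration.

```swift
// Placeholder values for illustration only.
let imageToken = "<image>"
let numImageTokens = 3

// Old form: build an array of tokens, then join it.
let joined = Array(repeating: imageToken, count: numImageTokens).joined()

// New form: repeat the string directly.
let repeated = String(repeating: imageToken, count: numImageTokens)
```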

5. LMInput.ProcessedImage API Update

Before:

let image = LMInput.ProcessedImage(pixels: pixels, imageGridThw: [thw])
let gridThw = input.image?.imageGridThw

After:

let image = LMInput.ProcessedImage(pixels: pixels, frames: [thw])
let gridThw = input.image?.frames

Key Changes:

  • Property and initializer label renamed from imageGridThw to frames
  • Functionality is unchanged; only the name differs

Impact

  • ✅ No functional changes; existing behavior is preserved
  • ✅ Fixes compilation errors with MLX-Swift-Examples v2.25.7
  • ✅ Improves type safety in message processing
  • ✅ Tested and working on iOS device

Testing Status

  • Compilation successful with MLX-Swift-Examples v2.25.7
  • iOS device testing completed

Resolves API compatibility issues that prevented FastVLM from building
with the latest MLX library dependencies.

## Changes Made

- Fix UserInput.Prompt API changes (enum-based instead of asMessages())
- Update ProcessedImage constructor to use 'frames' parameter
- Handle ScaledDotProductAttentionMaskMode with exhaustive switch cases
- Add proper type annotations for Swift compiler inference
- Use existing createAttentionMask from MLXLMCommon correctly
- Add Xcode user-specific files to .gitignore

## Technical Details

- Tested with MLX-Swift v0.25.6, MLX-Swift-Examples v2.25.7
- Builds successfully on iOS 18.6.2
- No breaking changes to functionality

Fixes all compilation errors while maintaining compatibility.
@yayashuxue yayashuxue marked this pull request as draft September 10, 2025 20:59
@yayashuxue yayashuxue marked this pull request as ready for review September 10, 2025 20:59
@yayashuxue
Author

@federicobucchi @rwebb The key changes focus on making the iOS build compatible with the latest MLX package.
