Fix batch mamba #842

Merged: awni merged 2 commits into main from fix_batch_mamba on Feb 4, 2026

@awni awni commented Feb 4, 2026

Fixes a couple of things in batch generation:

  1. Check whether a request / model pair is batchable after loading the model.
  2. Remove MambaCache, since merging it in the ArraysCache.merge method caused issues (the size parameter was interpreted as the left padding).
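
To illustrate the second item, here is a minimal sketch of how a positional size argument can end up as the left padding. It assumes a subclass that fixes the size and narrows its constructor; `ArraysCacheSketch`, `MambaCacheSketch`, and the body of `merge` are hypothetical stand-ins, not the actual mlx_lm classes.

```python
class ArraysCacheSketch:
    """Hypothetical stand-in for a cache holding `size` state arrays."""

    def __init__(self, size, left_padding=None):
        self.cache = [None] * size
        self.left_padding = left_padding

    @classmethod
    def merge(cls, caches):
        # Assumption: merge rebuilds a cache of the same class and passes
        # the number of state arrays as the first positional argument.
        size = len(caches[0].cache)
        merged = cls(size)
        # Group the i-th state of every cache together (stand-in for stacking).
        merged.cache = [[c.cache[i] for c in caches] for i in range(size)]
        return merged


class MambaCacheSketch(ArraysCacheSketch):
    """Hypothetical subclass whose first positional argument is left_padding."""

    def __init__(self, left_padding=None):
        super().__init__(size=2, left_padding=left_padding)


a, b = MambaCacheSketch(), MambaCacheSketch()
a.cache = ["conv_a", "ssm_a"]
b.cache = ["conv_b", "ssm_b"]

# cls(size) resolves to MambaCacheSketch(2), so 2 lands in left_padding.
merged = MambaCacheSketch.merge([a, b])
print(merged.left_padding)
```

Under these assumptions, dropping the subclass and constructing the base class directly avoids the mismatch, because the first positional argument then means the same thing everywhere.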

@awni awni requested a review from angeloskath February 4, 2026 00:24
Comment thread mlx_lm/server.py

# We have no batch and it actually is not a batchable request
# so serve single sequence at a time.
elif batch_generator is None and not is_batchable:

@angeloskath (Member):

Out of curiosity, why the change here?

@awni (Member, Author):

Because before we load the model, it defaults to not batchable. So the first time we check _is_batchable, it always evaluates to false (assuming the model wasn't passed to the server but loaded from the request).

That means every first request would use _serve_single and run in sequential mode, even when it could have been batched.
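
The ordering issue described above can be sketched as follows. This is a simplified illustration, not the server's code: `ServerSketch`, `_load`, and the two `handle_*` methods are hypothetical names, and the model is a plain dict standing in for a loaded model.

```python
class ServerSketch:
    """Hypothetical server that loads its model lazily from the request."""

    def __init__(self):
        self.model = None  # nothing loaded until the first request arrives

    def _is_batchable(self):
        # With no model loaded, the answer defaults to "not batchable".
        return self.model is not None and self.model["batchable"]

    def _load(self, name):
        if self.model is None:
            self.model = {"name": name, "batchable": True}

    def handle_check_first(self, request):
        # Check before loading: the first request always sees the default.
        is_batchable = self._is_batchable()
        self._load(request["model"])
        return "batch" if is_batchable else "single"

    def handle_load_first(self, request):
        # Load first, then check: the first request can already be batched.
        self._load(request["model"])
        is_batchable = self._is_batchable()
        return "batch" if is_batchable else "single"


request = {"model": "some-model"}
before = ServerSketch().handle_check_first(request)
after = ServerSketch().handle_load_first(request)
print(before, after)
```

In this sketch, a fresh server takes the single-sequence path when it checks first and the batch path when it loads first, which mirrors the behavior the change is meant to fix.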

@angeloskath angeloskath left a comment

Great!

I left a comment on moving _serve_single() around. I'm not against it, but I'm wondering why, since to me it was a bit more understandable as it was.

@awni awni merged commit e08ec15 into main Feb 4, 2026
2 checks passed
@awni awni deleted the fix_batch_mamba branch February 4, 2026 03:31

2 participants