
Align batch logits processor token contract #1115

Merged
angeloskath merged 3 commits into ml-explore:main from neilmehta24:batched-gen-logits-processor
Apr 7, 2026
Merged

Align batch logits processor token contract#1115
angeloskath merged 3 commits intoml-explore:mainfrom
neilmehta24:batched-gen-logits-processor

Conversation

@neilmehta24
Contributor

After #1072, the data sent to the batched logits processors changed:

  1. The type of self.tokens[e] is now a Python list[int], not mx.array. This means the logits processors receive Python lists.
  2. The current input token is now appended to self.tokens after the processor call, not before.

I believe this is a bug for two reasons:

  • For the non-batched API, processors still get an mx.array, and the current input token is appended before the processor runs.
  • There's a comment that says # Update the token context that will be used by the logits processors, yet _token_context is never actually used within the logits processors block.

This PR brings the data sent to the logits processors in line with the pre-refactor data.
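To make the contract concrete, here is a minimal sketch of the ordering the non-batched path uses, per the description above. The names (apply_processors, the processor signature) are illustrative, not mlx-lm's actual API, and a plain list stands in for mx.array:

```python
def apply_processors(processors, history, current_token, logits):
    """Call each logits processor with the full token context.

    Per the non-batched contract described in the PR: the current input
    token is appended to the context *before* the processors run, so a
    processor can react to the token just consumed.
    """
    context = history + [current_token]  # append before the processor call
    for proc in processors:
        logits = proc(context, logits)
    return logits

# Example processor: forbid immediately repeating the most recent token.
def no_immediate_repeat(tokens, logits):
    logits = dict(logits)  # copy so the caller's logits are not mutated
    logits[tokens[-1]] = float("-inf")
    return logits

out = apply_processors([no_immediate_repeat], [1, 2], 3, {3: 0.5, 4: 0.2})
# Token 3 (the current input token) is masked because it is already in the
# context; under the post-refactor ordering it would not be, since the
# processor would only see [1, 2].
```

With the pre-refactor ordering the processor sees the token it just consumed; with the broken ordering it lags one step behind, which silently changes the behavior of processors like repetition penalties.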

@angeloskath
Member

Thanks for catching that. That's a typo actually. They should be receiving self._token_context[e], which is defined just above. This also limits the max context to 256 tokens, which is fairly arbitrary and different from the non-batched API. Let me know if you think that's a problem. As you can see, it is fairly easy to change.
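The 256-token cap mentioned above amounts to a sliding window over the token context. A minimal sketch of that behavior, assuming a simple list-based context (the cap value and names are taken from the comment, not from mlx-lm internals):

```python
# Illustrative only: a bounded token context that keeps at most the last
# MAX_CONTEXT tokens, so logits processors never see the full history.
MAX_CONTEXT = 256

def update_token_context(context, new_token):
    """Append new_token, evicting the oldest token once the cap is hit."""
    context.append(new_token)
    if len(context) > MAX_CONTEXT:
        del context[0]  # drop the oldest token to stay within the window
    return context

context = list(range(256))        # window already full
update_token_context(context, 999)
# len(context) is still 256; token 0 was evicted, 999 is the newest entry.
```

This is why the cap matters for processors like repetition penalties: any token older than the window silently stops influencing the penalty, unlike in the non-batched API where the whole history is visible.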

@angeloskath angeloskath merged commit dcbf6e3 into ml-explore:main Apr 7, 2026
2 checks passed
@neilmehta24 neilmehta24 deleted the batched-gen-logits-processor branch April 7, 2026 15:27