Skip to content

Fix MLX response masking and eval parity issues#2

Open
Lyxot wants to merge 4 commits into
mmathew23:explore/mlxfrom
Lyxot:pr/684
Open

Fix MLX response masking and eval parity issues#2
Lyxot wants to merge 4 commits into
mmathew23:explore/mlxfrom
Lyxot:pr/684

Conversation

@Lyxot
Copy link
Copy Markdown

@Lyxot Lyxot commented May 28, 2026

This PR fixes several MLXTrainer/SFTTrainer parity issues around response masking, tokenization, and eval dataset handling.

  1. MLX response-only batches kept fully masked samples
    MLX only checked all-masked batches after batching, while the CUDA/SFT path filters samples whose post-mask labels are entirely -100.
    Commit: 0fc7f14

  2. Plain HF fast tokenizers failed in train_on_responses_only
    MLX unwrapped any object with _tokenizer, but HF fast tokenizers expose _tokenizer as the low-level Rust tokenizer, which is not callable like PreTrainedTokenizerBase.
    Commit: ae977ba

  3. MLX text batching could double-add BOS
    MLX used tokenizer.encode(text) directly, while CUDA/SFT disables special-token insertion when rendered text already starts with BOS or the chat template owns BOS.
    Commit: 5dd5c00

  4. Dict eval_dataset was treated as iterable rows
    MLX passed dict eval datasets directly into batch creation/evaluation, so split names were iterated instead of preparing each eval split separately.
    Commit: 4f70b5f

Copilot AI review requested due to automatic review settings May 28, 2026 15:19
@Lyxot Lyxot changed the title fix(mlx): filter fully masked response samples Fix MLX response masking and eval parity issues May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant