feat: repetition detector for degenerate token loops #65
janhilgard wants to merge 1 commit into waybarrios:main
Adds a lightweight repetition detector to the scheduler that monitors the last 32 generated tokens per request and stops generation when degenerate patterns are detected:

- Single-token repetition (8+ identical tokens)
- Short sequence repetition (2-4 token patterns repeated 6+ times)

This prevents runaway generation when models enter degenerate loops, saving compute and improving reliability for long-running requests.

Includes 15 unit tests covering all detection patterns and edge cases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
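The two detection rules above can be sketched roughly as follows. This is a minimal illustration built only from the thresholds quoted in the description, not the code from this PR; the function name `is_degenerate` and the constant names are hypothetical.

```python
from typing import Sequence

# Thresholds taken from the PR description; the names are illustrative.
WINDOW = 32               # most recent tokens inspected per request
MIN_SINGLE_RUN = 8        # identical tokens in a row that count as a loop
PATTERN_LENGTHS = (2, 3, 4)
MIN_PATTERN_REPEATS = 6   # back-to-back repeats of a short pattern


def is_degenerate(tokens: Sequence[int]) -> bool:
    """Return True if the tail of `tokens` ends in a degenerate loop."""
    recent = list(tokens[-WINDOW:])

    # Rule 1: the last 8 tokens are all the same token.
    if len(recent) >= MIN_SINGLE_RUN and len(set(recent[-MIN_SINGLE_RUN:])) == 1:
        return True

    # Rule 2: the tail is a 2-4 token pattern repeated at least 6 times.
    for plen in PATTERN_LENGTHS:
        span = plen * MIN_PATTERN_REPEATS
        if len(recent) < span:
            continue
        tail = recent[-span:]
        # The tail is periodic with period `plen` if every position
        # matches the corresponding position in the first period.
        if all(tail[i] == tail[i % plen] for i in range(span)):
            return True

    return False
```

The longest case (a 4-token pattern repeated 6 times) needs 24 tokens, so it fits inside the 32-token window.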
Moves repetition detection logic to feature/repetition-detector branch (PR waybarrios#65) per review feedback on PR waybarrios#53.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
How is this different from repetition penalties or DRY?
Good question! They solve different problems:

Repetition penalty / DRY are preventative: they modify logits during sampling to discourage repetition before it happens. They work well most of the time.

This detector is a safety net: it doesn't touch sampling at all. It monitors output and terminates generation when degenerate loops have already formed. Think of it as a circuit breaker for the server.

Why both are needed:
The overhead is near-zero (a list append plus a periodic check on a 32-token window), so it's cheap insurance.
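To make the "circuit breaker" distinction concrete, here is a small sketch of a generation loop in which the check runs only after a token has been sampled, so the sampler itself is untouched. Everything here is hypothetical: `generate`, `looks_stuck`, and the `sample_next` callback stand in for the real scheduler, the check is simplified to the single-token rule, and the `"length"` finish reason for hitting the token limit is an assumption.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def looks_stuck(window: Deque[int], run: int = 8) -> bool:
    """Simplified check: the last `run` tokens are all identical."""
    tail = list(window)[-run:]
    return len(tail) == run and len(set(tail)) == 1


def generate(sample_next: Callable[[], int], max_tokens: int) -> Tuple[List[int], str]:
    """Run a toy generation loop and return (tokens, finish_reason)."""
    window: Deque[int] = deque(maxlen=32)  # bounded, so memory stays constant
    output: List[int] = []
    for _ in range(max_tokens):
        token = sample_next()      # sampling (penalties, DRY, ...) is untouched
        output.append(token)
        window.append(token)       # the only per-token cost is this append
        if looks_stuck(window):    # breaker trips after the loop has formed
            return output, "stop"
    return output, "length"        # assumed reason for reaching max_tokens
```

With a sampler that always returns the same token, this loop ends after 8 tokens instead of running to `max_tokens`.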
Detect and stop repeating token patterns during generation. Sliding window (200 tokens), checks every 20 tokens for patterns of length 2-50 repeated 3+ times. Enabled via --repetition-detector. Prevents stuck loops that waste up to 13 minutes on large models. Addresses waybarrios#65.
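The variant described in this commit message (200-token window, a check every 20 tokens, patterns of length 2-50 repeated 3+ times) might look roughly like the sketch below. It is an assumption-laden illustration, not the referenced implementation: the class name `SlidingRepetitionDetector`, its `add` method, and the choice to look only at the tail of the window are all hypothetical.

```python
from collections import deque
from typing import Deque


class SlidingRepetitionDetector:
    """Hypothetical sketch of a periodic sliding-window repetition check."""

    def __init__(self, window: int = 200, check_every: int = 20,
                 min_len: int = 2, max_len: int = 50, min_repeats: int = 3) -> None:
        self.check_every = check_every
        self.min_len = min_len
        self.max_len = max_len
        self.min_repeats = min_repeats
        self._tokens: Deque[int] = deque(maxlen=window)
        self._since_check = 0

    def add(self, token: int) -> bool:
        """Record a token; return True if a repeating tail is detected."""
        self._tokens.append(token)
        self._since_check += 1
        if self._since_check < self.check_every:
            return False               # skip the scan on most tokens
        self._since_check = 0
        return self._tail_repeats()

    def _tail_repeats(self) -> bool:
        toks = list(self._tokens)
        for plen in range(self.min_len, self.max_len + 1):
            span = plen * self.min_repeats
            if span > len(toks):
                break                  # longer patterns cannot fit yet
            tail = toks[-span:]
            # Periodic with period `plen` over `min_repeats` repetitions.
            if all(tail[i] == tail[i % plen] for i in range(span)):
                return True
        return False
```

Checking only every 20th token trades a detection delay of up to 19 tokens for a much lower per-token cost. The largest span scanned is 50 × 3 = 150 tokens, which fits in the 200-token window.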
Closing in favor of #188 which has a cleaner architecture — standalone reusable #188 currently covers SimpleEngine only. Happy to help integrate the same |
Summary
Generation stops with finish_reason="stop" when degenerate patterns are detected:

- Single-token repetition (e.g. 0 0 0 0 0 0 0 0)
- Short sequence repetition (e.g. ab ab ab ab ab ab)

Split out from PR #53 per review feedback: this touches the scheduler hot path and is independent of the GPT-OSS reasoning parser.
Test plan
- 15 unit tests (tests/test_repetition_detector.py)

🤖 Generated with Claude Code