🚀 The feature, motivation and pitch
Models with thinking have been shown to improve accuracy. But sometimes they think too much and get stuck in a loop of "wait".
Ovis2.5 9B has introduced a new approach with its enable_thinking_budget and thinking_budget kwargs. Supporting these would let us reduce inference time by capping the length of the thinking phase, while still keeping some of the accuracy gains that the thinking process provides.
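To make the request concrete, here is a minimal sketch of what a thinking budget could look like at the decoding loop. It is not the Ovis2.5 or vLLM implementation. The `</think>` and `<eos>` marker strings, the `generate_with_thinking_budget` function, and the toy `looping_model` are all hypothetical and exist only for illustration. The idea shown is simply: once the thinking phase has used up its budget, force the end-of-thinking marker so the model moves on to the answer.

```python
from typing import Callable, List

THINK_END = "</think>"  # assumed end-of-thinking marker (hypothetical)
EOS = "<eos>"           # assumed end-of-sequence marker (hypothetical)


def generate_with_thinking_budget(
    next_token: Callable[[List[str]], str],
    thinking_budget: int,
    max_new_tokens: int,
) -> List[str]:
    """Toy decoding loop that caps the thinking phase at `thinking_budget` tokens."""
    tokens: List[str] = []
    thinking = True  # generation is assumed to start in the thinking phase
    while len(tokens) < max_new_tokens:
        if thinking and len(tokens) >= thinking_budget:
            # Budget exhausted: force the end of the thinking phase.
            tokens.append(THINK_END)
            thinking = False
            continue
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == THINK_END:
            thinking = False  # the model finished thinking on its own
        elif tok == EOS:
            break
    return tokens


def looping_model(tokens: List[str]) -> str:
    """Toy stand-in for a model that would loop on 'wait' forever while thinking."""
    if THINK_END not in tokens:
        return "wait"
    if tokens[-1] == THINK_END:
        return "answer"
    return EOS


if __name__ == "__main__":
    print(generate_with_thinking_budget(looping_model, thinking_budget=3, max_new_tokens=10))
```

Without the budget, the toy model above would emit "wait" until max_new_tokens is reached and never produce an answer. With thinking_budget=3 it stops thinking after three tokens and answers.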
Alternatives
No response
Additional context
No response