b41020014672efb59596f4ff434fd6c7628d6635
The analyze_message and raw_analyze methods had no max_tokens limit, causing thinking models (Qwen3-VL-32B-Thinking) to generate unlimited reasoning tokens before responding — taking 5+ minutes per message. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Description
No description provided
Languages
Python
98.8%
Shell
0.8%
Dockerfile
0.4%