The analyze_message and raw_analyze methods had no max_tokens limit,
so thinking models (Qwen3-VL-32B-Thinking) could generate unbounded
reasoning tokens before responding, taking 5+ minutes per message.
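A minimal sketch of the fix, assuming an OpenAI-compatible /v1/chat/completions request body; the helper name and the 1024-token cap are illustrative, not the bot's actual values:

```python
# Illustrative: build a chat-completions payload with an explicit max_tokens
# cap so thinking models cannot reason indefinitely. The 1024 limit is a
# hypothetical value, not the bot's real setting.
def build_analysis_request(model: str, system_prompt: str, user_text: str,
                           max_tokens: int = 1024) -> dict:
    """Return an OpenAI-compatible /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,  # bounds reasoning + answer tokens combined
        "stream": False,
    }
```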
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Serialize all LLM requests through an asyncio semaphore to prevent
overloading athena with concurrent requests
- Switch chat() to streaming so the typing indicator only appears once
the model starts generating (not during thinking/loading)
- Increase the LLM timeout from 5 to 10 minutes to cover slow first
  model loads
- Rename ollama_client.py to llm_client.py and self.ollama to self.llm,
  since the bot talks to a generic OpenAI-compatible API
- Update embed labels from "Ollama" to "LLM"
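The first two bullets can be sketched together: a module-level semaphore serializes requests, and the response is consumed as a stream so a callback (e.g. starting the typing indicator) only fires on the first generated token. chat_stream() and on_first_token are hypothetical stand-ins for the real client call and Discord hook:

```python
import asyncio

# Illustrative sketch, not the bot's actual code: one in-flight LLM request
# at a time, with a first-token callback for the typing indicator.
_llm_lock = asyncio.Semaphore(1)  # serialize all requests to athena

async def chat(client, messages, on_first_token):
    async with _llm_lock:
        first = True
        chunks = []
        # chat_stream() is an assumed async-generator API yielding text tokens
        async for token in client.chat_stream(messages):
            if first:
                await on_first_token()  # e.g. trigger the typing indicator
                first = False
            chunks.append(token)
        return "".join(chunks)
```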
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the analysis and chat personality system prompts from inline Python
strings to prompts/analysis.txt and prompts/chat_personality.txt for
easier editing. Also add a rule so users quoting/reporting what someone
else said are not penalized for the quoted words.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Discord bot that monitors chat sentiment and tracks drama using an
Ollama LLM on athena.lan. Includes sentiment analysis, slash commands,
drama tracking, and SQL Server persistence via Docker Compose.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>