Breehavior-Monitor

Author	SHA1	Message	Date
AJ Isaacs	c258994a2e	Use gpt-4o-mini for chat/roasts via dedicated LLM_CHAT_MODEL Add a separate llm_chat client so chat responses use a smarter model (gpt-4o-mini) while analysis stays on the cheap local Qwen3-8B. Falls back to llm_heavy if LLM_CHAT_MODEL is not set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 16:04:55 -05:00
AJ Isaacs	8a06ddbd6e	Support hybrid LLM: local Qwen triage + OpenAI escalation Triage analysis runs on Qwen 8B (athena.lan) for free first-pass. Escalation, chat, image roasts, and commands use GPT-4o via OpenAI. Each tier gets its own base URL, API key, and concurrency settings. Local models get /no_think and serialized requests automatically. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 12:20:07 -05:00
AJ Isaacs	28fb66d5f9	Switch LLM backend from llama.cpp/Qwen to OpenAI - Default models: gpt-4o-mini (triage), gpt-4o (escalation) - Remove Qwen-specific /no_think hacks - Reduce timeout from 600s to 120s, increase concurrency semaphore to 4 - Support empty LLM_BASE_URL to use OpenAI directly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 12:07:53 -05:00
AJ Isaacs	0feef708ea	Set bot status from active mode on startup Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 09:27:34 -05:00
AJ Isaacs	6e1a73847d	Persist bot mode across restarts via database Adds a BotSettings key-value table. The active mode is saved when changed via /bcs-mode and restored on startup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 09:26:00 -05:00
AJ Isaacs	13a2030021	Add switchable bot modes: default, chatty, and roast Adds a server-wide mode system with /bcs-mode command. - Default: current hall-monitor behavior unchanged - Chatty: friendly chat participant with proactive replies (~10% chance) - Roast: savage roast mode with proactive replies - Chatty/roast use relaxed moderation thresholds - 5-message cooldown between proactive replies per channel - Bot status updates to reflect active mode - /bcs-status shows current mode and effective thresholds Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 08:59:51 -05:00
AJ Isaacs	b04d3da2bf	Add LLM request/response logging to database Log every LLM call (analysis, chat, image, raw_analyze) to a new LlmLog table with request type, model, token counts, duration, success/failure, and truncated request/response payloads. Enables debugging prompt issues and tracking usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 22:55:19 -05:00
AJ Isaacs	b9bac899f9	Add two-tier LLM analysis with triage/escalation Triage model (LLM_MODEL) handles every message cheaply. If toxicity >= 0.25, off_topic, or coherence < 0.6, the message is re-analyzed with the heavy model (LLM_ESCALATION_MODEL). Chat, image analysis, /bcs-test, and /bcs-scan always use the heavy model. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 18:33:36 -05:00
AJ Isaacs	cf88f003ba	Add LLM warm-up request at startup to preload model into VRAM Sends a minimal 1-token completion during setup_hook so the model is ready before Discord messages start arriving, avoiding connection errors and slow first responses after a restart. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 15:16:52 -05:00
AJ Isaacs	1151b705c0	Add LLM request queue, streaming chat, and rename ollama_client to llm_client - Serialize all LLM requests through an asyncio semaphore to prevent overloading athena with concurrent requests - Switch chat() to streaming so the typing indicator only appears once the model starts generating (not during thinking/loading) - Increase LLM timeout from 5 to 10 minutes for slow first loads - Rename ollama_client.py to llm_client.py and self.ollama to self.llm since the bot uses a generic OpenAI-compatible API - Update embed labels from "Ollama" to "LLM" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 13:45:12 -05:00
AJ Isaacs	a35705d3f1	Initial commit: Breehavior Monitor Discord bot Discord bot for monitoring chat sentiment and tracking drama using Ollama LLM on athena.lan. Includes sentiment analysis, slash commands, drama tracking, and SQL Server persistence via Docker Compose. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 22:39:40 -05:00

11 Commits
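
Note on commit b9bac899f9: the escalation rule described there (re-analyze with the heavy model if toxicity >= 0.25, off_topic, or coherence < 0.6) can be sketched roughly as below. This is a minimal illustration, not the repository's actual code. The `TriageResult` class, the `needs_escalation` function, and the constant names are hypothetical; only the three conditions and their threshold values come from the commit message.

```python
from dataclasses import dataclass

# Threshold values as stated in the message of commit b9bac899f9.
# The constant names themselves are assumptions made for this sketch.
TOXICITY_ESCALATION_THRESHOLD = 0.25
COHERENCE_ESCALATION_THRESHOLD = 0.6


@dataclass
class TriageResult:
    """Hypothetical container for the scores returned by the triage model."""
    toxicity: float
    off_topic: bool
    coherence: float


def needs_escalation(result: TriageResult) -> bool:
    """Return True if the message should be re-analyzed by the heavy model."""
    return (
        result.toxicity >= TOXICITY_ESCALATION_THRESHOLD
        or result.off_topic
        or result.coherence < COHERENCE_ESCALATION_THRESHOLD
    )
```

Under these assumptions, a message that scores exactly 0.25 on toxicity escalates, while one that scores exactly 0.6 on coherence does not, because the commit message uses `>=` for toxicity and a strict `<` for coherence.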