115 Commits

Author SHA1 Message Date
b04d3da2bf Add LLM request/response logging to database
Log every LLM call (analysis, chat, image, raw_analyze) to a new
LlmLog table with request type, model, token counts, duration,
success/failure, and truncated request/response payloads. Enables
debugging prompt issues and tracking usage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 22:55:19 -05:00
fd798ce027 Silently log LLM failures instead of replying to user
When the LLM is offline, post to #bcs-log instead of sending
the "brain offline" message in chat.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 16:55:07 -05:00
85ddba5e4b Lower mute thresholds and order warnings before chat replies
- spike_mute: 0.8→0.7, mute: 0.75→0.65 so escalating users get
  timed out after a warning instead of being warned endlessly
- Skip debounce on @mentions so sentiment analysis fires immediately
- Chat cog awaits pending sentiment analysis before replying,
  ensuring warnings/mutes appear before the personality response

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 14:16:34 -05:00
e2404d052c Improve LLM context with full timestamped channel history
Send last ~8 messages from all users (not just others) as a
multi-line chat log with relative timestamps so the LLM can
better understand conversation flow and escalation patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 14:04:30 -05:00
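
The timestamped history described in e2404d052c could be rendered roughly as below. This is only a sketch: the function name `format_history`, the `(author, sent_at, text)` tuple shape, and the `[2m ago] name: text` line format are assumptions made for illustration, not taken from the repository.

```python
from datetime import datetime, timedelta

def format_history(messages, now, limit=8):
    """Render the last `limit` messages as a multi-line chat log.

    `messages` is a list of (author, sent_at, text) tuples, oldest first.
    The tuple shape and the output format are hypothetical.
    """
    lines = []
    for author, sent_at, text in messages[-limit:]:
        seconds = int((now - sent_at).total_seconds())
        if seconds < 60:
            age = f"{seconds}s ago"
        elif seconds < 3600:
            age = f"{seconds // 60}m ago"
        else:
            age = f"{seconds // 3600}h ago"
        lines.append(f"[{age}] {author}: {text}")
    return "\n".join(lines)

now = datetime(2026, 2, 22, 14, 0, 0)
history = [
    ("alice", now - timedelta(minutes=2), "that play was trash"),
    ("bob", now - timedelta(seconds=30), "say that again"),
]
log = format_history(history, now)
```

Keeping every author in the log (not only other users) lets the model see both sides of an exchange, which is the point the commit message makes about escalation patterns.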
b9bac899f9 Add two-tier LLM analysis with triage/escalation
Triage model (LLM_MODEL) handles every message cheaply. If toxicity
>= 0.25, off_topic, or coherence < 0.6, the message is re-analyzed
with the heavy model (LLM_ESCALATION_MODEL). Chat, image analysis,
/bcs-test, and /bcs-scan always use the heavy model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:33:36 -05:00
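
The escalation rule in b9bac899f9 can be sketched as follows. The thresholds come from the commit message; the `Analysis` dataclass, the `call_model` callback, and the placeholder model names are assumptions introduced for illustration and may not match the bot's actual code.

```python
from dataclasses import dataclass

# Placeholders standing in for the LLM_MODEL and LLM_ESCALATION_MODEL settings.
TRIAGE_MODEL = "triage-model"
ESCALATION_MODEL = "heavy-model"

@dataclass
class Analysis:
    # Hypothetical result shape; field names follow the commit message.
    toxicity: float
    off_topic: bool
    coherence: float

def needs_escalation(result: Analysis) -> bool:
    """Thresholds as stated in the commit message."""
    return result.toxicity >= 0.25 or result.off_topic or result.coherence < 0.6

def analyze(message: str, call_model) -> Analysis:
    """Run the cheap model first; re-run with the heavy model only if flagged.

    `call_model(model_name, message)` is a hypothetical callback that
    returns an Analysis.
    """
    first_pass = call_model(TRIAGE_MODEL, message)
    if needs_escalation(first_pass):
        return call_model(ESCALATION_MODEL, message)
    return first_pass
```

With this shape, most messages cost a single triage call, and the heavy model is only invoked for the borderline or flagged ones.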
64e9474c99 Add message batching (debounce) for rapid-fire senders
Buffer messages per user+channel and wait for a configurable window
(batch_window_seconds: 3) before analyzing. Combines burst messages
into a single LLM call instead of analyzing each one separately.
Replaces cooldown_between_analyses with the debounce approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:19:01 -05:00
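
A minimal sketch of the per-user, per-channel debounce described in 64e9474c99, using plain asyncio. The class name `MessageBatcher`, the `analyze` callback, and the choice to restart the window on every new message are assumptions; the commit message only states that messages are buffered per user+channel and analyzed after `batch_window_seconds`.

```python
import asyncio
from collections import defaultdict

class MessageBatcher:
    """Buffer messages per (user, channel) and analyze them as one batch."""

    def __init__(self, analyze, window_seconds=3.0):
        self._analyze = analyze          # hypothetical async callback(key, messages)
        self._window = window_seconds    # mirrors batch_window_seconds
        self._buffers = defaultdict(list)
        self._timers = {}

    def add(self, user_id, channel_id, text):
        """Queue a message and (re)start the debounce timer for its key.

        Must be called from inside a running event loop.
        """
        key = (user_id, channel_id)
        self._buffers[key].append(text)
        pending = self._timers.get(key)
        if pending is not None:
            pending.cancel()             # still sleeping, so restart the window
        self._timers[key] = asyncio.get_running_loop().create_task(
            self._flush_later(key)
        )

    async def _flush_later(self, key):
        await asyncio.sleep(self._window)
        # Detach buffer and timer first so a message arriving during
        # analysis starts a fresh batch instead of cancelling this one.
        messages = self._buffers.pop(key, [])
        self._timers.pop(key, None)
        if messages:
            await self._analyze(key, messages)
```

A burst of three messages from one user in one channel would then reach `analyze` as a single list once the window elapses with no new input.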
cf02da4051 Add CLAUDE.md with deployment instructions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:09:19 -05:00
fee3e3e1bd Add game channel redirect feature and sexual_vulgar detection
Detect when users discuss a game in the wrong channel (e.g. GTA talk
in #warzone) and send a friendly redirect to the correct channel.
Also add sexual_vulgar category and scoring rules so crude sexual
remarks directed at someone aren't softened by "lmao".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:02:59 -05:00
e41845de02 Add scoreboard roast feature via image analysis
When @mentioned with an image attachment, the bot now roasts players
based on scoreboard screenshots using the vision model. Text-only
mentions continue to work as before.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 16:30:26 -05:00
cf88f003ba Add LLM warm-up request at startup to preload model into VRAM
Sends a minimal 1-token completion during setup_hook so the model is
ready before Discord messages start arriving, avoiding connection
errors and slow first responses after a restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 15:16:52 -05:00
b410200146 Add max_tokens=1024 to LLM analysis calls
The analyze_message and raw_analyze methods had no max_tokens limit,
causing thinking models (Qwen3-VL-32B-Thinking) to generate unlimited
reasoning tokens before responding — taking 5+ minutes per message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:17:59 -05:00
1151b705c0 Add LLM request queue, streaming chat, and rename ollama_client to llm_client
- Serialize all LLM requests through an asyncio semaphore to prevent
  overloading athena with concurrent requests
- Switch chat() to streaming so the typing indicator only appears once
  the model starts generating (not during thinking/loading)
- Increase LLM timeout from 5 to 10 minutes for slow first loads
- Rename ollama_client.py to llm_client.py and self.ollama to self.llm
  since the bot uses a generic OpenAI-compatible API
- Update embed labels from "Ollama" to "LLM"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 13:45:12 -05:00
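
The request serialization from 1151b705c0 might look roughly like this. The commit message confirms only that an asyncio semaphore is used; the `LLMQueue` wrapper, its `run` method, and the default limit of one concurrent request are assumptions.

```python
import asyncio

class LLMQueue:
    """Gate all LLM requests through one semaphore."""

    def __init__(self, max_concurrent=1):
        # A limit of 1 fully serializes requests; the real limit is not
        # stated in the commit message.
        self._semaphore = asyncio.Semaphore(max_concurrent)

    async def run(self, request):
        """Await `request()` (a hypothetical zero-argument coroutine
        function) once a slot is free."""
        async with self._semaphore:
            return await request()
```

Callers would wrap each outgoing call, for example `await queue.run(lambda: client_call(prompt))`, where `client_call` stands for whatever coroutine actually talks to the server. Waiting callers simply suspend until a slot frees up, so the event loop stays responsive.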
645b924011 Extract LLM prompts to separate text files and fix quoting penalty
Move the analysis and chat personality system prompts from inline Python
strings to prompts/analysis.txt and prompts/chat_personality.txt for
easier editing. Also add a rule so users quoting/reporting what someone
else said are not penalized for the quoted words.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 12:19:28 -05:00
63b4b3adb8 Fix missing libgssapi-krb5-2 dependency in Docker image
The ODBC driver failed to load at runtime because libgssapi_krb5.so.2
was not installed. Add it explicitly to the apt-get install step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 22:48:15 -05:00
a35705d3f1 Initial commit: Breehavior Monitor Discord bot
Discord bot for monitoring chat sentiment and tracking drama using
Ollama LLM on athena.lan. Includes sentiment analysis, slash commands,
drama tracking, and SQL Server persistence via Docker Compose.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 22:39:40 -05:00