- LLM now evaluates messages against numbered server rules and reports
violated_rules in analysis output
- Warnings and mutes cite the specific rule(s) broken
- Rules extracted to prompts/rules.txt for prompt injection
- Personality prompts moved to prompts/personalities/ and compressed
(~63% reduction across all prompt files)
- All prompt files tightened: removed redundancy, consolidated Do NOT
sections, trimmed examples while preserving behavioral instructions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds user_aliases config section mapping Discord IDs to known nicknames.
Aliases are anonymized and injected into LLM analysis context so it can
recognize when someone name-drops another member (even absent ones).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LLM can now flag possessive name-dropping, territorial behavior, and
jealousy signals when users mention others not in the conversation.
Scores feed into existing drama pipeline for warnings/mutes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds LLM triage on bot @mentions to determine if the user is chatting
or reporting bad behavior. Only 'report' intents trigger the 30-message
scan; 'chat' intents skip the scan and let ChatCog handle it.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Filter out non-dict entries from user_findings and handle non-dict
result to prevent 'str' object has no attribute 'setdefault' errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The triage LLM was blending context message content into its reasoning
for new messages (e.g., citing profanity from context when the new
message was just "I'll be here"). Added per-message [CONTEXT] tags
inline and strengthened the prompt to explicitly forbid referencing
context content in reasoning/scores.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The conversation analysis was re-scoring old messages alongside new ones,
causing users to get penalized repeatedly for already-scored messages.
A "--- NEW MESSAGES ---" separator now marks which messages are new, and
the prompt instructs the LLM to score only those. Also fixes bot-mention
detection to require an explicit @mention in message text rather than
treating reply-pings as scans (so toxic replies to bot warnings aren't
silently skipped).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the LLM returns text instead of a tool call for conversation
analysis, try parsing the content as JSON before giving up. Also
log what the model actually returns on failure for debugging.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from per-user message batching to per-channel conversation
analysis. The LLM now sees the full interleaved conversation with
relative timestamps, reply chains, and consecutive message collapsing
instead of isolated flat text per user.
Key changes:
- Fix gpt-5-nano temperature incompatibility (conditional temp param)
- Add mention-triggered scan: users @mention bot to analyze recent chat
- Refactor debounce buffer from (channel_id, user_id) to channel_id
- Replace per-message analyze_message() with analyze_conversation()
returning per-user findings from a single LLM call
- Add CONVERSATION_TOOL schema with coherence, topic, and game fields
- Compact message format: relative timestamps, reply arrows (→),
consecutive same-user message collapsing
- Separate mention scan tasks from debounce tasks
- Remove _store_context/_get_context (conversation block IS the context)
- Escalation timeout config: [30, 60, 120, 240] minutes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gpt-5-nano and other newer models require max_completion_tokens
instead of max_tokens. The new parameter is backwards compatible
with older models.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Slim down chat_roast.txt — remove anti-repetition rules that were
compensating for the local model (gpt-4o-mini handles this natively).
Remove disagreement detection from analysis prompt, tool schema, and
sentiment handler. Saves ~200 tokens per analysis call.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add frequency_penalty (0.8) and presence_penalty (0.6) to LLM chat
calls to discourage repeated tokens. Inject the bot's last 5 responses
into the system prompt so the model knows what to avoid. Strengthen
the roast prompt with explicit anti-repetition rules and remove example
lines the model was copying verbatim ("Real ___ energy", etc.).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove /no_think override from chat() so Qwen3 reasons before
generating responses (fixes incoherent word-salad replies)
- Analysis and image calls keep /no_think for speed
- Add varied roast style guidance (deadpan, sarcastic, blunt, etc.)
- Explicitly ban metaphors/similes in roast prompt
- Replace metaphor examples with direct roast examples
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Triage analysis runs on Qwen 8B (athena.lan) for free first-pass.
Escalation, chat, image roasts, and commands use GPT-4o via OpenAI.
Each tier gets its own base URL, API key, and concurrency settings.
Local models get /no_think and serialized requests automatically.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Default models: gpt-4o-mini (triage), gpt-4o (escalation)
- Remove Qwen-specific /no_think hacks
- Reduce timeout from 600s to 120s, increase concurrency semaphore to 4
- Support empty LLM_BASE_URL to use OpenAI directly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The vision model request was hanging indefinitely, freezing the bot.
The streaming loop had no timeout so if the model never returned
chunks, the bot would wait forever. Now times out after 2 minutes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LLM analysis now detects when two users are in a genuine
disagreement. When detected, the bot creates a native Discord
poll with each user's position as an option.
- Disagreement detection added to LLM analysis tool schema
- Polls last 4 hours with 1 hour per-channel cooldown
- LLM extracts topic, both positions, and usernames
- Configurable via polls section in config.yaml
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Log every LLM call (analysis, chat, image, raw_analyze) to a new
LlmLog table with request type, model, token counts, duration,
success/failure, and truncated request/response payloads. Enables
debugging prompt issues and tracking usage.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Send last ~8 messages from all users (not just others) as a
multi-line chat log with relative timestamps so the LLM can
better understand conversation flow and escalation patterns.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Detect when users discuss a game in the wrong channel (e.g. GTA talk
in #warzone) and send a friendly redirect to the correct channel.
Also add sexual_vulgar category and scoring rules so crude sexual
remarks directed at someone aren't softened by "lmao".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When @mentioned with an image attachment, the bot now roasts players
based on scoreboard screenshots using the vision model. Text-only
mentions continue to work as before.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The analyze_message and raw_analyze methods had no max_tokens limit,
causing thinking models (Qwen3-VL-32B-Thinking) to generate unlimited
reasoning tokens before responding — taking 5+ minutes per message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Serialize all LLM requests through an asyncio semaphore to prevent
overloading athena with concurrent requests
- Switch chat() to streaming so the typing indicator only appears once
the model starts generating (not during thinking/loading)
- Increase LLM timeout from 5 to 10 minutes for slow first loads
- Rename ollama_client.py to llm_client.py and self.ollama to self.llm
since the bot uses a generic OpenAI-compatible API
- Update embed labels from "Ollama" to "LLM"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>