Add two-tier LLM analysis with triage/escalation

Triage model (LLM_MODEL) handles every message cheaply. If toxicity
>= 0.25, off_topic, or coherence < 0.6, the message is re-analyzed
with the heavy model (LLM_ESCALATION_MODEL). Chat, image analysis,
/bcs-test, and /bcs-scan always use the heavy model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
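
The escalation rule described above can be sketched as a small predicate. The thresholds (toxicity >= 0.25, coherence < 0.6) come from the commit message; the `TriageResult` shape and field names are assumptions for illustration, not the actual types in this repo.

```python
from dataclasses import dataclass

# Thresholds from the commit message.
TOXICITY_THRESHOLD = 0.25
COHERENCE_THRESHOLD = 0.6

@dataclass
class TriageResult:
    """Hypothetical output of the cheap triage model (LLM_MODEL)."""
    toxicity: float
    off_topic: bool
    coherence: float

def needs_escalation(result: TriageResult) -> bool:
    """True when the message should be re-analyzed by LLM_ESCALATION_MODEL."""
    return (
        result.toxicity >= TOXICITY_THRESHOLD
        or result.off_topic
        or result.coherence < COHERENCE_THRESHOLD
    )
```

Chat, image analysis, /bcs-test, and /bcs-scan bypass this predicate entirely and go straight to the heavy model, as the diff below shows.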
2026-02-21 18:33:36 -05:00
parent 64e9474c99
commit b9bac899f9
5 changed files with 45 additions and 9 deletions
@@ -84,7 +84,7 @@ class ChatCog(commands.Cog):
image_attachment.filename,
user_text[:80],
)
-response = await self.bot.llm.analyze_image(
+response = await self.bot.llm_heavy.analyze_image(
image_bytes,
SCOREBOARD_ROAST,
user_text=user_text,
@@ -108,7 +108,7 @@ class ChatCog(commands.Cog):
{"role": "user", "content": f"{score_context}\n{message.author.display_name}: {content}"}
)
-response = await self.bot.llm.chat(
+response = await self.bot.llm_heavy.chat(
list(self._chat_history[ch_id]),
CHAT_PERSONALITY,
on_first_token=start_typing,