Use gpt-4o-mini for chat/roasts via dedicated LLM_CHAT_MODEL

Add a separate llm_chat client so chat responses use a smarter model
(gpt-4o-mini) while analysis stays on the cheap local Qwen3-8B.
Falls back to llm_heavy if LLM_CHAT_MODEL is not set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 16:04:55 -05:00
parent e4239b25c3
commit c258994a2e
2 changed files with 18 additions and 3 deletions


@@ -194,7 +194,7 @@ class ChatCog(commands.Cog):
                 if m["role"] == "assistant"
             ][-5:]
-            response = await self.bot.llm.chat(
+            response = await self.bot.llm_chat.chat(
                 list(self._chat_history[ch_id]),
                 active_prompt,
                 on_first_token=start_typing,
@@ -312,7 +312,7 @@ class ChatCog(commands.Cog):
                 if m["role"] == "assistant"
             ][-5:]
-            response = await self.bot.llm.chat(
+            response = await self.bot.llm_chat.chat(
                 list(self._chat_history[ch_id]),
                 active_prompt,
                 recent_bot_replies=recent_bot_replies,
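The `LLM_CHAT_MODEL` fallback described in the commit message can be sketched as follows. This is a minimal, hypothetical illustration: `LLMClient`, `make_chat_client`, and the attribute names are assumptions for clarity, not the repository's actual API — only the behavior (use `LLM_CHAT_MODEL` if set, otherwise reuse the heavy analysis client) comes from the commit message.

```python
import os

class LLMClient:
    """Hypothetical stand-in for the bot's LLM client wrapper."""
    def __init__(self, model: str):
        self.model = model

def make_chat_client(llm_heavy: LLMClient) -> LLMClient:
    """Build the chat client: dedicated model if LLM_CHAT_MODEL is set,
    otherwise fall back to the existing heavy/analysis client."""
    chat_model = os.environ.get("LLM_CHAT_MODEL")
    if chat_model:
        return LLMClient(chat_model)  # e.g. "gpt-4o-mini"
    return llm_heavy                  # fallback: reuse llm_heavy

heavy = LLMClient("Qwen3-8B")
os.environ["LLM_CHAT_MODEL"] = "gpt-4o-mini"
print(make_chat_client(heavy).model)  # gpt-4o-mini
```

Keeping the fallback in one factory keeps chat and analysis decoupled: analysis code keeps calling `self.bot.llm`, while chat paths call `self.bot.llm_chat`, which may or may not be the same client underneath.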