Send a minimal 1-token completion during setup_hook so the model is
loaded before Discord messages start arriving. This avoids connection
errors and slow first responses after a restart.
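
A rough sketch of the warm-up idea, not the bot's actual code. FakeLLM,
complete() and warm_up() are hypothetical names used only for illustration;
the real client would issue the request to the OpenAI-compatible endpoint,
and the bot would await warm_up() from setup_hook.

```python
import asyncio


class FakeLLM:
    """Hypothetical stand-in for the bot's LLM client.

    It records requests instead of calling a real server.
    """

    def __init__(self):
        self.requests = []

    async def complete(self, prompt, max_tokens):
        self.requests.append((prompt, max_tokens))
        return "ok"


async def warm_up(llm):
    """Send one 1-token completion so the model is loaded before real traffic."""
    try:
        await llm.complete(prompt="hi", max_tokens=1)
        return True
    except (ConnectionError, asyncio.TimeoutError):
        # Assumption: a failed warm-up should not stop the bot from starting.
        return False
```

Returning a flag instead of raising is a design assumption here: it lets
start-up continue and leaves the first real request to pay the load cost.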
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Serialize all LLM requests through an asyncio semaphore to prevent
overloading athena with concurrent requests
- Switch chat() to streaming so the typing indicator only appears once
the model starts generating (not during thinking/loading)
- Increase LLM timeout from 5 to 10 minutes to allow for slow first loads
- Rename ollama_client.py to llm_client.py and self.ollama to self.llm
since the bot uses a generic OpenAI-compatible API
- Update embed labels from "Ollama" to "LLM"
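
The serialization in the first bullet could look roughly like the sketch
below. SerializedLLM, request() and demo() are hypothetical names, and the
sleep is a stand-in for the real HTTP call; the only point is that an
asyncio.Semaphore with a limit of 1 lets one request through at a time.

```python
import asyncio


class SerializedLLM:
    """Hypothetical wrapper that lets only one LLM request run at a time."""

    def __init__(self, max_concurrent=1):
        self._sem = asyncio.Semaphore(max_concurrent)
        self.active = 0  # requests currently inside the semaphore
        self.peak = 0    # highest value `active` ever reached

    async def request(self, prompt):
        async with self._sem:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                # Stand-in for the real call to the OpenAI-compatible API.
                await asyncio.sleep(0.01)
                return f"reply to {prompt}"
            finally:
                self.active -= 1


async def demo():
    # Create the wrapper inside the running event loop.
    llm = SerializedLLM()
    replies = await asyncio.gather(
        *(llm.request(f"msg {i}") for i in range(5))
    )
    return llm.peak, replies
```

Running asyncio.run(demo()) starts five requests together, yet peak stays
at 1 because each one waits for the semaphore before reaching the backend.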
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Discord bot for monitoring chat sentiment and tracking drama using an
Ollama LLM on athena.lan. Includes sentiment analysis, slash commands,
drama tracking, and SQL Server persistence via Docker Compose.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>