AJ Isaacs 8a06ddbd6e Support hybrid LLM: local Qwen triage + OpenAI escalation
Triage analysis runs on Qwen 8B (athena.lan) as a free first pass.
Escalation, chat, image roasts, and commands use GPT-4o via OpenAI.
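
A minimal sketch of the routing described above, assuming a simple lookup from task type to tier. The task identifiers, the `Tier` enum, and `tier_for` are hypothetical names for illustration; the commit names the task categories but not how the code dispatches them.

```python
from enum import Enum


class Tier(Enum):
    LOCAL = "local"  # Qwen 8B on athena.lan
    CLOUD = "cloud"  # GPT-4o via OpenAI


# Hypothetical task identifiers; only the categories come from the commit message.
_TASK_TIERS = {
    "triage": Tier.LOCAL,
    "escalation": Tier.CLOUD,
    "chat": Tier.CLOUD,
    "image_roast": Tier.CLOUD,
    "command": Tier.CLOUD,
}


def tier_for(task: str) -> Tier:
    """Return the tier that should handle the given task type."""
    return _TASK_TIERS[task]
```

With this mapping, `tier_for("triage")` selects the local tier and every other listed task goes to the OpenAI tier.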

Each tier gets its own base URL, API key, and concurrency settings.
Local models get /no_think and serialized requests automatically.
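
The per-tier settings might look roughly like the sketch below. `TierConfig`, its field names, and `prepare_prompt` are assumptions made for illustration, not the repository's actual API. In particular, the commit does not say where `/no_think` is placed in the request, so appending it to the prompt is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierConfig:
    """Hypothetical per-tier settings: endpoint, credentials, concurrency."""

    base_url: str
    api_key: str
    max_concurrency: int
    is_local: bool = False

    @property
    def effective_concurrency(self) -> int:
        # Local models are serialized: one request at a time,
        # regardless of the configured limit.
        return 1 if self.is_local else self.max_concurrency


def prepare_prompt(cfg: TierConfig, prompt: str) -> str:
    # Assumption: /no_think is appended to the prompt for local models.
    return f"{prompt} /no_think" if cfg.is_local else prompt
```

For example, a local tier built as `TierConfig("http://athena.lan/v1", "unused", 4, is_local=True)` (placeholder URL path and key) would report an effective concurrency of 1 and tag each prompt with `/no_think`, while a cloud tier would pass prompts through unchanged and keep its configured limit.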

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:20:07 -05:00
1.2 MiB
Languages
Python 98.8%
Shell 0.8%
Dockerfile 0.4%