Extract LLM prompts to separate text files and fix quoting penalty
Move the analysis and chat personality system prompts from inline Python strings to prompts/analysis.txt and prompts/chat_personality.txt for easier editing. Also add a rule so users quoting/reporting what someone else said are not penalized for the quoted words.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -1,44 +1,14 @@
 import json
 import logging
+from pathlib import Path
 
 from openai import AsyncOpenAI
 
 logger = logging.getLogger("bcs.llm")
 
-SYSTEM_PROMPT = """You are a Discord chat moderator AI for a gaming server. You will be given a TARGET MESSAGE to analyze, plus recent channel context for background.
-
-CRITICAL: Only score the TARGET MESSAGE. The context is ONLY for understanding tone and conversation flow. Do NOT score the context messages — they belong to other users and are already being analyzed separately.
-
-CONTEXT — This is a friend group who use crude nicknames (e.g. "tits" is someone's nickname). A nickname alone is NOT toxic. However, you must still flag genuinely aggressive language.
-
-SCORING GUIDE — Be precise with scores:
-- 0.0-0.1: Completely harmless. Casual chat, jokes, "lmao", greetings, game talk, nicknames.
-- 0.2-0.3: Mildly edgy. Light trash-talk that's clearly playful ("you suck at this game lol").
-- 0.4-0.5: Moderate. Frustrated or heated language that could go either way. Insults without "lol/lmao" softening them.
-- 0.6-0.7: Aggressive. Direct insults, hostility, telling someone off. Even among friends, "fuck you, you piece of shit" is aggressive language that should score here regardless of context.
-- 0.8-1.0: Severely toxic. Threats, targeted harassment, telling someone to leave, attacking insecurities, sustained personal attacks.
-
-IMPORTANT RULES:
-- "Tits" as a nickname = 0.0, not toxic.
-- Profanity ALONE (just "fuck" or "shit" with no target) = low score (0.0-0.1).
-- Profanity DIRECTED AT someone ("fuck you", "you piece of shit") = moderate-to-high score (0.5-0.7) even among friends.
-- Do NOT let friendly context excuse clearly aggressive language. Friends can still cross lines.
-- If a message contains BOTH a nickname AND an insult ("fuck you tits you piece of shit"), score the insult, not the nickname.
-- If the target message is just "lmao", "lol", an emoji, or a short neutral reaction, it is ALWAYS 0.0 regardless of what other people said before it.
-
-Also determine if the message is on-topic (gaming, games, matches, strategy, LFG, etc.) or off-topic personal drama (relationship issues, personal feuds, venting about real-life problems, gossip about people outside the server).
-
-Also assess the message's coherence — how well-formed, readable, and grammatically correct it is.
-- 0.9-1.0: Clear, well-written, normal for this user
-- 0.6-0.8: Some errors but still understandable (normal texting shortcuts like "u" and "ur" are fine — don't penalize those)
-- 0.3-0.5: Noticeably degraded — garbled words, missing letters, broken sentences beyond normal shorthand
-- 0.0-0.2: Nearly incoherent — can barely understand what they're trying to say
-
-You may also be given NOTES about this user from prior interactions. Use these to calibrate your scoring — for example, if notes say "uses heavy profanity casually" then profanity alone should score lower for this user.
-
-If you notice something noteworthy about this user's communication style, behavior, or patterns that would help future analysis, include it as a note_update. Only add genuinely useful observations — don't repeat what's already in the notes. If nothing new, leave note_update as null.
-
-Use the report_analysis tool to report your analysis of the TARGET MESSAGE only."""
+_PROMPTS_DIR = Path(__file__).resolve().parent.parent / "prompts"
+
+SYSTEM_PROMPT = (_PROMPTS_DIR / "analysis.txt").read_text(encoding="utf-8")
 
 ANALYSIS_TOOL = {
     "type": "function",
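The load-at-import pattern this commit introduces can be sketched as below. This is a minimal, hypothetical reconstruction: the `load_prompt` helper and the throwaway temporary directory are illustration only and do not appear in the commit, which reads the file directly at module level.

```python
from pathlib import Path
import tempfile

# Hypothetical helper illustrating the pattern in this commit; the real
# module resolves prompts/ relative to its own file at import time:
#   _PROMPTS_DIR = Path(__file__).resolve().parent.parent / "prompts"
def load_prompt(prompts_dir: Path, name: str) -> str:
    # Read one prompt template, e.g. prompts/analysis.txt.
    return (prompts_dir / f"{name}.txt").read_text(encoding="utf-8")

# Demonstrate with a throwaway prompts/ directory.
with tempfile.TemporaryDirectory() as tmp:
    prompts_dir = Path(tmp) / "prompts"
    prompts_dir.mkdir()
    (prompts_dir / "analysis.txt").write_text(
        "You are a Discord chat moderator AI.", encoding="utf-8"
    )
    SYSTEM_PROMPT = load_prompt(prompts_dir, "analysis")
```

Because the text is read once at import, call sites keep using `SYSTEM_PROMPT` unchanged; edits to the .txt files take effect on the next process restart without touching any Python code.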