feat: add jealousy/possessiveness detection as toxicity category

The LLM can now flag possessive name-dropping, territorial behavior, and jealousy signals when users mention others who are not in the conversation. Scores feed into the existing drama pipeline for warnings/mutes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@@ -37,6 +37,7 @@ ANALYSIS_TOOL = {
 "hostile",
 "manipulative",
 "sexual_vulgar",
+"jealousy",
 "none",
 ],
 },
@@ -130,6 +131,7 @@ CONVERSATION_TOOL = {
 "hostile",
 "manipulative",
 "sexual_vulgar",
+"jealousy",
 "none",
 ],
 },
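
Both hunks add the same "jealousy" value to two separate category lists. A minimal sketch of one way to keep the two lists in sync and to catch labels outside the enum is shown here. It is not the code from this commit: the schema shape and the names `VISIBLE_CATEGORIES`, `make_category_property`, `ANALYSIS_TOOL_SKETCH`, `CONVERSATION_TOOL_SKETCH`, and `unknown_categories` are all hypothetical, and the list holds only the values visible in the diff (the real enum may have more entries above "hostile").

```python
# Hypothetical sketch: only the category values visible in the diff are listed.
# The real enum may contain further entries that the hunks do not show.
VISIBLE_CATEGORIES = ["hostile", "manipulative", "sexual_vulgar", "jealousy", "none"]


def make_category_property(categories):
    """Build a JSON-Schema-style array property limited to the given labels.

    The array-of-strings shape is an assumption, not taken from the commit.
    """
    return {
        "type": "array",
        "items": {"type": "string", "enum": list(categories)},
    }


# Both tool sketches derive their enum from one list, so a new category
# such as "jealousy" only has to be added in a single place.
ANALYSIS_TOOL_SKETCH = {"categories": make_category_property(VISIBLE_CATEGORIES)}
CONVERSATION_TOOL_SKETCH = {"categories": make_category_property(VISIBLE_CATEGORIES)}


def unknown_categories(reported, categories=VISIBLE_CATEGORIES):
    """Return the reported labels that are not part of the allowed enum."""
    allowed = set(categories)
    return [label for label in reported if label not in allowed]
```

With this layout, `unknown_categories(["jealousy", "envy"])` would report only `"envy"`, since `"jealousy"` is now an accepted label.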