feat: add warning expiration and exclude moderated messages from context

The warning flag now auto-expires after a configurable duration
(warning_expiration_minutes, default 30). After expiry, the user must
be re-warned before a mute can be issued.
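
The expiry rule described above can be sketched roughly as follows. This
is a minimal illustration, not the code from this commit: UserState,
warning_active, and next_action are hypothetical names, and only the
warning_expiration_minutes setting and its default of 30 come from the
change itself.

```python
from dataclasses import dataclass
from typing import Optional

# Mirrors the warning_expiration_minutes setting (default 30).
WARNING_EXPIRATION_MINUTES = 30


@dataclass
class UserState:
    """Hypothetical per-user record; the real bot's state may look different."""
    warned_at: Optional[float] = None  # epoch seconds of the last warning, None if never warned


def warning_active(state: UserState, now: float,
                   expiration_minutes: int = WARNING_EXPIRATION_MINUTES) -> bool:
    """Return True while the last warning is younger than the expiration window."""
    if state.warned_at is None:
        return False
    return (now - state.warned_at) < expiration_minutes * 60


def next_action(state: UserState, now: float) -> str:
    """Mute only if an unexpired warning exists; otherwise record a fresh warning."""
    if warning_active(state, now):
        return "mute"
    state.warned_at = now
    return "warn"
```

Passing the current time in as an argument, rather than reading the clock
inside the function, keeps the expiry check easy to test.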

Messages that triggered moderation actions (warnings/mutes) are now
excluded from the LLM context window in both buffered analysis and
mention scans, preventing already-actioned content from influencing
future scoring. Moderated messages are tracked in memory, with the
bot's own reactions as a fallback for post-restart coverage.
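
One way the exclusion could work is sketched below, assuming a simplified
message type. Message, moderated_ids, mark_moderated, and build_context
are hypothetical names for illustration, and reacted_by_bot stands in for
whatever check the bot performs on its own reactions.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Message:
    """Hypothetical stand-in for a chat message."""
    id: int
    content: str
    reacted_by_bot: bool = False  # assumed flag: the bot left a moderation reaction


# In-memory tracking of messages that triggered a warning or mute.
# This set is lost on restart, which is why the reaction check exists.
moderated_ids: Set[int] = set()


def mark_moderated(message: Message) -> None:
    """Record that a moderation action was taken on this message."""
    moderated_ids.add(message.id)


def build_context(messages: List[Message]) -> List[Message]:
    """Drop already-actioned messages before building the LLM context window."""
    return [
        m for m in messages
        if m.id not in moderated_ids and not m.reacted_by_bot
    ]
```

The same filter would apply to both buffered analysis and mention scans,
since each builds its context from a list of recent messages.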

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:39:49 -05:00
parent 36df4cf5a6
commit eb7eb81621
6 changed files with 86 additions and 18 deletions

@@ -44,6 +44,7 @@ timeouts:
escalation_minutes: [30, 60, 120, 240] # Escalating timeout durations
offense_reset_minutes: 1440 # Reset offense counter after this much good behavior (24h)
warning_cooldown_minutes: 5 # Don't warn same user more than once per this window
warning_expiration_minutes: 30 # Warning expires after this long — user must be re-warned before mute
messages:
warning: "Easy there, {username}. The Breehavior Monitor is watching. \U0001F440"