Reduce false positives from legitimate service traffic

Fix the .git/ regex pattern to require a leading slash, so Gitea
git-protocol URLs such as /user/repo.git/info/refs no longer trigger
"Sensitive File Probe" alerts.
Add infrastructure context to the LLM system prompt describing
Gitea, Nextcloud, Immich, and Gotify traffic patterns so the
LLM does not flag normal self-hosted service activity.
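
The effect of the regex change can be checked in isolation. The sketch
below is illustrative only: the two patterns are copied from the diff,
but the names OLD_PATTERN, NEW_PATTERN, SAMPLE_PATHS, and flagged() are
hypothetical, and it assumes the patterns are applied with re.search
against the request path, which this commit does not show.

```python
import re

# Patterns copied from the diff; the variable names are hypothetical.
OLD_PATTERN = re.compile(r"(?:\.env|\.git/|\.aws/|\.ssh/)")
NEW_PATTERN = re.compile(r"(?:\.env|/\.git/|\.aws/|\.ssh/)")

# Sample request paths: two Gitea git-protocol URLs and one real probe.
SAMPLE_PATHS = [
    "/user/repo.git/info/refs",
    "/user/repo.git/git-upload-pack",
    "/.git/config",
]


def flagged(pattern: re.Pattern, path: str) -> bool:
    """Return True if the pattern matches anywhere in the path."""
    # Assumption: matching uses re.search (substring match), not fullmatch.
    return pattern.search(path) is not None


if __name__ == "__main__":
    for path in SAMPLE_PATHS:
        print(f"{path}: old={flagged(OLD_PATTERN, path)} new={flagged(NEW_PATTERN, path)}")
```

Under that assumption, the old pattern flags all three paths, while the
new one flags only /.git/config, since "repo.git/" has no slash directly
before ".git/".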

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 21:07:37 -05:00
parent 5b86573b62
commit b13e69a44f

@@ -56,7 +56,7 @@ IMMEDIATE_ALERT_PATTERNS = [
(r"(?:;\s*(?:ls|cat|wget|curl|bash|sh|nc)\s)", "Command Injection"),
(r"(?:wp-login|wp-admin|xmlrpc\.php).*(?:POST|HEAD)", "WordPress Scan"),
(r"(?:phpmyadmin|pma|mysqladmin)", "DB Admin Scan"),
-(r"(?:\.env|\.git/|\.aws/|\.ssh/)", "Sensitive File Probe"),
+(r"(?:\.env|/\.git/|\.aws/|\.ssh/)", "Sensitive File Probe"),
]
SYSTEM_PROMPT = """/no_think
@@ -89,6 +89,19 @@ After your analysis (and any tool calls), respond with JSON only (no markdown, n
If nothing suspicious, return: {"suspicious": false, "findings": [], "summary": "No suspicious activity detected"}
+IMPORTANT CONTEXT about the infrastructure behind this reverse proxy (Traefik):
+- Gitea (git server): Expect git clone/push/pull traffic with URLs like /user/repo.git/info/refs, /user/repo.git/git-upload-pack, etc. This is NORMAL git protocol traffic, not sensitive file probing. Gitea also serves web UI pages for browsing repositories.
+- Nextcloud (cloud storage): Expect heavy API traffic including WebDAV, OCS API calls, app-specific endpoints (bookmarks, calendar, contacts sync). Burst of requests from a single IP to Nextcloud is normal client sync behavior, NOT reconnaissance.
+- Immich (photo management): Expect API calls for photo sync and uploads from mobile/desktop clients.
+- Gotify (push notifications): Expect WebSocket connections and message API calls.
+- Other self-hosted services behind this proxy generate legitimate automated traffic patterns.
+Do NOT flag the following as suspicious:
+- Git protocol operations to Gitea repositories (even with 401 responses — git auth negotiation starts with a 401)
+- Nextcloud sync bursts (many rapid requests from one IP to Nextcloud endpoints)
+- Authenticated API traffic to any of the above services
+- Standard browser navigation patterns to any hosted service
Be conservative - normal traffic like health checks, static assets, and authenticated user activity is not suspicious."""
# Tool definitions for the LLM (OpenAI function-calling format)