docs: document spam detection features and new MCP tools

Add spam detection architecture, detection patterns, attachment risk scoring, and configuration details to CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:06:58 -05:00
parent f59b610d0b
commit c72e81601c
1 changed file with 59 additions and 0 deletions
@@ -25,12 +25,71 @@ This is an MCP (Model Context Protocol) server that provides Outlook email searc
 - `EmailSearchTools.cs` - MCP tool implementations decorated with `[McpServerTool]`:
  - `SearchEmails` - Search emails with filters (keywords, sender, subject, date range, folder, attachments, importance, category, flag status)
  - `ReadEmail` - Retrieve full email body by subject and date
+  - `MoveToJunk` - Move an email to the Junk folder
+  - `AnalyzeSpam` - Analyze a specific email for spam indicators and return a detailed report
+  - `ScanForSpam` - Scan recent emails and return spam scores for potential spam
 - `SearchFilters.cs` - Filter parameter container for email searches
 - `EmailResult.cs` - DTO for search results with factory method `FromMailItem()`

+**Spam Detection (`SpamDetection/` folder):**
+
+- `SpamDetector.cs` - Core rule-based spam detection engine with 50+ heuristic patterns
+- `SpamFeatures.cs` - Feature extraction model for spam analysis
+- `SpamAnalysisResult.cs` - Result container with score, likelihood, and red flags
+- `SpamDetectorConfig.cs` - Configuration model with customizable weights and keyword lists
+- `UrlAnalyzer.cs` - URL analysis (IP-based links, URL shorteners)
+- `AttachmentAnalyzer.cs` - Attachment risk scoring by file type
+- `FeatureExtractors.cs` - Helper methods for URL and header extraction
+
 **Dependencies:**

 - `ModelContextProtocol` - MCP SDK for .NET
 - `NetOfficeFw.Outlook` - COM interop wrapper for Outlook automation

 **Platform:** Windows-only (.NET 9.0-windows) due to Outlook COM dependency
+
+## Spam Detection Features
+
+The spam detection system uses a weighted scoring approach (0.0-1.0) with the following detection patterns:
+
+**Authentication Checks:**
+- SPF, DKIM, DMARC authentication failures
+- Reply-To domain mismatch
+
+**Identity Spoofing:**
+- Display name impersonation (vendor name + different domain)
+- Subject domain impersonation
+- Unicode/homoglyph attacks in domains
+- Generic sender names (noreply, notification, etc.)
+- Company subdomain spoofing (e.g., company.fakevoicemail.net)
+
+**Link/URL Analysis:**
+- IP address-based URLs
+- URL shorteners (bit.ly, tinyurl, etc.)
+- Suspicious TLDs (.xyz, .top, .click, etc.)
+
+**Content Red Flags:**
+- Keyword bait (invoice, urgent, verify, etc.)
+- Placeholder text (failed mail merge)
+- Single link with minimal text
+- Tracking pixels (1x1 images)
+- Zero-width Unicode characters (filter evasion)
+- Random reference IDs in subject
+- Timestamps in subject (automation indicator)
+
+**Attachment Risk:**
+- Weighted scoring by file type (0.0-1.0)
+- Critical: .exe, .scr (1.0)
+- High: .bat, .cmd, .vbs, .js (0.9-0.95)
+- Medium: .docm, .xlsm, .html, .iso (0.6-0.8)
+- Low: .zip, .rar (0.3-0.35)
+
+**Advanced Phishing Patterns:**
+- Fake quarantine/spam reports
+- Fake voicemail notifications
+- Fake system notifications (verify email, account suspended)
+- Cold email solicitation (SEO, web design spam)
+
+**Configuration:**
+
+Optional `SpamDetectorConfig.json` and `BlockList.txt` files can be placed in the application directory to customize detection patterns, keywords, trusted domains, and score weights.
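
The attachment tiers listed under "Attachment Risk" can be read as a lookup from file extension to a score range. The sketch below is illustrative only and is not part of this commit: it is written in Python for brevity (the project itself is C#), and the names `RISK_TIERS`, `attachment_risk`, and `message_attachment_risk` are hypothetical. The documentation gives ranges rather than per-extension scores, so the sketch keeps the ranges instead of guessing exact values. Scoring unlisted extensions as 0.0 and taking the maximum across attachments are both assumptions.

```python
from pathlib import PurePath

# Tiers and score ranges copied from the "Attachment Risk" list.
# The exact score of each extension inside its range is not documented,
# so the (low, high) range is kept as-is.
RISK_TIERS = {
    "critical": ((1.0, 1.0), {".exe", ".scr"}),
    "high": ((0.9, 0.95), {".bat", ".cmd", ".vbs", ".js"}),
    "medium": ((0.6, 0.8), {".docm", ".xlsm", ".html", ".iso"}),
    "low": ((0.3, 0.35), {".zip", ".rar"}),
}


def attachment_risk(filename):
    """Return (tier, (low, high)) for a file name.

    Assumption: extensions not in the list are treated as ("none", (0.0, 0.0)).
    Only the final extension is checked, so "invoice.pdf.exe" counts as ".exe".
    """
    ext = PurePath(filename).suffix.lower()
    for tier, (score_range, extensions) in RISK_TIERS.items():
        if ext in extensions:
            return tier, score_range
    return "none", (0.0, 0.0)


def message_attachment_risk(filenames):
    """Hypothetical aggregation: upper bound of the riskiest attachment."""
    return max((attachment_risk(name)[1][1] for name in filenames), default=0.0)
```

For example, `attachment_risk("Report.DOCM")` falls in the medium tier because the extension is compared case-insensitively, and a message with no attachments scores 0.0 under this aggregation.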