Discord Archive Manager
A .NET 8 console application that parses DiscordChatExporter JSON exports and stores them in MSSQL with content-hashed image storage.
Features
- Parses DiscordChatExporter JSON exports
- Stores messages, users, channels, attachments, embeds, and reactions in MSSQL
- Content-addressed image storage using SHA256 hashing (deduplicates identical files)
- Tracks user profile changes over time via snapshots
- Archives processed JSON files
- Idempotent processing (skips already-processed files)
Project Structure
DiscordArchiveManager/
├── src/DiscordArchiveManager/
│   ├── Program.cs                  # Entry point
│   ├── appsettings.json            # Configuration
│   ├── Models/
│   │   ├── DiscordExport.cs        # JSON deserialization models
│   │   └── Entities/               # EF Core entities
│   ├── Data/
│   │   └── DiscordArchiveContext.cs
│   └── Services/
│       ├── JsonImportService.cs
│       ├── ImageHashService.cs
│       └── ArchiveService.cs
├── Dockerfile
├── docker-compose.yml
└── README.md
Database Schema
- Guilds: Discord servers
- Channels: Text channels within guilds
- Users: Discord users (basic info)
- UserSnapshots: Historical user profile data (nickname, color, avatar)
- Messages: Chat messages
- Attachments: Files attached to messages (stored with content hash)
- Embeds: Rich embeds in messages
- Reactions: Emoji reactions on messages
- Mentions: User mentions in messages
- ProcessedFiles: Tracking for imported files
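To illustrate how Users and UserSnapshots might relate, the sketch below records a new snapshot only when a tracked profile field changes. It is a hypothetical in-memory illustration written in Python, not the project's EF Core code: the `Snapshot` fields mirror the columns named above (nickname, color, avatar), while `record_snapshot` and the choice to compare against the most recent snapshot are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Profile fields tracked over time (assumed to match UserSnapshots)."""
    nickname: str
    color: Optional[str]
    avatar: Optional[str]


def record_snapshot(history: Dict[str, List[Snapshot]], user_id: str,
                    current: Snapshot) -> bool:
    """Append `current` only if it differs from the user's latest snapshot.

    Returns True when a new snapshot was stored, False when nothing changed.
    """
    snapshots = history.setdefault(user_id, [])
    if snapshots and snapshots[-1] == current:
        return False  # profile unchanged since the last import
    snapshots.append(current)
    return True
```

With this approach, importing the same export twice adds no snapshots, while a changed nickname adds exactly one.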
Image Storage
Images are stored using a content-addressed system:
- Calculate SHA256 hash of the file
- Store at /images/{hash[0:2]}/{hash[2:4]}/{hash}.{ext}
Example: A file with hash a1b2c3d4e5f6... and extension .png is stored at:
/images/a1/b2/a1b2c3d4e5f6....png
Benefits:
- Automatic deduplication (identical files share storage)
- Even distribution across directories
- Fast lookup by hash
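The layout described in this section can be sketched in a few lines. This is an illustrative Python sketch rather than the project's C# code (the contents of ImageHashService.cs are not shown in this document). The function names `sha256_of`, `content_path`, and `store_file` are hypothetical, and both the lowercase hex digest and the handling of files without an extension are assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the lowercase hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_path(image_root: Path, file_hash: str, ext: str) -> Path:
    """Build {root}/{hash[0:2]}/{hash[2:4]}/{hash}.{ext}."""
    ext = ext.lstrip(".")
    # Assumption: a file without an extension is stored under the bare hash.
    name = f"{file_hash}.{ext}" if ext else file_hash
    return image_root / file_hash[0:2] / file_hash[2:4] / name


def store_file(source: Path, image_root: Path) -> Path:
    """Copy `source` into the content-addressed tree unless already present."""
    target = content_path(image_root, sha256_of(source), source.suffix)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
    return target
```

Two files with identical bytes and the same extension resolve to the same target path, so the second call to `store_file` copies nothing and simply returns the existing location.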
Configuration
appsettings.json
{
  "ConnectionStrings": {
    "Discord": "Server=192.168.10.99;Database=DiscordArchive;User Id=sa;Password=YourPassword;TrustServerCertificate=true"
  },
  "Paths": {
    "InputDirectory": "/app/input",
    "ArchiveDirectory": "/app/archive",
    "ImageDirectory": "/app/images"
  }
}
Environment Variables
Configuration can also be set via environment variables:
- ConnectionStrings__Discord: Database connection string
- Paths__InputDirectory: Directory to scan for JSON files
- Paths__ArchiveDirectory: Directory to move processed files
- Paths__ImageDirectory: Directory for content-hashed images
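The double underscore in these names follows the .NET configuration convention, in which `__` in an environment variable name stands for the `:` hierarchy separator, so `Paths__InputDirectory` addresses the `InputDirectory` key inside the `Paths` section. The Python sketch below only illustrates that mapping. `apply_env_overrides` is a hypothetical helper, not part of .NET or of this project, and it assumes environment variables take precedence over appsettings.json, which depends on how Program.cs builds its configuration.

```python
import copy


def apply_env_overrides(settings: dict, environ: dict) -> dict:
    """Return a copy of `settings` with Section__Key variables merged in.

    Note: .NET configuration keys are case-insensitive; this sketch is
    case-sensitive for simplicity.
    """
    merged = copy.deepcopy(settings)
    for name, value in environ.items():
        if "__" not in name:
            continue  # not a hierarchical key, ignore it
        *sections, leaf = name.split("__")
        node = merged
        for section in sections:
            node = node.setdefault(section, {})
        node[leaf] = value
    return merged
```

For example, passing `{"Paths__InputDirectory": "/data/in"}` replaces only that one path and leaves the connection string from the file untouched.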
Usage
With Docker Compose
- Create input/archive/images directories:
  mkdir -p input archive images
- Place DiscordChatExporter JSON exports in the input directory
- Update the connection string in docker-compose.yml
- Build and run:
  docker compose build
  docker compose up
Without Docker
- Ensure .NET 8 SDK is installed
- Update appsettings.json with your configuration
- Build and run:
  cd src/DiscordArchiveManager
  dotnet run
DiscordChatExporter Export Format
This tool expects JSON exports from DiscordChatExporter.
When exporting, ensure:
- Format: JSON
- "Download assets" is enabled (for local attachment storage)
The tool expects the _Files directory to be alongside the JSON file:
exports/
├── general-2024-01-15.json
└── general-2024-01-15.json_Files/
    ├── attachment1.png
    └── avatar123.webp
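A minimal sketch of locating the assets directory for an export, assuming its name is the full JSON file name with `_Files` appended, as in the layout above. `assets_dir_for` is a hypothetical helper name, and the example is in Python purely for illustration.

```python
from pathlib import Path
from typing import Optional


def assets_dir_for(json_path: Path) -> Optional[Path]:
    """Return the sibling '<name>.json_Files' directory, or None if absent."""
    candidate = json_path.with_name(json_path.name + "_Files")
    return candidate if candidate.is_dir() else None
```

Returning `None` lets a caller decide how to handle exports that were created without "Download assets" enabled.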
Processing Flow
- Scan input directory for *.json files
- For each unprocessed file:
  - Parse JSON into model objects
  - Upsert Guild and Channel (idempotent)
  - Upsert Users and create snapshots for profile changes
  - Insert Messages (skip if ID exists)
  - Process attachments:
    - Calculate SHA256 hash
    - Copy to content-hashed location if new
    - Reference existing path if duplicate
  - Process embeds, reactions, and mentions
  - Archive JSON file and _Files folder
  - Record in ProcessedFiles table
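The file-level and message-level checks in this flow can be sketched with in-memory sets standing in for the ProcessedFiles and Messages tables. This Python sketch is an illustration under stated assumptions, not the project's JsonImportService: `import_exports` is a hypothetical name, files are identified by file name (the document does not say how ProcessedFiles identifies a file), the `messages` and `id` field names are assumed from the DiscordChatExporter JSON format, and the upsert, attachment, and archive steps are omitted.

```python
import json
from pathlib import Path
from typing import Set


def import_exports(input_dir: Path, processed_files: Set[str],
                   message_ids: Set[str]) -> int:
    """Import every unprocessed *.json file; return the count of new messages."""
    inserted = 0
    for json_path in sorted(input_dir.glob("*.json")):
        if json_path.name in processed_files:
            continue  # file already imported on an earlier run
        # utf-8-sig tolerates an optional byte order mark
        export = json.loads(json_path.read_text(encoding="utf-8-sig"))
        for message in export.get("messages", []):
            if message["id"] in message_ids:
                continue  # message already stored, skip it
            message_ids.add(message["id"])
            inserted += 1
        processed_files.add(json_path.name)  # record only after success
    return inserted
```

Running the function a second time against the same directory and the same sets inserts nothing, which is the property that makes repeated runs safe.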
Re-running
The tool is safe to run multiple times:
- Already-processed files are skipped (tracked in ProcessedFiles table)
- Existing messages are not duplicated (checked by Discord message ID)
- Duplicate images are not re-copied (checked by content hash)