AJ Isaacs 2633bbf37a Initial commit
Add Discord Archive Manager project with:
- Entity Framework Core data models for Discord exports
- JSON import service for processing Discord chat exports
- Archive service for managing imported data
- Docker configuration for containerized deployment

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 12:26:38 -05:00

Discord Archive Manager

A .NET 8 console application that parses DiscordChatExporter JSON exports and stores them in MSSQL with content-hashed image storage.

Features

  • Parses DiscordChatExporter JSON exports
  • Stores messages, users, channels, attachments, embeds, and reactions in MSSQL
  • Content-addressed image storage using SHA256 hashing (deduplicates identical files)
  • Tracks user profile changes over time via snapshots
  • Archives processed JSON files
  • Idempotent processing (skips already-processed files)

Project Structure

DiscordArchiveManager/
├── src/DiscordArchiveManager/
│   ├── Program.cs              # Entry point
│   ├── appsettings.json        # Configuration
│   ├── Models/
│   │   ├── DiscordExport.cs    # JSON deserialization models
│   │   └── Entities/           # EF Core entities
│   ├── Data/
│   │   └── DiscordArchiveContext.cs
│   └── Services/
│       ├── JsonImportService.cs
│       ├── ImageHashService.cs
│       └── ArchiveService.cs
├── Dockerfile
├── docker-compose.yml
└── README.md

Database Schema

  • Guilds: Discord servers
  • Channels: Text channels within guilds
  • Users: Discord users (basic info)
  • UserSnapshots: Historical user profile data (nickname, color, avatar)
  • Messages: Chat messages
  • Attachments: Files attached to messages (stored with content hash)
  • Embeds: Rich embeds in messages
  • Reactions: Emoji reactions on messages
  • Mentions: User mentions in messages
  • ProcessedFiles: Tracking for imported files
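
The UserSnapshots table implies a change-detection step: a new row is only needed when a tracked profile field differs from the most recent snapshot for that user. The Python sketch below illustrates that idea. It is not the project's C# code, and the names `Profile`, `SnapshotLog`, and `record` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Profile:
    """Tracked profile fields, taken from the UserSnapshots description."""
    nickname: str
    color: Optional[str]
    avatar: Optional[str]


@dataclass
class SnapshotLog:
    """In-memory stand-in for the UserSnapshots table (hypothetical)."""
    history: Dict[str, List[Profile]] = field(default_factory=dict)

    def record(self, user_id: str, profile: Profile) -> bool:
        """Append a snapshot only if it differs from the latest one."""
        snapshots = self.history.setdefault(user_id, [])
        if snapshots and snapshots[-1] == profile:
            return False  # unchanged profile: no new snapshot
        snapshots.append(profile)
        return True
```

Comparing against only the latest snapshot keeps the history compact while still recording a user who changes a nickname and later changes it back.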

Image Storage

Images are stored using a content-addressed system:

  1. Calculate SHA256 hash of the file
  2. Store at /images/{hash[0:2]}/{hash[2:4]}/{hash}.{ext}

Example: A file with hash a1b2c3d4e5f6... and extension .png is stored at:

/images/a1/b2/a1b2c3d4e5f6....png

Benefits:

  • Automatic deduplication (identical files share storage)
  • Even distribution across directories (each of the two levels is named by two hex characters, so up to 256 subdirectories per level)
  • Fast lookup by hash

Configuration

appsettings.json

{
  "ConnectionStrings": {
    "Discord": "Server=192.168.10.99;Database=DiscordArchive;User Id=sa;Password=YourPassword;TrustServerCertificate=true"
  },
  "Paths": {
    "InputDirectory": "/app/input",
    "ArchiveDirectory": "/app/archive",
    "ImageDirectory": "/app/images"
  }
}

Environment Variables

Configuration can also be set via environment variables:

  • ConnectionStrings__Discord: Database connection string
  • Paths__InputDirectory: Directory to scan for JSON files
  • Paths__ArchiveDirectory: Directory that processed files are moved to
  • Paths__ImageDirectory: Directory for content-hashed images
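
In .NET configuration, a double underscore in an environment variable name stands in for the `:` hierarchy separator, so `Paths__InputDirectory` addresses the `InputDirectory` key inside the `Paths` section. The Python sketch below only illustrates that naming convention. `apply_env_overrides` is a hypothetical helper limited to two levels, and it assumes environment values take precedence over appsettings.json, which depends on how the application registers its configuration sources.

```python
from typing import Dict


def apply_env_overrides(
    settings: Dict[str, Dict[str, str]],
    environ: Dict[str, str],
) -> Dict[str, Dict[str, str]]:
    """Overlay SECTION__KEY variables onto a two-level settings dict."""
    # Copy each section so the caller's settings are left untouched.
    merged = {section: dict(values) for section, values in settings.items()}
    for name, value in environ.items():
        section, separator, key = name.partition("__")
        if separator and section in merged:
            merged[section][key] = value
    return merged
```

For example, passing `{"Paths__InputDirectory": "/data/in"}` as the environment replaces only that one path and leaves the connection string as configured in the file.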

Usage

With Docker Compose

  1. Create the input, archive, and images directories:

    mkdir -p input archive images
    
  2. Place DiscordChatExporter JSON exports in the input directory

  3. Update the connection string in docker-compose.yml

  4. Build and run:

    docker compose build
    docker compose up
    

Without Docker

  1. Ensure .NET 8 SDK is installed

  2. Update appsettings.json with your configuration

  3. Build and run:

    cd src/DiscordArchiveManager
    dotnet run
    

DiscordChatExporter Export Format

This tool expects JSON exports from DiscordChatExporter.

When exporting, ensure:

  • Format: JSON
  • "Download assets" is enabled (for local attachment storage)

The tool expects the assets directory, named after the JSON file with a _Files suffix, to sit alongside the JSON file:

exports/
├── general-2024-01-15.json
└── general-2024-01-15.json_Files/
    ├── attachment1.png
    └── avatar123.webp
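
Before running an import, it can help to check that each export has its assets directory next to it. The Python sketch below derives the `_Files` path from the JSON file name, following the layout shown above. The helper names are hypothetical and not part of the tool.

```python
from pathlib import Path
from typing import List, Tuple


def assets_dir_for(export_file: Path) -> Path:
    """Return the sibling '<name>.json_Files' directory for an export."""
    return export_file.with_name(export_file.name + "_Files")


def exports_with_assets(input_dir: Path) -> List[Tuple[Path, bool]]:
    """Pair each *.json export with a flag for whether its assets exist."""
    return [
        (export_file, assets_dir_for(export_file).is_dir())
        for export_file in sorted(input_dir.glob("*.json"))
        if export_file.is_file()
    ]
```

An export whose flag is `False` was probably created without "Download assets" enabled, or its `_Files` folder was left behind when the JSON file was copied.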

Processing Flow

  1. Scan input directory for *.json files
  2. For each unprocessed file:
    • Parse JSON into model objects
    • Upsert Guild and Channel (idempotent)
    • Upsert Users and create snapshots for profile changes
    • Insert Messages (skip if ID exists)
    • Process attachments:
      • Calculate SHA256 hash
      • Copy to content-hashed location if new
      • Reference existing path if duplicate
    • Process embeds, reactions, and mentions
  3. Archive the JSON file and its _Files folder
  4. Record the file in the ProcessedFiles table
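
The flow above can be sketched as a loop with two skip checks. This Python version is a simplified illustration, not the tool's C# code. It keeps only the scan, the per-file and per-message skip checks, archiving, and tracking, and leaves out the guild, channel, user, attachment, embed, reaction, and mention handling. The sets `processed` and `message_ids` stand in for the ProcessedFiles and Messages tables. Tracking files by name and reading a top-level `messages` array with `id` fields are assumptions.

```python
import json
import shutil
from pathlib import Path
from typing import Set


def import_exports(
    input_dir: Path,
    archive_dir: Path,
    processed: Set[str],
    message_ids: Set[str],
) -> int:
    """Import every unprocessed *.json export; return the new message count."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    inserted = 0
    for export_file in sorted(input_dir.glob("*.json")):
        if export_file.name in processed:
            continue  # already imported on an earlier run
        export = json.loads(export_file.read_text(encoding="utf-8"))
        for message in export.get("messages", []):
            if message["id"] in message_ids:
                continue  # skip if the message ID already exists
            message_ids.add(message["id"])
            inserted += 1
        # Archive the JSON file and its _Files folder, if present.
        assets = export_file.with_name(export_file.name + "_Files")
        shutil.move(str(export_file), str(archive_dir / export_file.name))
        if assets.is_dir():
            shutil.move(str(assets), str(archive_dir / assets.name))
        processed.add(export_file.name)  # record only after a full import
    return inserted
```

Recording the file last means an import that fails partway is retried on the next run, and the message ID check keeps that retry from inserting duplicates.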

Re-running

The tool is safe to run multiple times:

  • Already-processed files are skipped (tracked in ProcessedFiles table)
  • Existing messages are not duplicated (checked by Discord message ID)
  • Duplicate images are not re-copied (checked by content hash)