Breehavior-Monitor/docs/plans/2026-02-26-conversational-memory-design.md
AJ Isaacs d652c32063 docs: add conversational memory design document
Outlines persistent memory system for making the bot a real conversational
participant that knows people and remembers past interactions. Uses existing
UserNotes column for permanent profiles and a new UserMemory table for
expiring context with LLM-assigned lifetimes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:41:28 -05:00

Conversational Memory Design

Goal

Make the bot a real conversational participant that knows people, remembers past interactions, can answer general questions, and gives input based on accumulated context. People should be able to ask it questions and get thoughtful answers informed by who they are and what's happened before.

Design Decisions

  • Memory approach: Structured memory tables in existing MSSQL database
  • Learning mode: Both passive (observing chat via sentiment analysis) and active (direct conversations)
  • Knowledge scope: General knowledge + server/people awareness (no web search)
  • Permanent memory: Stored in existing UserState.UserNotes column (repurposed as LLM-maintained profile)
  • Expiring memory: New UserMemory table for transient context with LLM-assigned expiration

Database Changes

Repurposed: UserState.UserNotes

No schema change needed. The column already exists as NVARCHAR(MAX). Currently stores timestamped observation lines (max 10). Will be repurposed as an LLM-maintained permanent profile summary — a compact paragraph of durable facts about a user.

Example content:

GTA Online grinder (rank 400+, wants to hit 500), sarcastic humor, works night shifts, hates battle royales. Has a dog named Rex. Banters with the bot, usually tries to get roasted. Been in the server since early 2024.

The LLM rewrites this field as a whole when new permanent facts emerge, rather than appending timestamped lines.

New Table: UserMemory

Stores expiring memories — transient context that's relevant for days or weeks but not forever.

CREATE TABLE UserMemory (
    Id          BIGINT IDENTITY(1,1) PRIMARY KEY,
    UserId      BIGINT NOT NULL,
    Memory      NVARCHAR(500) NOT NULL,
    Topics      NVARCHAR(200) NOT NULL,       -- comma-separated tags
    Importance  NVARCHAR(10) NOT NULL,         -- low, medium, high
    ExpiresAt   DATETIME2 NOT NULL,
    Source      NVARCHAR(20) NOT NULL,         -- 'chat' or 'passive'
    CreatedAt   DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
    INDEX IX_UserMemory_UserId (UserId),
    INDEX IX_UserMemory_ExpiresAt (ExpiresAt)
)

Example rows:

| Memory | Topics | Importance | ExpiresAt | Source |
| --- | --- | --- | --- | --- |
| Frustrated about losing ranked matches in Warzone | warzone,fps,frustration | medium | +7d | passive |
| Said they're quitting Warzone for good | warzone,fps | high | +30d | chat |
| Drunk tonight, celebrating Friday | personal,celebration | low | +1d | chat |
| Excited about GTA DLC dropping next week | gta,dlc | medium | +7d | passive |

Memory Extraction

From Direct Conversations (ChatCog)

After the bot sends a chat reply, a fire-and-forget background task calls the triage LLM to extract memories from the conversation. This does not block the reply.
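
A minimal sketch of this flow, assuming `extract_memories()` and `save_memory()` as the interfaces planned in this document; the class shape and the other names here are illustrative glue, not the actual ChatCog:

```python
import asyncio

class ChatCogSketch:
    """Illustrative only: reply first, extract memories in the background."""

    def __init__(self, llm_client, db):
        self.llm_client = llm_client
        self.db = db

    async def reply(self, channel, user_id, history):
        text = await self.llm_client.chat_reply(history)
        await channel.send(text)
        # Fire-and-forget: extraction runs after the reply and never blocks it.
        asyncio.create_task(self._extract_memories(user_id, history))
        return text

    async def _extract_memories(self, user_id, history):
        try:
            result = await self.llm_client.extract_memories(history)
            for memory in result.get("memories", []):
                await self.db.save_memory(user_id, memory)
        except Exception:
            # Extraction failures must never surface into chat.
            pass
```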

New LLM tool definition:

MEMORY_EXTRACTION_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_memories",
        "parameters": {
            "type": "object",
            "properties": {
                "memories": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "memory": {
                                "type": "string",
                                "description": "A concise fact or observation worth remembering."
                            },
                            "topics": {
                                "type": "array",
                                "items": {"type": "string"},
                                "description": "Topic tags for retrieval (e.g., 'gta', 'personal', 'warzone')."
                            },
                            "expiration": {
                                "type": "string",
                                "enum": ["1d", "3d", "7d", "30d", "permanent"],
                                "description": "How long this memory stays relevant. Use 'permanent' for stable facts about the person."
                            },
                            "importance": {
                                "type": "string",
                                "enum": ["low", "medium", "high"],
                                "description": "How important this memory is for future interactions."
                            }
                        },
                        "required": ["memory", "topics", "expiration", "importance"]
                    },
                    "description": "Memories to store. Only include genuinely new or noteworthy information."
                },
                "profile_update": {
                    "type": ["string", "null"],
                    "description": "If a permanent fact was learned, provide the full updated profile summary incorporating the new info. Null if no profile changes needed."
                }
            },
            "required": ["memories"]
        }
    }
}
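
One way to turn the tool's output into UserMemory rows is to map the expiration enum onto concrete timestamps. The mapping and helper below are an illustrative sketch, not part of the plan:

```python
from datetime import datetime, timedelta, timezone

# Illustrative mapping from the tool's expiration enum to lifetimes;
# 'permanent' deliberately has no entry here.
EXPIRATION_DELTAS = {
    "1d": timedelta(days=1),
    "3d": timedelta(days=3),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}

def memory_to_row(memory, user_id, source, now=None):
    """Convert one extracted memory into a UserMemory row dict.

    Returns None for 'permanent' memories: those belong in the UserNotes
    profile (via profile_update), not in the expiring table.
    """
    now = now or datetime.now(timezone.utc)
    delta = EXPIRATION_DELTAS.get(memory["expiration"])
    if delta is None:
        return None
    return {
        "UserId": user_id,
        "Memory": memory["memory"][:500],            # NVARCHAR(500) cap
        "Topics": ",".join(memory["topics"])[:200],  # comma-separated tags
        "Importance": memory["importance"],
        "ExpiresAt": now + delta,
        "Source": source,                            # 'chat' or 'passive'
    }
```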

The extraction prompt receives:

  • The conversation that just happened (from _chat_history)
  • The user's current profile (UserNotes)
  • Instructions to only extract genuinely new information
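
A sketch of how these three inputs could be assembled into the extraction call; the message shapes follow the OpenAI-style chat format, and the helper name and `(speaker, message)` history shape are assumptions:

```python
def build_extraction_messages(system_prompt, chat_history, profile):
    """Assemble the memory-extraction call from its three inputs.

    chat_history is a list of (speaker, message) pairs, mirroring the
    in-memory _chat_history deque; the exact shapes are illustrative.
    """
    transcript = "\n".join(f"{speaker}: {message}" for speaker, message in chat_history)
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": (
                f"Current profile:\n{profile or '(none)'}\n\n"
                f"Conversation:\n{transcript}\n\n"
                "Extract only genuinely new information."
            ),
        },
    ]
```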

From Passive Observation (SentimentCog)

The existing note_update field from analysis results currently feeds DramaTracker.update_user_notes(). This will be enhanced:

  • If note_update contains a durable fact (the LLM can flag this), update UserNotes profile
  • If it's transient observation, insert into UserMemory with a 7d default expiration
  • The analysis tool's note_update field description gets updated to indicate whether the note is permanent or transient
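
This routing could look roughly like the following; the `"low"` importance default and the optional topic tags are assumptions, not decisions made above:

```python
from datetime import datetime, timedelta, timezone

PASSIVE_DEFAULT_LIFETIME = timedelta(days=7)  # the 7d default above

def route_note_update(note_update, is_permanent, topics=()):
    """Decide where an analysis note_update goes.

    Returns ("profile", text) for durable facts (UserNotes rewrite) or
    ("memory", row) for transient context (UserMemory insert).
    """
    if is_permanent:
        return ("profile", note_update)
    return ("memory", {
        "Memory": note_update,
        "Topics": ",".join(topics),
        "Importance": "low",  # assumed default for passive observations
        "ExpiresAt": datetime.now(timezone.utc) + PASSIVE_DEFAULT_LIFETIME,
        "Source": "passive",
    })
```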

Memory Retrieval at Chat Time

When building context for a chat reply, memories are pulled in layers and injected as a structured block:

Layer 1: Profile (always included)

profile = user_state.user_notes  # permanent profile summary

Layer 2: Recent Expiring Memories (last 5 by CreatedAt)

SELECT TOP 5 Memory, Topics, CreatedAt
FROM UserMemory
WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME()
ORDER BY CreatedAt DESC

Layer 3: Topic-Matched Memories

Extract keywords from the current message, match against Topics column:

SELECT TOP 5 Memory, Topics, CreatedAt
FROM UserMemory
WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME()
  AND (Topics LIKE '%gta%' OR Topics LIKE '%warzone%')  -- dynamic from message keywords
ORDER BY Importance DESC, CreatedAt DESC

Layer 4: Channel Bias

If in a game channel (e.g., #gta-online), add the game name as a topic filter to boost relevant memories.
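
The keyword matching and channel bias might be sketched as follows; the tag vocabulary and channel-to-topic map are illustrative placeholders (in practice they would come from configuration or the tags already stored in UserMemory):

```python
import re

# Illustrative tag vocabulary and channel -> topic map.
KNOWN_TOPICS = {"gta", "warzone", "dlc", "fps", "personal", "celebration"}
CHANNEL_TOPICS = {"gta-online": "gta", "warzone": "warzone"}

def topics_for_query(message, channel_name):
    """Keyword topics from the message, plus a channel-bias topic."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    topics = sorted(words & KNOWN_TOPICS)
    channel_topic = CHANNEL_TOPICS.get(channel_name)
    if channel_topic and channel_topic not in topics:
        topics.append(channel_topic)
    return topics
```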

Injected Context Format

[What you know about {username}:]
Profile: GTA grinder (rank 400+), sarcastic, works night shifts, hates BRs. Banters with the bot.
Recent: Said they're quitting Warzone (2 days ago) | Excited about GTA DLC (yesterday)
Relevant: Mentioned trying to hit rank 500 in GTA (3 weeks ago)

Target: ~200-350 tokens of memory context per chat interaction (see Token Budget).
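
Assembling this block could be as simple as the following sketch; the function name and `(memory, age)` pair shape are assumptions:

```python
def format_memory_context(username, profile, recent, relevant):
    """Render the injected block; recent/relevant are (memory, age) pairs."""
    lines = [f"[What you know about {username}:]"]
    if profile:
        lines.append(f"Profile: {profile}")
    if recent:
        lines.append("Recent: " + " | ".join(f"{m} ({age})" for m, age in recent))
    if relevant:
        lines.append("Relevant: " + " | ".join(f"{m} ({age})" for m, age in relevant))
    return "\n".join(lines)
```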

Memory Maintenance

Pruning (daily background task)

DELETE FROM UserMemory WHERE ExpiresAt < SYSUTCDATETIME()

Also enforce a per-user cap (50 memories). When exceeded, delete oldest low-importance memories first:

-- Delete excess memories beyond the cap; ordering high importance
-- (and newest) first means those rows survive longest
DELETE FROM UserMemory
WHERE Id IN (
    SELECT Id FROM UserMemory
    WHERE UserId = ?
    ORDER BY
        CASE Importance WHEN 'high' THEN 3 WHEN 'medium' THEN 2 ELSE 1 END DESC,
        CreatedAt DESC
    OFFSET 50 ROWS
)
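
The pruning could run as a simple asyncio loop started from bot.py on ready; `prune_expired_memories()` and `prune_excess_memories()` are the CRUD helpers planned under Integration Changes, while the loop shape itself is a sketch:

```python
import asyncio

async def daily_memory_pruning(db, interval_hours=24):
    """Daily maintenance loop: drop expired rows, then enforce the per-user cap."""
    while True:
        await db.prune_expired_memories()
        await db.prune_excess_memories(cap=50)
        await asyncio.sleep(interval_hours * 3600)
```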

Profile Consolidation

When a permanent memory is extracted, the LLM provides an updated profile_update string that incorporates the new fact into the existing profile. This replaces UserNotes directly — no separate consolidation task needed.

Integration Changes

| File | Changes |
| --- | --- |
| utils/database.py | Add UserMemory table creation in schema. Add CRUD: save_memory(), get_recent_memories(), get_memories_by_topics(), prune_expired_memories(), prune_excess_memories(). Update save_user_state() (no schema change needed). |
| utils/llm_client.py | Add extract_memories() method with MEMORY_EXTRACTION_TOOL. Add MEMORY_EXTRACTION_PROMPT for the extraction system prompt. |
| utils/drama_tracker.py | update_user_notes() changes from appending timestamped lines to replacing the full profile string when a profile update is provided. Keep backward compat for non-profile note_updates during the transition. |
| cogs/chat.py | At chat time: query the DB for memories, build the memory context block, inject it into the prompt. After the reply: fire-and-forget memory extraction task. |
| cogs/sentiment/ | Route note_update from analysis into the UserMemory table (expiring) or a UserNotes profile update (permanent). |
| bot.py | Start the daily memory pruning background task on bot ready. |

What Stays the Same

  • In-memory _chat_history deque (10 turns per channel) for immediate conversation coherence
  • All existing moderation/analysis logic
  • Mode system and personality prompts (memory context is additive)
  • UserState table schema (no changes)
  • Existing DramaTracker hydration flow

Token Budget

Per chat interaction:

  • Profile summary: ~50-100 tokens
  • Recent memories (5): ~75-125 tokens
  • Topic-matched memories (5): ~75-125 tokens
  • Total memory context: ~200-350 tokens

Memory extraction call (background, triage model): ~500 input tokens, ~200 output tokens per conversation.