Breehavior-Monitor/docs/plans/2026-02-26-conversational-memory-design.md
AJ Isaacs d652c32063 docs: add conversational memory design document
Outlines persistent memory system for making the bot a real conversational
participant that knows people and remembers past interactions. Uses existing
UserNotes column for permanent profiles and a new UserMemory table for
expiring context with LLM-assigned lifetimes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:41:28 -05:00

Conversational Memory Design

Goal

Make the bot a real conversational participant that knows people, remembers past interactions, can answer general questions, and gives input based on accumulated context. People should be able to ask it questions and get thoughtful answers informed by who they are and what's happened before.

Design Decisions

  • Memory approach: Structured memory tables in existing MSSQL database
  • Learning mode: Both passive (observing chat via sentiment analysis) and active (direct conversations)
  • Knowledge scope: General knowledge + server/people awareness (no web search)
  • Permanent memory: Stored in existing UserState.UserNotes column (repurposed as LLM-maintained profile)
  • Expiring memory: New UserMemory table for transient context with LLM-assigned expiration

Database Changes

Repurposed: UserState.UserNotes

No schema change needed. The column already exists as NVARCHAR(MAX). Currently stores timestamped observation lines (max 10). Will be repurposed as an LLM-maintained permanent profile summary — a compact paragraph of durable facts about a user.

Example content:

GTA Online grinder (rank 400+, wants to hit 500), sarcastic humor, works night shifts, hates battle royales. Has a dog named Rex. Banters with the bot, usually tries to get roasted. Been in the server since early 2024.

The LLM rewrites this field as a whole when new permanent facts emerge, rather than appending timestamped lines.

New Table: UserMemory

Stores expiring memories — transient context that's relevant for days or weeks but not forever.

CREATE TABLE UserMemory (
    Id          BIGINT IDENTITY(1,1) PRIMARY KEY,
    UserId      BIGINT NOT NULL,
    Memory      NVARCHAR(500) NOT NULL,
    Topics      NVARCHAR(200) NOT NULL,       -- comma-separated tags
    Importance  NVARCHAR(10) NOT NULL,         -- low, medium, high
    ExpiresAt   DATETIME2 NOT NULL,
    Source      NVARCHAR(20) NOT NULL,         -- 'chat' or 'passive'
    CreatedAt   DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
    INDEX IX_UserMemory_UserId (UserId),
    INDEX IX_UserMemory_ExpiresAt (ExpiresAt)
)

Example rows:

| Memory | Topics | Importance | ExpiresAt | Source |
| --- | --- | --- | --- | --- |
| Frustrated about losing ranked matches in Warzone | warzone,fps,frustration | medium | +7d | passive |
| Said they're quitting Warzone for good | warzone,fps | high | +30d | chat |
| Drunk tonight, celebrating Friday | personal,celebration | low | +1d | chat |
| Excited about GTA DLC dropping next week | gta,dlc | medium | +7d | passive |

Memory Extraction

From Direct Conversations (ChatCog)

After the bot sends a chat reply, a fire-and-forget background task calls the triage LLM to extract memories from the conversation. This does not block the reply.
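
A minimal sketch of this flow, assuming `extract_memories()` and `save_memory()` as the interfaces planned in this document; the class shape and the other names here are illustrative glue, not the actual ChatCog:

```python
import asyncio

class ChatCogSketch:
    """Illustrative only: reply first, extract memories in the background."""

    def __init__(self, llm_client, db):
        self.llm_client = llm_client
        self.db = db

    async def reply(self, channel, user_id, history):
        text = await self.llm_client.chat_reply(history)
        await channel.send(text)
        # Fire-and-forget: extraction runs after the reply and never blocks it.
        asyncio.create_task(self._extract_memories(user_id, history))
        return text

    async def _extract_memories(self, user_id, history):
        try:
            result = await self.llm_client.extract_memories(history)
            for memory in result.get("memories", []):
                await self.db.save_memory(user_id, memory)
        except Exception:
            # Extraction failures must never surface into chat.
            pass
```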

New LLM tool definition:

MEMORY_EXTRACTION_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_memories",
        "parameters": {
            "type": "object",
            "properties": {
                "memories": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "memory": {
                                "type": "string",
                                "description": "A concise fact or observation worth remembering."
                            },
                            "topics": {
                                "type": "array",
                                "items": {"type": "string"},
                                "description": "Topic tags for retrieval (e.g., 'gta', 'personal', 'warzone')."
                            },
                            "expiration": {
                                "type": "string",
                                "enum": ["1d", "3d", "7d", "30d", "permanent"],
                                "description": "How long this memory stays relevant. Use 'permanent' for stable facts about the person."
                            },
                            "importance": {
                                "type": "string",
                                "enum": ["low", "medium", "high"],
                                "description": "How important this memory is for future interactions."
                            }
                        },
                        "required": ["memory", "topics", "expiration", "importance"]
                    },
                    "description": "Memories to store. Only include genuinely new or noteworthy information."
                },
                "profile_update": {
                    "type": ["string", "null"],
                    "description": "If a permanent fact was learned, provide the full updated profile summary incorporating the new info. Null if no profile changes needed."
                }
            },
            "required": ["memories"]
        }
    }
}
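
One way to turn the tool's output into UserMemory rows is to map the expiration enum onto concrete timestamps. The mapping and helper below are an illustrative sketch, not part of the plan:

```python
from datetime import datetime, timedelta, timezone

# Illustrative mapping from the tool's expiration enum to lifetimes;
# 'permanent' deliberately has no entry here.
EXPIRATION_DELTAS = {
    "1d": timedelta(days=1),
    "3d": timedelta(days=3),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}

def memory_to_row(memory, user_id, source, now=None):
    """Convert one extracted memory into a UserMemory row dict.

    Returns None for 'permanent' memories: those belong in the UserNotes
    profile (via profile_update), not in the expiring table.
    """
    now = now or datetime.now(timezone.utc)
    delta = EXPIRATION_DELTAS.get(memory["expiration"])
    if delta is None:
        return None
    return {
        "UserId": user_id,
        "Memory": memory["memory"][:500],            # NVARCHAR(500) cap
        "Topics": ",".join(memory["topics"])[:200],  # comma-separated tags
        "Importance": memory["importance"],
        "ExpiresAt": now + delta,
        "Source": source,                            # 'chat' or 'passive'
    }
```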

The extraction prompt receives:

  • The conversation that just happened (from _chat_history)
  • The user's current profile (UserNotes)
  • Instructions to only extract genuinely new information
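
A sketch of how these three inputs could be assembled into the extraction call; the message shapes follow the OpenAI-style chat format, and the helper name and `(speaker, message)` history shape are assumptions:

```python
def build_extraction_messages(system_prompt, chat_history, profile):
    """Assemble the memory-extraction call from its three inputs.

    chat_history is a list of (speaker, message) pairs, mirroring the
    in-memory _chat_history deque; the exact shapes are illustrative.
    """
    transcript = "\n".join(f"{speaker}: {message}" for speaker, message in chat_history)
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": (
                f"Current profile:\n{profile or '(none)'}\n\n"
                f"Conversation:\n{transcript}\n\n"
                "Extract only genuinely new information."
            ),
        },
    ]
```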

From Passive Observation (SentimentCog)

The existing note_update field from analysis results currently feeds DramaTracker.update_user_notes(). This will be enhanced:

  • If note_update contains a durable fact (the LLM can flag this), update UserNotes profile
  • If it's transient observation, insert into UserMemory with a 7d default expiration
  • The analysis tool's note_update field description gets updated to indicate whether the note is permanent or transient
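
This routing could look roughly like the following; the `"low"` importance default and the optional topic tags are assumptions, not decisions made above:

```python
from datetime import datetime, timedelta, timezone

PASSIVE_DEFAULT_LIFETIME = timedelta(days=7)  # the 7d default above

def route_note_update(note_update, is_permanent, topics=()):
    """Decide where an analysis note_update goes.

    Returns ("profile", text) for durable facts (UserNotes rewrite) or
    ("memory", row) for transient context (UserMemory insert).
    """
    if is_permanent:
        return ("profile", note_update)
    return ("memory", {
        "Memory": note_update,
        "Topics": ",".join(topics),
        "Importance": "low",  # assumed default for passive observations
        "ExpiresAt": datetime.now(timezone.utc) + PASSIVE_DEFAULT_LIFETIME,
        "Source": "passive",
    })
```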

Memory Retrieval at Chat Time

When building context for a chat reply, memories are pulled in layers and injected as a structured block:

Layer 1: Profile (always included)

profile = user_state.user_notes  # permanent profile summary

Layer 2: Recent Expiring Memories (last 5 by CreatedAt)

SELECT TOP 5 Memory, Topics, CreatedAt
FROM UserMemory
WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME()
ORDER BY CreatedAt DESC

Layer 3: Topic-Matched Memories

Extract keywords from the current message, match against Topics column:

SELECT TOP 5 Memory, Topics, CreatedAt
FROM UserMemory
WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME()
  AND (Topics LIKE '%gta%' OR Topics LIKE '%warzone%')  -- dynamic from message keywords
ORDER BY Importance DESC, CreatedAt DESC

Layer 4: Channel Bias

If in a game channel (e.g., #gta-online), add the game name as a topic filter to boost relevant memories.
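
The keyword matching and channel bias might be sketched as follows; the tag vocabulary and channel-to-topic map are illustrative placeholders (in practice they would come from configuration or the tags already stored in UserMemory):

```python
import re

# Illustrative tag vocabulary and channel -> topic map.
KNOWN_TOPICS = {"gta", "warzone", "dlc", "fps", "personal", "celebration"}
CHANNEL_TOPICS = {"gta-online": "gta", "warzone": "warzone"}

def topics_for_query(message, channel_name):
    """Keyword topics from the message, plus a channel-bias topic."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    topics = sorted(words & KNOWN_TOPICS)
    channel_topic = CHANNEL_TOPICS.get(channel_name)
    if channel_topic and channel_topic not in topics:
        topics.append(channel_topic)
    return topics
```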

Injected Context Format

[What you know about {username}:]
Profile: GTA grinder (rank 400+), sarcastic, works night shifts, hates BRs. Banters with the bot.
Recent: Said they're quitting Warzone (2 days ago) | Excited about GTA DLC (yesterday)
Relevant: Mentioned trying to hit rank 500 in GTA (3 weeks ago)

Target: ~200-350 tokens of memory context per chat interaction (see Token Budget).
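
Assembling this block could be as simple as the following sketch; the function name and `(memory, age)` pair shape are assumptions:

```python
def format_memory_context(username, profile, recent, relevant):
    """Render the injected block; recent/relevant are (memory, age) pairs."""
    lines = [f"[What you know about {username}:]"]
    if profile:
        lines.append(f"Profile: {profile}")
    if recent:
        lines.append("Recent: " + " | ".join(f"{m} ({age})" for m, age in recent))
    if relevant:
        lines.append("Relevant: " + " | ".join(f"{m} ({age})" for m, age in relevant))
    return "\n".join(lines)
```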

Memory Maintenance

Pruning (daily background task)

DELETE FROM UserMemory WHERE ExpiresAt < SYSUTCDATETIME()

Also enforce a per-user cap (50 memories). When exceeded, delete oldest low-importance memories first:

-- Delete excess memories beyond the cap; ordering high importance
-- (and newest) first means those rows survive longest
DELETE FROM UserMemory
WHERE Id IN (
    SELECT Id FROM UserMemory
    WHERE UserId = ?
    ORDER BY
        CASE Importance WHEN 'high' THEN 3 WHEN 'medium' THEN 2 ELSE 1 END DESC,
        CreatedAt DESC
    OFFSET 50 ROWS
)
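
The pruning could run as a simple asyncio loop started from bot.py on ready; `prune_expired_memories()` and `prune_excess_memories()` are the CRUD helpers planned under Integration Changes, while the loop shape itself is a sketch:

```python
import asyncio

async def daily_memory_pruning(db, interval_hours=24):
    """Daily maintenance loop: drop expired rows, then enforce the per-user cap."""
    while True:
        await db.prune_expired_memories()
        await db.prune_excess_memories(cap=50)
        await asyncio.sleep(interval_hours * 3600)
```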

Profile Consolidation

When a permanent memory is extracted, the LLM provides an updated profile_update string that incorporates the new fact into the existing profile. This replaces UserNotes directly — no separate consolidation task needed.

Integration Changes

| File | Changes |
| --- | --- |
| utils/database.py | Add UserMemory table creation in schema. Add CRUD: save_memory(), get_recent_memories(), get_memories_by_topics(), prune_expired_memories(), prune_excess_memories(). Update save_user_state() (no schema change needed). |
| utils/llm_client.py | Add extract_memories() method with MEMORY_EXTRACTION_TOOL. Add MEMORY_EXTRACTION_PROMPT for the extraction system prompt. |
| utils/drama_tracker.py | update_user_notes() changes from appending timestamped lines to replacing the full profile string when a profile update is provided. Keep backward compat for non-profile note_updates during the transition. |
| cogs/chat.py | At chat time: query the DB for memories, build the memory context block, inject it into the prompt. After the reply: fire-and-forget memory extraction task. |
| cogs/sentiment/ | Route note_update from analysis into the UserMemory table (expiring) or a UserNotes profile update (permanent). |
| bot.py | Start the daily memory pruning background task on bot ready. |

What Stays the Same

  • In-memory _chat_history deque (10 turns per channel) for immediate conversation coherence
  • All existing moderation/analysis logic
  • Mode system and personality prompts (memory context is additive)
  • UserState table schema (no changes)
  • Existing DramaTracker hydration flow

Token Budget

Per chat interaction:

  • Profile summary: ~50-100 tokens
  • Recent memories (5): ~75-125 tokens
  • Topic-matched memories (5): ~75-125 tokens
  • Total memory context: ~200-350 tokens

Memory extraction call (background, triage model): ~500 input tokens, ~200 output tokens per conversation.