docs: add conversational memory design document

Outlines persistent memory system for making the bot a real conversational
participant that knows people and remembers past interactions. Uses existing
UserNotes column for permanent profiles and a new UserMemory table for
expiring context with LLM-assigned lifetimes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Committed 2026-02-26 12:41:28 -05:00 · commit d652c32063 (parent 196f8c8ae5)

# Conversational Memory Design
## Goal
Make the bot a real conversational participant that knows people, remembers past interactions, can answer general questions, and gives input based on accumulated context. People should be able to ask it questions and get thoughtful answers informed by who they are and what's happened before.
## Design Decisions
- **Memory approach**: Structured memory tables in existing MSSQL database
- **Learning mode**: Both passive (observing chat via sentiment analysis) and active (direct conversations)
- **Knowledge scope**: General knowledge + server/people awareness (no web search)
- **Permanent memory**: Stored in existing `UserState.UserNotes` column (repurposed as LLM-maintained profile)
- **Expiring memory**: New `UserMemory` table for transient context with LLM-assigned expiration
## Database Changes
### Repurposed: `UserState.UserNotes`
No schema change needed. The column already exists as `NVARCHAR(MAX)` and currently stores up to 10 timestamped observation lines. It will be repurposed as an LLM-maintained **permanent profile summary** — a compact paragraph of durable facts about a user.
Example content:
```
GTA Online grinder (rank 400+, wants to hit 500), sarcastic humor, works night shifts, hates battle royales. Has a dog named Rex. Banters with the bot, usually tries to get roasted. Been in the server since early 2024.
```
The LLM rewrites this field as a whole when new permanent facts emerge, rather than appending timestamped lines.
### New Table: `UserMemory`
Stores expiring memories — transient context that's relevant for days or weeks but not forever.
```sql
CREATE TABLE UserMemory (
    Id         BIGINT IDENTITY(1,1) PRIMARY KEY,
    UserId     BIGINT NOT NULL,
    Memory     NVARCHAR(500) NOT NULL,
    Topics     NVARCHAR(200) NOT NULL,  -- comma-separated tags
    Importance NVARCHAR(10) NOT NULL,   -- low, medium, high
    ExpiresAt  DATETIME2 NOT NULL,
    Source     NVARCHAR(20) NOT NULL,   -- 'chat' or 'passive'
    CreatedAt  DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
    INDEX IX_UserMemory_UserId (UserId),
    INDEX IX_UserMemory_ExpiresAt (ExpiresAt)
)
```
Example rows:
| Memory | Topics | Importance | ExpiresAt | Source |
|--------|--------|------------|-----------|--------|
| Frustrated about losing ranked matches in Warzone | warzone,fps,frustration | medium | +7d | passive |
| Said they're quitting Warzone for good | warzone,fps | high | +30d | chat |
| Drunk tonight, celebrating Friday | personal,celebration | low | +1d | chat |
| Excited about GTA DLC dropping next week | gta,dlc | medium | +7d | passive |
## Memory Extraction
### From Direct Conversations (ChatCog)
After the bot sends a chat reply, a **fire-and-forget background task** calls the triage LLM to extract memories from the conversation. This does not block the reply.
New LLM tool definition:
```python
MEMORY_EXTRACTION_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_memories",
        "parameters": {
            "type": "object",
            "properties": {
                "memories": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "memory": {
                                "type": "string",
                                "description": "A concise fact or observation worth remembering."
                            },
                            "topics": {
                                "type": "array",
                                "items": {"type": "string"},
                                "description": "Topic tags for retrieval (e.g., 'gta', 'personal', 'warzone')."
                            },
                            "expiration": {
                                "type": "string",
                                "enum": ["1d", "3d", "7d", "30d", "permanent"],
                                "description": "How long this memory stays relevant. Use 'permanent' for stable facts about the person."
                            },
                            "importance": {
                                "type": "string",
                                "enum": ["low", "medium", "high"],
                                "description": "How important this memory is for future interactions."
                            }
                        },
                        "required": ["memory", "topics", "expiration", "importance"]
                    },
                    "description": "Memories to store. Only include genuinely new or noteworthy information."
                },
                "profile_update": {
                    "type": ["string", "null"],
                    "description": "If a permanent fact was learned, provide the full updated profile summary incorporating the new info. Null if no profile changes needed."
                }
            },
            "required": ["memories"]
        }
    }
}
```
The extraction prompt receives:
- The conversation that just happened (from `_chat_history`)
- The user's current profile (`UserNotes`)
- Instructions to only extract genuinely new information
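A sketch of how that request payload could be assembled. The prompt text and history shape are assumptions; `MEMORY_EXTRACTION_PROMPT` is the constant proposed in the integration table:

```python
# Abbreviated placeholder; the real system prompt would be longer.
MEMORY_EXTRACTION_PROMPT = "You extract memories from Discord conversations..."

def build_extraction_messages(history: list[dict], profile: str) -> list[dict]:
    """Build the chat-completions payload for the background extraction call.
    history entries are assumed to be {'author': ..., 'content': ...} dicts."""
    convo = "\n".join(f"{t['author']}: {t['content']}" for t in history)
    return [
        {"role": "system", "content": MEMORY_EXTRACTION_PROMPT},
        {"role": "user", "content": (
            f"Current profile:\n{profile or '(none)'}\n\n"
            f"Conversation:\n{convo}\n\n"
            "Extract only genuinely new information."
        )},
    ]
```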
### From Passive Observation (SentimentCog)
The existing `note_update` field from analysis results currently feeds `DramaTracker.update_user_notes()`. This will be enhanced:
- If `note_update` contains a durable fact (the LLM can flag this), rewrite the `UserNotes` profile
- If it is a transient observation, insert it into `UserMemory` with a 7d default expiration
- The analysis tool's `note_update` field description is updated so the LLM indicates whether the note is permanent or transient
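The routing decision itself is small. In this sketch, `is_permanent` stands in for the new flag the analysis tool would gain, and the helper name is illustrative:

```python
from datetime import datetime, timedelta, timezone

def route_note_update(user_id: int, note: str, is_permanent: bool) -> tuple:
    """Send durable facts toward the UserNotes profile rewrite,
    transient observations into UserMemory with the 7d default."""
    if is_permanent:
        # Real code: feed into the LLM profile rewrite for UserNotes.
        return ("profile", user_id, note)
    expires = datetime.now(timezone.utc) + timedelta(days=7)
    return ("memory", user_id, note, expires)
```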
## Memory Retrieval at Chat Time
When building context for a chat reply, memories are pulled in layers and injected as a structured block:
### Layer 1: Profile (always included)
```python
profile = user_state.user_notes # permanent profile summary
```
### Layer 2: Recent Expiring Memories (last 5 by CreatedAt)
```sql
SELECT TOP 5 Memory, Topics, CreatedAt
FROM UserMemory
WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME()
ORDER BY CreatedAt DESC
```
### Layer 3: Topic-Matched Memories
Extract keywords from the current message, match against `Topics` column:
```sql
SELECT TOP 5 Memory, Topics, CreatedAt
FROM UserMemory
WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME()
  AND (Topics LIKE '%gta%' OR Topics LIKE '%warzone%') -- dynamic from message keywords
ORDER BY
    CASE Importance WHEN 'high' THEN 3 WHEN 'medium' THEN 2 ELSE 1 END DESC,
    CreatedAt DESC
```
```
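Since the topic filter is built from user-derived keywords, it should use parameter placeholders rather than interpolating keywords into the SQL text. A sketch with pyodbc-style `?` parameters (helper name illustrative), ranking importance via a `CASE` so that 'high' sorts first:

```python
def topic_query(user_id: int, keywords: list[str]) -> tuple[str, list]:
    """Return (sql, params) for the topic-matched memory lookup.
    One LIKE placeholder per keyword; keywords never touch the SQL text."""
    likes = " OR ".join("Topics LIKE ?" for _ in keywords)
    sql = (
        "SELECT TOP 5 Memory, Topics, CreatedAt FROM UserMemory "
        "WHERE UserId = ? AND ExpiresAt > SYSUTCDATETIME() "
        f"AND ({likes}) "
        "ORDER BY CASE Importance WHEN 'high' THEN 3 "
        "WHEN 'medium' THEN 2 ELSE 1 END DESC, CreatedAt DESC"
    )
    return sql, [user_id] + [f"%{kw}%" for kw in keywords]
```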
### Layer 4: Channel Bias
If in a game channel (e.g., `#gta-online`), add the game name as a topic filter to boost relevant memories.
### Injected Context Format
```
[What you know about {username}:]
Profile: GTA grinder (rank 400+), sarcastic, works night shifts, hates BRs. Banters with the bot.
Recent: Said they're quitting Warzone (2 days ago) | Excited about GTA DLC (yesterday)
Relevant: Mentioned trying to hit rank 500 in GTA (3 weeks ago)
```
Target: ~200-350 tokens of memory context per chat interaction (see Token Budget).
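Assembling the block from the layers is straightforward string work. A sketch, assuming the layer helpers return `(memory_text, age_description)` pairs:

```python
def build_memory_block(username: str, profile: str,
                       recent: list[tuple[str, str]],
                       relevant: list[tuple[str, str]]) -> str:
    """Render the injected context block; empty layers are omitted entirely."""
    lines = [f"[What you know about {username}:]"]
    if profile:
        lines.append(f"Profile: {profile}")
    if recent:
        lines.append("Recent: " + " | ".join(f"{m} ({age})" for m, age in recent))
    if relevant:
        lines.append("Relevant: " + " | ".join(f"{m} ({age})" for m, age in relevant))
    return "\n".join(lines)
```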
## Memory Maintenance
### Pruning (daily background task)
```sql
DELETE FROM UserMemory WHERE ExpiresAt < SYSUTCDATETIME()
```
Also enforce a per-user cap (50 memories). When exceeded, delete oldest low-importance memories first:
```sql
-- Delete excess memories beyond the cap. Rows are ranked so high-importance,
-- recent memories sort first; everything past position 50 is deleted, so the
-- oldest low-importance memories go first.
DELETE FROM UserMemory
WHERE Id IN (
    SELECT Id FROM UserMemory
    WHERE UserId = ?
    ORDER BY
        CASE Importance WHEN 'high' THEN 3 WHEN 'medium' THEN 2 ELSE 1 END DESC,
        CreatedAt DESC
    OFFSET 50 ROWS
)
```
### Profile Consolidation
When a `permanent` memory is extracted, the LLM provides an updated `profile_update` string that incorporates the new fact into the existing profile. This replaces `UserNotes` directly — no separate consolidation task needed.
## Integration Changes
| File | Changes |
|------|---------|
| `utils/database.py` | Add `UserMemory` table creation in schema. Add CRUD: `save_memory()`, `get_recent_memories()`, `get_memories_by_topics()`, `prune_expired_memories()`, `prune_excess_memories()`. Update `save_user_state()` (no schema change needed). |
| `utils/llm_client.py` | Add `extract_memories()` method with `MEMORY_EXTRACTION_TOOL`. Add `MEMORY_EXTRACTION_PROMPT` for the extraction system prompt. |
| `utils/drama_tracker.py` | `update_user_notes()` changes from appending timestamped lines to replacing the full profile string when a profile update is provided. Keep backward compat for non-profile note_updates during transition. |
| `cogs/chat.py` | At chat time: query DB for memories, build memory context block, inject into prompt. After reply: fire-and-forget memory extraction task. |
| `cogs/sentiment/` | Route `note_update` from analysis into `UserMemory` table (expiring) or `UserNotes` profile update (permanent). |
| `bot.py` | Start daily memory pruning background task on bot ready. |
## What Stays the Same
- In-memory `_chat_history` deque (10 turns per channel) for immediate conversation coherence
- All existing moderation/analysis logic
- Mode system and personality prompts (memory context is additive)
- `UserState` table schema (no changes)
- Existing DramaTracker hydration flow
## Token Budget
Per chat interaction:
- Profile summary: ~50-100 tokens
- Recent memories (5): ~75-125 tokens
- Topic-matched memories (5): ~75-125 tokens
- **Total memory context: ~200-350 tokens**
Memory extraction call (background, triage model): ~500 input tokens, ~200 output tokens per conversation.