docs: Add comprehensive messaging and memory lifecycle documentation by elmorem · Pull Request #41 · elmorem/ContextIQ

elmorem · 2025-12-12T03:24:08Z

Summary

This PR adds two technical deep-dive documents covering the core engines of ContextIQ: the messaging system and the full memory lifecycle, from sessions to consolidated memories.

New Documentation Files

1. docs/MESSAGING.md (3,681 lines, 101KB)

Complete technical guide to the RabbitMQ messaging system including:

Architecture & Configuration:

MessagingSettings with all environment variables
RabbitMQClient connection management with auto-reconnection
Queue and exchange declarations
Dead letter queue configuration
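
For readers who have not opened the doc yet, environment-driven messaging configuration can be sketched roughly as below. This is an illustration only: the class name `MessagingSettingsSketch`, the field names, the defaults, and the `RABBITMQ_*` variable names are all assumptions made here, not the actual `MessagingSettings` fields documented in docs/MESSAGING.md.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical defaults; the real values live in shared/messaging/config.py.
DEFAULT_URL = "amqp://guest:guest@localhost:5672/"
DEFAULT_PREFETCH = 10
DEFAULT_RECONNECT_DELAY = 5.0


@dataclass(frozen=True)
class MessagingSettingsSketch:
    """Illustrative stand-in for MessagingSettings (field names are assumed)."""

    url: str = DEFAULT_URL
    prefetch_count: int = DEFAULT_PREFETCH
    reconnect_delay_seconds: float = DEFAULT_RECONNECT_DELAY

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "MessagingSettingsSketch":
        # Read from the process environment unless a mapping is injected (useful in tests).
        env = os.environ if env is None else env
        return cls(
            url=env.get("RABBITMQ_URL", DEFAULT_URL),
            prefetch_count=int(env.get("RABBITMQ_PREFETCH_COUNT", DEFAULT_PREFETCH)),
            reconnect_delay_seconds=float(
                env.get("RABBITMQ_RECONNECT_DELAY", DEFAULT_RECONNECT_DELAY)
            ),
        )
```

Passing a plain dict to `from_env` keeps the sketch testable without touching the real environment.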

Core Components:

RabbitMQClient: Connection pooling, exchange/queue declaration, health checks
MessagePublisher: Publishing with priority, persistence, correlation IDs, RPC pattern
MessageConsumer: Consuming with prefetch, auto-ack, error handling, graceful shutdown
Queue Definitions: EXTRACTION_REQUESTS, CONSOLIDATION_REQUESTS, events, DLQ
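
As a rough illustration of the publishing options listed for MessagePublisher (priority, persistence, correlation IDs, RPC reply queues), the sketch below builds a plain dict of AMQP message properties. The function name `build_properties` and the dict layout are assumptions for this example; the real publisher's signature may differ. The one protocol fact relied on is that AMQP 0-9-1 uses `delivery_mode` 2 for persistent messages.

```python
import uuid
from typing import Optional

PERSISTENT = 2  # AMQP 0-9-1 delivery mode for messages that survive a broker restart
TRANSIENT = 1


def build_properties(
    priority: int = 0,
    persistent: bool = True,
    correlation_id: Optional[str] = None,
    reply_to: Optional[str] = None,
) -> dict:
    """Return message properties as a dict (hypothetical helper, not the real API)."""
    # RabbitMQ priority queues accept values up to 255; small ranges are typical.
    if not 0 <= priority <= 255:
        raise ValueError("priority must be between 0 and 255")
    props = {
        "priority": priority,
        "delivery_mode": PERSISTENT if persistent else TRANSIENT,
        # A correlation ID lets a reply or log line be matched to this request.
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "content_type": "application/json",
    }
    if reply_to is not None:
        # RPC pattern: the consumer publishes its response to this queue.
        props["reply_to"] = reply_to
    return props
```

In an RPC call the caller would set `reply_to` and later match the response by `correlation_id`.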

Message Flows:

Session event → Memory extraction request
Memory extraction → Consolidation trigger
Event publishing and consumption

Queue Patterns:

Work queue pattern for load balancing
Event queue pattern for pub/sub
Dead letter queue for failed messages
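
The dead letter pattern is usually wired up through optional queue arguments. The sketch below only assembles that argument dict. The `x-dead-letter-exchange`, `x-dead-letter-routing-key`, `x-message-ttl`, and `x-max-priority` keys are standard RabbitMQ queue arguments, while the helper name and the exchange and routing key values in the usage line are hypothetical and not taken from shared/messaging/queues.py.

```python
from typing import Optional


def work_queue_arguments(
    dead_letter_exchange: str,
    dead_letter_routing_key: str,
    message_ttl_ms: Optional[int] = None,
    max_priority: Optional[int] = None,
) -> dict:
    """Build the arguments dict for declaring a work queue backed by a DLQ."""
    args = {
        # Rejected or expired messages are republished to this exchange...
        "x-dead-letter-exchange": dead_letter_exchange,
        # ...using this routing key, so they land in the dead letter queue.
        "x-dead-letter-routing-key": dead_letter_routing_key,
    }
    if message_ttl_ms is not None:
        args["x-message-ttl"] = message_ttl_ms
    if max_priority is not None:
        args["x-max-priority"] = max_priority
    return args


# Hypothetical names, for illustration only.
example_args = work_queue_arguments("contextiq.dlx", "extraction.failed", max_priority=10)
```

A client library would receive this dict as the `arguments` of its queue declaration call.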

Reliability & Performance:

Publisher confirms for guaranteed delivery
Message persistence and durability
Prefetch control for flow management
Connection recovery and retry logic
Batch operations and connection pooling
Production clustering and HA setup
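
Connection recovery and retry logic typically comes down to retrying with capped exponential backoff. The following is a minimal sketch under that assumption; the helper names, the default delays, and the choice to retry only on `ConnectionError` are illustrative and not the behaviour of the actual RabbitMQClient.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base: float = 0.5, factor: float = 2.0,
                   max_delay: float = 30.0) -> List[float]:
    """Delays grow geometrically from `base` and are capped at `max_delay`."""
    return [min(base * factor ** i, max_delay) for i in range(attempts)]


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    base: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
) -> T:
    """Run `operation`, retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts, base=base)
    for attempt, delay in enumerate(delays, start=1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

Injecting `sleep` makes the backoff schedule easy to verify without waiting in tests.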

Documentation Quality:

100+ code examples covering all use cases
ASCII diagrams of message flows
Complete environment variable reference
Production deployment patterns
Monitoring and health check setup
Comprehensive troubleshooting guide

2. docs/MEMORY_LIFECYCLE.md (3,848 lines, 122KB)

Complete technical guide to the memory lifecycle - how ContextIQ transforms sessions into memories:

The Six Phases:

Session Events Collection - Conversation capture with structured events
Memory Extraction - LLM-powered extraction with Anthropic Claude
Embedding Generation - OpenAI text-embedding-3-small (1536 dimensions)
Memory Storage - PostgreSQL database + Qdrant vector store
Memory Consolidation - Similarity detection and merging with cosine similarity
Memory Retrieval - Semantic search and relevance ranking

Memory Extraction Pipeline:

MemoryGenerationWorker: Background worker consuming from RabbitMQ
MemoryGenerationProcessor: 4-step pipeline orchestration
  1. Fetch events from Sessions Service (HTTP)
  2. Extract memories using ExtractionEngine (LLM)
  3. Generate embeddings using EmbeddingService
  4. Save to Memory Service with embeddings
ExtractionEngine: LLM-powered extraction with structured response schema
  - Anthropic Claude 3.5 Sonnet
  - 8 memory categories: preference, fact, goal, habit, relationship, professional, location, temporal
  - Confidence scoring (0.0-1.0)
  - Memory validation
LLM Integration: Structured extraction with response schema, few-shot examples
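
The category list and confidence range above suggest what memory validation could look like. The sketch below is an assumption-laden illustration: the `ExtractedMemory` shape, the `validate_memories` name, and the 0.5 minimum confidence are invented here, and only the eight category names and the 0.0-1.0 range come from this PR.

```python
from dataclasses import dataclass
from typing import Iterable, List

# The eight categories listed in this PR description.
ALLOWED_CATEGORIES = frozenset({
    "preference", "fact", "goal", "habit",
    "relationship", "professional", "location", "temporal",
})


@dataclass(frozen=True)
class ExtractedMemory:
    """Hypothetical shape of one item in an LLM extraction response."""

    fact: str
    category: str
    confidence: float


def validate_memories(candidates: Iterable[ExtractedMemory],
                      min_confidence: float = 0.5) -> List[ExtractedMemory]:
    """Keep memories with non-empty text, a known category, and enough confidence."""
    valid = []
    for memory in candidates:
        if not memory.fact.strip():
            continue  # empty or whitespace-only text
        if memory.category not in ALLOWED_CATEGORIES:
            continue  # the LLM returned a category outside the schema
        if not 0.0 <= memory.confidence <= 1.0:
            continue  # score outside the documented range
        if memory.confidence < min_confidence:
            continue  # below the (assumed) filtering threshold
        valid.append(memory)
    return valid
```

Filtering after extraction keeps malformed or low-confidence LLM output from reaching the embedding and storage steps.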

Memory Consolidation Pipeline:

ConsolidationWorker: Background worker for deduplication
ConsolidationProcessor: 5-step consolidation orchestration
  1. Fetch memories from Memory Service by scope
  2. Convert to consolidation format
  3. Run ConsolidationEngine (similarity detection)
  4. Generate embeddings for merged memories
  5. Save consolidated memories
ConsolidationEngine: Similarity-based merging
  - Cosine similarity calculation from embeddings
  - Similarity threshold: 0.85 (configurable)
  - Merge strategies: highest_confidence, most_recent, longest
  - Conflict detection for contradictory memories (0.7-0.85 similarity)
  - Confidence boost for merged memories (+0.1)
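
The consolidation rules above can be sketched in a few lines. This is a simplified illustration, not the ConsolidationEngine itself: the `Mem` dataclass, the function names, the inclusive boundary at 0.85, and the way the +0.1 boost is applied (added to the higher of the two confidences and capped at 1.0) are assumptions. The 0.7-0.85 band only flags a pair as a possible conflict; deciding whether two memories actually contradict each other needs more than a similarity score.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Mem:
    """Hypothetical minimal memory record used only for this sketch."""

    fact: str
    confidence: float
    created_at: float  # Unix timestamp
    embedding: List[float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def classify_pair(similarity: float, merge_threshold: float = 0.85,
                  conflict_floor: float = 0.7) -> str:
    if similarity >= merge_threshold:
        return "merge"
    if similarity >= conflict_floor:
        return "possible_conflict"
    return "distinct"


def merge(a: Mem, b: Mem, strategy: str = "highest_confidence",
          boost: float = 0.1) -> Mem:
    keys = {
        "highest_confidence": lambda m: m.confidence,
        "most_recent": lambda m: m.created_at,
        "longest": lambda m: len(m.fact),
    }
    if strategy not in keys:
        raise ValueError(f"unknown merge strategy: {strategy}")
    winner = max((a, b), key=keys[strategy])
    # Assumed boost rule: two agreeing memories raise confidence, capped at 1.0.
    boosted = min(1.0, max(a.confidence, b.confidence) + boost)
    return Mem(winner.fact, boosted, winner.created_at, winner.embedding)
```

With these defaults, two memories whose embeddings score 0.9 would be merged, while a pair at 0.8 would be set aside for conflict review.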

Technical Implementation Details:

Complete data models: MemoryGenerationRequest, ExtractionResult, Memory, ConsolidationRequest, MergedMemory
Configuration reference for all tunable parameters
Memory quality: confidence scores, importance scoring, category classification
Performance considerations: batching, async operations, worker concurrency
Database optimization: indexing, query patterns, pagination

Documentation Quality:

End-to-end ASCII flow diagrams
Complete data model examples
100+ comprehensive code examples
Step-by-step extraction and consolidation walkthroughs
Monitoring and debugging guides
Production best practices (triggers, error handling, data quality)
Troubleshooting guide with solutions for common issues

3. docs/README.md (updated)

Added "Technical Deep Dives" section with links to the new docs
Added new use cases:
- "...understand how messaging works" with navigation to messaging docs
- "...understand how memories are created" with navigation to memory lifecycle docs

Implementation Coverage

Messaging System Documentation

Based on actual implementation from:

shared/messaging/config.py - MessagingSettings configuration
shared/messaging/rabbitmq_client.py - RabbitMQClient implementation
shared/messaging/publisher.py - MessagePublisher
shared/messaging/consumer.py - MessageConsumer
shared/messaging/queues.py - Queue definitions
workers/memory_generation/worker.py - Extraction worker
workers/consolidation/worker.py - Consolidation worker

Memory Lifecycle Documentation

Based on actual implementation from:

workers/memory_generation/worker.py - MemoryGenerationWorker
workers/memory_generation/processor.py - Extraction pipeline
shared/extraction/engine.py - ExtractionEngine with LLM
shared/extraction/prompts.py - LLM prompts and schemas
workers/consolidation/worker.py - ConsolidationWorker
workers/consolidation/processor.py - Consolidation pipeline
shared/consolidation/engine.py - Similarity and merging logic

Documentation Statistics

Total Lines: 7,529 lines of technical documentation
Code Examples: 200+ comprehensive examples
Diagrams: Multiple ASCII diagrams for visual understanding
File Size: 223KB total (101KB + 122KB)

Key Benefits

For Developers: Complete understanding of messaging infrastructure and memory processing
For Operators: Production deployment patterns, monitoring, and troubleshooting
For Architects: System design decisions, trade-offs, and optimization strategies
For Contributors: Clear documentation of implementation details

Test Plan

Documentation files created and properly formatted
All markdown linting checks passed
Cross-references verified and working
Code examples validated against actual implementation
Technical accuracy verified by reviewing source code
docs/README.md updated with new links and use cases

Related Documentation

This PR complements PR #40 (Embeddings and Vector Search) and PR #39 (Comprehensive Documentation) to complete the technical documentation suite.

Related docs:

docs/README.md - Documentation index (updated)
docs/EMBEDDINGS.md - Embeddings guide (PR #40)
docs/VECTOR_SEARCH.md - Vector search guide (PR #40)
docs/ARCHITECTURE.md - System architecture (PR #39)
docs/API_USAGE.md - API usage guide (PR #39)

🤖 Generated with Claude Code

This commit adds two in-depth technical documentation files that explain the core engines of ContextIQ: the messaging system and the memory lifecycle.

Complete guide to the RabbitMQ messaging system including:
- RabbitMQ configuration with all environment variables
- Core components: RabbitMQClient, MessagePublisher, MessageConsumer
- Queue definitions (extraction, consolidation, events, DLQ)
- Message flow diagrams showing system interactions
- Queue patterns: work queues, event queues, dead letter queues
- Publishing and consuming messages with 100+ code examples
- RPC pattern with correlation IDs and reply queues
- Error handling: retry logic, dead letter queues, message rejection
- Reliability features: publisher confirms, persistence, prefetch control
- Performance tuning: batch operations, connection pooling, compression
- Production deployment: clustering, HA queues, monitoring
- Troubleshooting guide with common issues and solutions

Complete guide to how memories are created and consolidated including:

**The Six Phases**:
1. Session Events Collection - How conversations are captured
2. Memory Extraction - LLM-powered extraction with Anthropic Claude
3. Embedding Generation - OpenAI text-embedding-3-small integration
4. Memory Storage - PostgreSQL and Qdrant vector storage
5. Memory Consolidation - Similarity detection and merging
6. Memory Retrieval - Semantic search and ranking

**Key Components**:
- MemoryGenerationWorker: Extracts memories from sessions
- MemoryGenerationProcessor: Orchestrates extraction pipeline
- ExtractionEngine: LLM-powered extraction (Anthropic Claude)
- ConsolidationWorker: Merges duplicate memories
- ConsolidationProcessor: Orchestrates consolidation pipeline
- ConsolidationEngine: Similarity detection and merging

**Technical Details**:
- Memory categories: preference, fact, goal, habit, relationship, professional, location, temporal
- Confidence scoring (0.0-1.0) with filtering
- Cosine similarity calculation for consolidation
- Merge strategies: highest_confidence, most_recent, longest
- Conflict detection for contradictory memories
- Confidence boost for merged memories

**Documentation Quality**:
- End-to-end flow diagrams (ASCII)
- Complete data models with examples
- 100+ comprehensive code examples
- Configuration reference for all settings
- Performance considerations and optimization
- Monitoring and debugging guides
- Production best practices
- Comprehensive troubleshooting guide

- Added "Technical Deep Dives" section with links to new docs
- Added use cases:
  - "...understand how messaging works"
  - "...understand how memories are created"

Based on implementation from:
- shared/messaging/config.py - MessagingSettings
- shared/messaging/rabbitmq_client.py - RabbitMQClient
- shared/messaging/publisher.py - MessagePublisher
- shared/messaging/consumer.py - MessageConsumer
- shared/messaging/queues.py - Queue definitions
- workers/memory_generation/worker.py
- workers/consolidation/worker.py

Based on implementation from:
- workers/memory_generation/worker.py - MemoryGenerationWorker
- workers/memory_generation/processor.py - Pipeline orchestration
- shared/extraction/engine.py - ExtractionEngine with LLM
- workers/consolidation/worker.py - ConsolidationWorker
- workers/consolidation/processor.py - Consolidation pipeline
- shared/consolidation/engine.py - Similarity and merging

Both documents include:
- Comprehensive technical explanations
- Real implementation details from codebase
- 200+ combined code examples
- ASCII diagrams for visual understanding
- Complete configuration references
- Production-ready best practices
- Troubleshooting guides
- Cross-references to related documentation

Total: 7,529 lines of technical documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Add comprehensive documentation explaining the Processing Layer and its relationships to other system components.

The Processing Layer sits between Core Services and Storage layers, providing four critical capabilities:
- Extraction Engine: LLM-powered fact extraction with Anthropic Claude
- Consolidation Engine: Deduplication and conflict resolution
- Embedding Service: Vector generation with OpenAI
- Revision Tracker: Provenance and history management

Documentation includes:
- Architecture position and component relationships
- Detailed coverage of each engine with code examples
- Data flow patterns through the processing layer
- Configuration reference for all components
- Performance optimization strategies
- Production best practices
- Monitoring and troubleshooting guidance

Updated docs/README.md to include the new guide in Technical Deep Dives section and added a use case for understanding the processing layer.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

elmorem and others added 2 commits December 11, 2025 22:07

elmorem force-pushed the feat/messaging-and-memory-lifecycle-docs branch from 5811f7a to 8a05355 Compare December 12, 2025 06:07

elmorem merged commit d40c305 into main Dec 12, 2025
5 checks passed

elmorem deleted the feat/messaging-and-memory-lifecycle-docs branch December 12, 2025 06:08
