Skip to content

Latest commit

 

History

History
462 lines (376 loc) · 13.6 KB

File metadata and controls

462 lines (376 loc) · 13.6 KB

Multi-Database Effectiveness Testing Architecture

Executive Summary

This document outlines the architectural design for extending VectorSmuggle's effectiveness testing to support multiple vector databases simultaneously. The solution uses a Plugin-Based Multi-Database Orchestrator that leverages existing single-database testing logic while providing a scalable foundation for adding new database adapters.

Current State Analysis

Existing Strengths

Current Limitations

  • Only supports Qdrant, Faiss, and Pinecone
  • Single VECTOR_DB environment variable per test run
  • No orchestration for multi-database testing
  • Missing adapters for ChromaDB, Weaviate, Milvus, pgvector

Proposed Architecture

graph TB
    subgraph "Multi-Database Orchestrator"
        A[generate_multi_db_effectiveness_report.py] --> B[DatabaseOrchestrator]
        B --> C[DatabaseAdapterRegistry]
        B --> D[ResultAggregator]
        B --> E[SetupValidator]
    end
    
    subgraph "Database Adapters (Plugin Architecture)"
        C --> F[QdrantAdapter]
        C --> G[FaissAdapter] 
        C --> H[PineconeAdapter]
        C --> I[ChromaDBAdapter]
        C --> J[WeaviateAdapter]
        C --> K[MilvusAdapter]
        C --> L[PgVectorAdapter]
    end
    
    subgraph "Existing Components (Reused)"
        M[generate_effectiveness_report.py]
        N[config.py]
        O[utils/embedding_factory.py]
    end
    
    subgraph "Manual Setup & Documentation"
        P[Setup Instructions]
        Q[Database Connection Validation]
        R[Troubleshooting Guide]
    end
    
    B --> M
    F --> N
    E --> P
    E --> Q
    
    subgraph "Output & Results"
        D --> S[Multi-DB Report JSON]
        D --> T[Comparative Analysis]
        D --> U[Performance Metrics]
        D --> V[Failure Analysis]
    end
Loading

Core Components

1. DatabaseOrchestrator (Main Controller)

Responsibilities:

  • Coordinate testing across multiple databases
  • Manage test execution flow with graceful error handling
  • Handle database failures and continue testing others
  • Aggregate results from all databases

Key Methods:

class DatabaseOrchestrator:
    def run_multi_database_test(self, databases: List[str]) -> MultiDBResults
    def test_single_database(self, db_name: str) -> SingleDBResult
    def handle_database_failure(self, db_name: str, error: Exception) -> None
    def aggregate_results(self, results: List[SingleDBResult]) -> MultiDBResults

2. DatabaseAdapterRegistry (Plugin System)

Responsibilities:

  • Register and manage database adapters
  • Provide unified interface for database operations
  • Handle adapter discovery and loading
  • Validate adapter compatibility

Plugin Interface:

class DatabaseAdapter(ABC):
    @abstractmethod
    def get_connection_config(self) -> Dict[str, Any]
    
    @abstractmethod
    def test_connection(self) -> bool
    
    @abstractmethod
    def run_effectiveness_test(self, config: Config) -> Dict[str, Any]
    
    @abstractmethod
    def cleanup(self) -> None
    
    @abstractmethod
    def get_setup_instructions(self) -> str

3. SetupValidator (Manual Setup Support)

Responsibilities:

  • Validate database connections before testing
  • Provide clear setup instructions for each database
  • Generate troubleshooting guidance
  • Check Docker container status (informational only)

Features:

  • Connection health checks for each database
  • Clear manual setup instructions
  • Port availability validation
  • Service readiness verification

4. ResultAggregator (Data Collection)

Responsibilities:

  • Collect results from all database tests
  • Generate comparative analysis
  • Create unified reporting format
  • Handle partial results from failed tests

Database Configuration Mapping

Supported Databases (Immediate Implementation)

Database Docker Port Connection String Adapter Status
Qdrant 6334 http://localhost:6334 ✅ Existing
Faiss N/A Local file system ✅ Existing
Pinecone N/A API-based ✅ Existing

New Databases (Plugin Implementation)

Database Docker Port Connection String Adapter Status
ChromaDB 8000 http://localhost:8000 🔧 New Plugin
Weaviate 8080 http://localhost:8080 🔧 New Plugin
Milvus 19530 localhost:19530 🔧 New Plugin
pgvector 5432 postgresql://postgres:mypassword@localhost:5432/vectordb 🔧 New Plugin

Manual Setup Instructions

Docker Container Setup

# Navigate to Docker directory
cd test_vector_dbs_docker/

# Start all vector databases
docker compose up -d

# Verify all services are running
docker compose ps

# Check service health
docker compose logs qdrant
docker compose logs chroma
docker compose logs weaviate
docker compose logs milvus
docker compose logs postgres

Individual Database Setup

Qdrant

# Verify Qdrant is accessible
curl http://localhost:6334/health

# Expected response: {"status":"ok"}

ChromaDB

# Verify ChromaDB is accessible
curl http://localhost:8000/api/v1/heartbeat

# Expected response: {"nanosecond heartbeat": <timestamp>}

Weaviate

# Verify Weaviate is accessible
curl http://localhost:8080/v1/.well-known/ready

# Expected response: {"status": "ok"}

Milvus

# Verify Milvus is accessible
curl http://localhost:9091/health

# Expected response: health check information

pgvector

# Test PostgreSQL connection
psql -h localhost -p 5432 -U postgres -d vectordb -c "SELECT 1;"

# Password: mypassword

Result Structure Design

Multi-Database Result Format

{
  "test_metadata": {
    "timestamp": "2025-06-15T15:30:00Z",
    "version": "3.0",
    "test_type": "multi_database_effectiveness",
    "total_databases": 6,
    "successful_tests": 4,
    "failed_tests": 2,
    "setup_validation": {
      "docker_compose_available": true,
      "containers_running": ["qdrant", "chroma", "weaviate", "postgres"],
      "containers_failed": ["milvus"]
    }
  },
  "database_results": {
    "qdrant": {
      "status": "success",
      "execution_time": 45.2,
      "effectiveness_metrics": {
        "overall_success_rate": 85.0,
        "evasion_success_rate": 78.5,
        "fragmentation_success": true,
        "decoy_generation_success": true
      },
      "connection_info": {
        "url": "http://localhost:6334",
        "collection": "rag-exfil-poc",
        "health_check": "passed"
      }
    },
    "chromadb": {
      "status": "failed",
      "error": "Connection refused - container not running",
      "execution_time": 5.1,
      "connection_info": {
        "url": "http://localhost:8000",
        "health_check": "failed"
      },
      "setup_instructions": "Run: docker compose up -d chroma"
    }
  },
  "comparative_analysis": {
    "performance_ranking": [
      {"database": "faiss", "execution_time": 12.3},
      {"database": "qdrant", "execution_time": 45.2},
      {"database": "pinecone", "execution_time": 67.8}
    ],
    "effectiveness_comparison": {
      "best_evasion_rate": {"database": "qdrant", "rate": 78.5},
      "fastest_execution": {"database": "faiss", "time": 12.3},
      "most_reliable": {"database": "faiss", "success_rate": 100.0}
    },
    "reliability_metrics": {
      "success_rate_by_db": {
        "qdrant": 100.0,
        "faiss": 100.0,
        "pinecone": 100.0,
        "chromadb": 0.0,
        "weaviate": 0.0,
        "milvus": 0.0,
        "pgvector": 0.0
      }
    }
  },
  "recommendations": {
    "best_for_performance": "faiss",
    "best_for_evasion": "qdrant",
    "most_reliable": "faiss",
    "setup_issues": [
      "ChromaDB container not running - see setup instructions",
      "Milvus dependencies not healthy - check etcd and minio"
    ]
  }
}

Error Handling & Resilience Strategy

Failure Categories

  1. Setup/Connection Failures

    • Container not running
    • Network connectivity issues
    • Authentication problems
    • Port conflicts
  2. Test Execution Failures

    • Embedding generation errors
    • Database-specific API issues
    • Resource constraints
    • Timeout errors
  3. Data Integrity Failures

    • Incomplete result sets
    • Corrupted embeddings
    • Metadata inconsistencies

Resilience Mechanisms

graph LR
    A[Database Test] --> B{Setup Valid?}
    B -->|No| C[Log Setup Error & Skip]
    B -->|Yes| D{Connection OK?}
    D -->|No| E[Log Connection Error & Skip]
    D -->|Yes| F[Run Effectiveness Test]
    F --> G{Test Success?}
    G -->|No| H[Retry with Backoff]
    H --> I{Retry Success?}
    I -->|No| J[Mark Failed & Continue]
    I -->|Yes| K[Collect Results]
    G -->|Yes| K
    C --> L[Next Database]
    E --> L
    J --> L
    K --> L
Loading

Error Recovery Features

  • Graceful Degradation: Continue testing other databases if one fails
  • Retry Logic: Configurable retry attempts with exponential backoff
  • Partial Results: Generate reports even with some database failures
  • Detailed Logging: Comprehensive error tracking and setup guidance
  • Setup Validation: Pre-flight checks with clear remediation steps

Implementation Approach

Phase 1: Core Infrastructure (Minimal Changes)

  1. Create Multi-Database Orchestrator

    • New file: generate_multi_db_effectiveness_report.py
    • Wrapper around existing generate_effectiveness_report.py
    • Basic database iteration logic with error handling
  2. Database Adapter Registry

    • New directory: database_adapters/
    • Base adapter interface
    • Adapters for existing databases (Qdrant, Faiss, Pinecone)
  3. Configuration Extensions

    • Extend config.py with multi-database settings
    • Database-specific configuration mapping
    • Setup validation configuration

Phase 2: Plugin System & New Adapters

  1. Plugin Architecture

    • Dynamic adapter loading
    • Adapter validation and registration
    • Plugin discovery mechanism
  2. New Database Adapters

    • ChromaDB adapter with HTTP API integration
    • Weaviate adapter with REST/gRPC support
    • Milvus adapter with gRPC client
    • pgvector adapter with PostgreSQL connection
  3. Setup Validation System

    • Connection health checks
    • Setup instruction generation
    • Troubleshooting guidance

Phase 3: Enhanced Analysis & Reporting

  1. Advanced Result Analysis

    • Performance benchmarking across databases
    • Effectiveness comparison metrics
    • Statistical analysis and recommendations
  2. Improved Documentation

    • Database-specific setup guides
    • Troubleshooting documentation
    • Performance tuning recommendations

File Structure

VectorSmuggle/
├── generate_multi_db_effectiveness_report.py    # New orchestrator
├── database_adapters/                           # New plugin directory
│   ├── __init__.py
│   ├── base_adapter.py                         # Abstract base class
│   ├── qdrant_adapter.py                       # Existing DB adapters
│   ├── faiss_adapter.py
│   ├── pinecone_adapter.py
│   ├── chromadb_adapter.py                     # New DB adapters
│   ├── weaviate_adapter.py
│   ├── milvus_adapter.py
│   └── pgvector_adapter.py
├── multi_db_config.py                          # Extended configuration
├── setup_validator.py                          # Setup validation utilities
└── docs/
    ├── multi_database_setup_guide.md           # Setup instructions
    └── multi_database_troubleshooting.md       # Troubleshooting guide

Benefits of This Architecture

1. Minimal Code Changes

  • Reuses existing generate_effectiveness_report.py logic
  • Preserves current configuration system
  • Maintains backward compatibility

2. Scalable Plugin System

  • Easy to add new database adapters
  • Clean separation of concerns
  • Testable adapter interfaces

3. Robust Error Handling

  • Continues testing despite individual failures
  • Comprehensive error reporting
  • Graceful degradation

4. Clear Manual Setup

  • Detailed setup instructions for each database
  • Connection validation and troubleshooting
  • No complex container management dependencies

5. Rich Result Analysis

  • Comparative effectiveness metrics
  • Performance benchmarking
  • Database recommendation insights

Implementation Priority

Database Implementation Order

  1. Qdrant, Faiss, Pinecone - Leverage existing implementations
  2. ChromaDB - Simple HTTP API, good documentation
  3. pgvector - PostgreSQL-based, familiar SQL interface
  4. Weaviate - Well-documented REST API
  5. Milvus - More complex but powerful, gRPC-based

Development Phases

  1. Core Orchestrator - Multi-database testing framework
  2. Existing Adapters - Wrap current database implementations
  3. New Adapters - Implement remaining database plugins
  4. Enhanced Analysis - Comparative metrics and recommendations

This architecture provides a solid foundation for multi-database effectiveness testing while maintaining the strengths of the existing VectorSmuggle system. The plugin-based approach ensures extensibility, and the focus on manual setup with clear instructions provides reliability without complex dependencies.