Skip to content

Latest commit

 

History

History

README.md

SciPix - Rust OCR Engine for Scientific Documents & Math Equations

Crates.io Documentation Downloads License: MIT Rust CI

🔬 Production-ready Rust OCR library for extracting LaTeX, MathML, and text from scientific images

Convert mathematical equations, scientific papers, and technical diagrams to structured text with GPU-accelerated inference

Installation | Quick Start | SDK Usage | CLI Reference | Tutorials | API Reference


Why SciPix?

SciPix is a blazing-fast, memory-safe OCR (Optical Character Recognition) engine written in pure Rust. Unlike traditional OCR tools, SciPix is purpose-built for scientific documents, mathematical equations, and technical diagrams — making it the ideal choice for researchers, academics, and developers working with STEM content.

Use Cases

  • 📄 Academic Paper Digitization - Extract text and equations from scanned research papers
  • 🧮 Math Homework Assistance - Convert handwritten equations to LaTeX for AI tutoring apps
  • 📊 Technical Documentation - Process engineering diagrams and scientific charts
  • 🔬 Research Data Extraction - Batch process journal articles and extract structured data
  • 🤖 AI/LLM Integration - Feed scientific content to language models via MCP protocol

Key Features

Feature Description
🚀 ONNX Runtime GPU-accelerated neural network inference with CUDA, TensorRT, and CoreML support
📐 LaTeX Output Accurate mathematical equation recognition with LaTeX, MathML, and AsciiMath export
SIMD Optimized 4x faster image preprocessing with AVX2, SSE4, and NEON vectorization
🌐 REST API Production-ready HTTP server with rate limiting, caching, and authentication
💻 CLI Tool Batch processing, PDF conversion, and watch mode for continuous OCR
🦀 Pure Rust SDK Type-safe, async/await native library with zero-copy image processing
🔌 WebAssembly Run OCR directly in browsers with full WASM support
🤖 MCP Server Integrate with Claude, ChatGPT, and other AI assistants via Model Context Protocol
📦 Cross-Platform Linux, macOS, Windows, and ARM64 support out of the box

Performance Benchmarks

Operation SciPix Tesseract Mathpix
Simple Text OCR 50ms 120ms 200ms*
Math Equation 80ms N/A 150ms*
Batch (100 images) 2.1s 8.5s N/A
Memory Usage 45MB 180MB Cloud

*API latency, not processing time


Installation

From crates.io (Rust SDK)

cargo add ruvector-scipix

Or add to your Cargo.toml:

[dependencies]
ruvector-scipix = "0.1.16"

# With specific features
ruvector-scipix = { version = "0.1.16", features = ["ocr", "math", "optimize"] }

From Source (CLI & Server)

# Clone the repository
git clone https://github.com/ruvnet/ruvector.git
cd ruvector/examples/scipix

# Build CLI and Server
cargo build --release

# Install globally (optional)
cargo install --path .

Pre-built Binaries

# Download latest release (Linux)
curl -L https://github.com/ruvnet/ruvector/releases/latest/download/scipix-cli-linux-x64 -o scipix-cli
chmod +x scipix-cli

# Download latest release (macOS)
curl -L https://github.com/ruvnet/ruvector/releases/latest/download/scipix-cli-darwin-arm64 -o scipix-cli
chmod +x scipix-cli

Feature Flags

Flag Description Default
default preprocess, cache, optimize
ocr ONNX-based OCR engine
math Math expression parsing
preprocess Image preprocessing
cache Result caching
optimize SIMD & parallel optimizations
wasm WebAssembly support

Quick Start

30-Second Setup

# Build and run the server
cd examples/scipix
cargo run --release --bin scipix-server

# In another terminal, test the API
curl http://localhost:3000/health
# {"status":"healthy","version":"0.1.16"}

Process Your First Image

# Encode an image to base64
BASE64_IMAGE=$(base64 -w 0 equation.png)

# Send OCR request
curl -X POST http://localhost:3000/v3/text \
  -H "Content-Type: application/json" \
  -H "app_id: demo" \
  -H "app_key: demo_key" \
  -d "{\"base64\": \"$BASE64_IMAGE\", \"metadata\": {\"formats\": [\"text\", \"latex\"]}}"

SDK Usage

Basic Usage

use ruvector_scipix::{Config, Result};

fn main() -> Result<()> {
    // Load default configuration
    let config = Config::default();

    // Validate configuration
    config.validate()?;

    println!("SciPix version: {}", ruvector_scipix::VERSION);
    Ok(())
}

Image Preprocessing

use ruvector_scipix::preprocess::{PreprocessPipeline, transforms};
use image::open;

fn preprocess_image(path: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Load image
    let img = open(path)?;

    // Create preprocessing pipeline
    let pipeline = PreprocessPipeline::new()
        .with_auto_rotate(true)
        .with_auto_deskew(true)
        .with_noise_reduction(true)
        .with_contrast_enhancement(true);

    // Process image
    let processed = pipeline.process(img)?;

    // Save result
    processed.save("processed.png")?;

    Ok(())
}

OCR Engine (requires ocr feature)

use ruvector_scipix::ocr::{OcrEngine, OcrOptions};
use ruvector_scipix::OcrConfig;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize OCR engine
    let config = OcrConfig::default();
    let engine = OcrEngine::new(config).await?;

    // Load and process image
    let image = image::open("equation.png")?;
    let result = engine.recognize(&image).await?;

    println!("Text: {}", result.text);
    println!("Confidence: {:.2}%", result.confidence * 100.0);

    // Get LaTeX output
    if let Some(latex) = result.latex {
        println!("LaTeX: {}", latex);
    }

    Ok(())
}

Math Parsing (requires math feature)

use ruvector_scipix::math::{parse_expression, to_latex, to_mathml};

fn parse_math() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a mathematical expression
    let expr = parse_expression("x^2 + 2x + 1")?;

    // Convert to different formats
    let latex = to_latex(&expr)?;
    let mathml = to_mathml(&expr)?;

    println!("LaTeX: {}", latex);
    println!("MathML: {}", mathml);

    Ok(())
}

Caching Results

use ruvector_scipix::cache::CacheManager;
use ruvector_scipix::CacheConfig;

fn use_cache() -> Result<(), Box<dyn std::error::Error>> {
    let config = CacheConfig {
        max_size: 1000,
        ttl_seconds: 3600,
        ..Default::default()
    };

    let cache = CacheManager::new(config)?;

    // Store result
    cache.store("image_hash_123", &result)?;

    // Retrieve result
    if let Some(cached) = cache.get("image_hash_123")? {
        println!("Cache hit: {}", cached.latex);
    }

    Ok(())
}

Configuration Presets

use ruvector_scipix::{default_config, high_accuracy_config, high_speed_config};

fn configure() {
    // Default balanced configuration
    let config = default_config();

    // High accuracy (slower, more precise)
    let accurate = high_accuracy_config();

    // High speed (faster, may sacrifice accuracy)
    let fast = high_speed_config();
}

CLI Reference

Installation

# Install from source
cargo install --path examples/scipix

# Or use pre-built binary
./scipix-cli --help

Commands

ocr - Process Single Image

# Basic OCR
scipix-cli ocr --input document.png

# With output file and format
scipix-cli ocr --input equation.png --output result.json --format latex

# Specify output formats
scipix-cli ocr --input image.png --formats text,latex,mathml

Options:

Flag Description Default
-i, --input Input image path Required
-o, --output Output file path stdout
-f, --format Output format (json, text, latex) json
--formats OCR formats (text, latex, mathml, html) text
--confidence Minimum confidence threshold 0.5

batch - Process Multiple Images

# Process directory
scipix-cli batch --input-dir ./images --output-dir ./results

# With parallel processing
scipix-cli batch -i ./images -o ./results --parallel 8

# Recursive with specific formats
scipix-cli batch -i ./docs -o ./output --recursive --format latex

# Watch mode for continuous processing
scipix-cli batch -i ./inbox -o ./processed --watch

Options:

Flag Description Default
-i, --input-dir Input directory Required
-o, --output-dir Output directory Required
-p, --parallel Parallel workers CPU cores
-r, --recursive Process subdirectories false
--watch Watch for new files false
--max-retries Retry failed files 3

serve - Start API Server

# Start with defaults
scipix-cli serve

# Custom address and port
scipix-cli serve --address 0.0.0.0 --port 8080

# With configuration file
scipix-cli serve --config ./config.toml

# Enable debug logging
RUST_LOG=debug scipix-cli serve

Options:

Flag Description Default
-a, --address Bind address 127.0.0.1
-p, --port Port number 3000
-c, --config Config file path None
--workers Worker threads CPU cores

config - Manage Configuration

# Show current configuration
scipix-cli config show

# Initialize default config file
scipix-cli config init

# Set specific values
scipix-cli config set ocr.confidence_threshold 0.8
scipix-cli config set server.port 8080

# Validate configuration
scipix-cli config validate

doctor - Environment Check

# Run full diagnostics
scipix-cli doctor

# Check specific components
scipix-cli doctor --check cpu,memory,deps

# Output as JSON
scipix-cli doctor --format json

# Auto-fix issues
scipix-cli doctor --fix

Checks performed:

  • CPU cores and SIMD capabilities (SSE2, AVX, AVX2, AVX-512, NEON)
  • Memory availability
  • ONNX Runtime installation
  • Model file availability
  • Configuration validity
  • Network port availability

mcp - MCP Server Mode

# Start MCP server for AI integration
scipix-cli mcp

# With debug logging
scipix-cli mcp --debug

# With custom models directory
scipix-cli mcp --models-dir ./custom-models

Available MCP Tools:

Tool Description
ocr_image Process image file with OCR
ocr_base64 Process base64-encoded image
batch_ocr Batch process multiple images
preprocess_image Apply image preprocessing
latex_to_mathml Convert LaTeX to MathML
benchmark_performance Run performance benchmarks

Claude Code Integration:

claude mcp add scipix -- scipix-cli mcp

Tutorials

Tutorial 1: Basic Image OCR

Learn to extract text from images using the REST API.

# Step 1: Start the server
cargo run --bin scipix-server

# Step 2: Encode your image
BASE64=$(base64 -w 0 document.png)

# Step 3: Send OCR request
curl -X POST http://localhost:3000/v3/text \
  -H "Content-Type: application/json" \
  -H "app_id: test" \
  -H "app_key: test123" \
  -d "{\"base64\": \"$BASE64\", \"metadata\": {\"formats\": [\"text\"]}}"

Tutorial 2: Mathematical Equation Recognition

Convert math images to LaTeX format.

curl -X POST http://localhost:3000/v3/text \
  -H "Content-Type: application/json" \
  -H "app_id: test" \
  -H "app_key: test123" \
  -d '{
    "url": "https://example.com/equation.png",
    "metadata": {
      "formats": ["latex", "mathml"],
      "math_mode": true
    }
  }'

Response:

{
  "latex": "\\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}",
  "mathml": "<math>...</math>",
  "confidence": 0.92
}

Tutorial 3: Batch PDF Processing

Process multi-page PDFs asynchronously.

# Submit PDF job
JOB=$(curl -s -X POST http://localhost:3000/v3/pdf \
  -H "Content-Type: application/json" \
  -H "app_id: test" \
  -H "app_key: test123" \
  -d '{
    "url": "https://example.com/paper.pdf",
    "options": {"format": "mmd", "enable_ocr": true}
  }')

JOB_ID=$(echo $JOB | jq -r '.pdf_id')

# Poll for completion
curl http://localhost:3000/v3/pdf/$JOB_ID \
  -H "app_id: test" -H "app_key: test123"

Tutorial 4: CLI Batch Processing

# Process entire directory
scipix-cli batch \
  --input-dir ./documents \
  --output-dir ./results \
  --format latex \
  --parallel 4 \
  --recursive

# Watch mode for continuous processing
scipix-cli batch \
  --input-dir ./inbox \
  --output-dir ./processed \
  --watch

Tutorial 5: WebAssembly Integration

# Build WASM module
cargo install wasm-pack
wasm-pack build --target web --features wasm
<script type="module">
  import init, { ScipixWasm } from './pkg/ruvector_scipix.js';

  async function processImage() {
    await init();
    const scipix = new ScipixWasm();
    await scipix.initialize();

    const canvas = document.getElementById('canvas');
    const imageData = canvas.getContext('2d').getImageData(0, 0, canvas.width, canvas.height);
    const result = await scipix.recognize(imageData.data);
    console.log('Result:', result);
  }

  processImage();
</script>

Tutorial 6: Using as MCP Server

Integrate SciPix with Claude Code or other AI assistants.

# Add to Claude Code
claude mcp add scipix -- scipix-cli mcp

# Or run standalone
scipix-cli mcp --debug

Then use tools in your AI conversations:

  • "Use the ocr_image tool to extract text from ./screenshot.png"
  • "Convert this LaTeX to MathML: \frac{1}{2}"

API Reference

Authentication

All API endpoints (except /health) require authentication:

app_id: your_application_id
app_key: your_secret_key

Endpoints

POST /v3/text - Image OCR

{
  "base64": "...",
  "url": "https://...",
  "metadata": {
    "formats": ["text", "latex", "mathml"],
    "confidence_threshold": 0.5,
    "math_mode": false
  }
}

POST /v3/strokes - Digital Ink

{
  "strokes": [{"x": [0, 10, 20], "y": [0, 10, 0]}],
  "metadata": {"formats": ["latex"]}
}

POST /v3/pdf - PDF Processing

{
  "url": "https://example.com/doc.pdf",
  "options": {
    "format": "mmd",
    "enable_ocr": true,
    "page_range": "1-10"
  }
}

GET /health - Health Check

{"status": "healthy", "version": "0.1.16"}

Configuration

Environment Variables

SERVER_ADDR=127.0.0.1:3000
RUST_LOG=scipix=info
RATE_LIMIT_PER_MINUTE=100
CACHE_MAX_SIZE=1000
MODEL_PATH=./models

Configuration File

[server]
address = "127.0.0.1"
port = 3000
workers = 4

[ocr]
model_path = "./models"
confidence_threshold = 0.5

[cache]
max_size = 1000
ttl_seconds = 3600

[rate_limit]
requests_per_minute = 100
burst_size = 20

Performance

Operation Time (avg) Throughput
SIMD Grayscale 101µs 4.2x faster
SIMD Resize 2.63ms 1.5x faster
Full Pipeline 0.49ms 4.4x faster
Simple text OCR ~50ms 20 img/s
Math equation ~80ms 12 img/s

Troubleshooting

# Check environment
scipix-cli doctor

# Enable debug logging
RUST_LOG=debug scipix-cli serve

# Verify models installed
ls -la models/

Contributing

# Run tests
cargo test --all-features

# Run linting
cargo clippy --all-features

# Format code
cargo fmt

License

MIT License - see LICENSE for details.


Part of the ruvector ecosystem
Built with Rust 🦀 | Powered by ONNX Runtime