Releases: yfedoseev/llmkit
LLMKit v0.1.3
Unified LLM API client with support for 100+ providers and 11,000+ models.
Installation
Rust (crates.io)

```toml
[dependencies]
llmkit = { version = "0.1.3", features = ["anthropic", "openai"] }
```

Python (PyPI)

```sh
pip install llmkit-python
```

Node.js (npm)

```sh
npm install llmkit-node
```

Features
- 100+ LLM providers supported
- 11,000+ models with detailed specs and pricing
- Streaming completions with async iterators
- Tool/function calling
- Extended thinking mode (4 providers)
- Prompt caching
- Structured output (JSON schema)
- Vision/image support
- Audio STT/TTS
- Video generation
- Embeddings API
- Batch processing API
- Token counting API
Changelog
See CHANGELOG.md for details.
v0.1.2 | Quality & Infrastructure
v0.1.2 Release
This release delivers fixes for 46 test panics across 30 provider files, comprehensive pre-commit infrastructure for Rust/Python/TypeScript, and enhanced developer tooling with automated code quality enforcement.
🧪 Test Quality Improvements (46 Panics Fixed)
Core Providers (5 files, 10 panics fixed)
- `src/providers/chat/ollama.rs` - 2 panics: text content and tool use assertion debugging
- `src/providers/chat/anthropic.rs` - 2 panics: simple and structured system content validation
- `src/providers/chat/openai.rs` - 3 panics: JsonObject and JsonSchema response format debugging
- `src/providers/chat/groq.rs` - 1 panic: tool use content block validation
- `src/providers/chat/ai21.rs` - 2 panics: text and tool use content block assertions
Major Providers (18 files, 31 panics fixed)
- `src/providers/chat/cohere.rs` - 2 panics
- `src/providers/chat/huggingface.rs` - 2 panics
- `src/providers/chat/mistral.rs` - 2 panics
- `src/providers/chat/replicate.rs` - 2 panics
- Single-panic fixes: aleph_alpha, nlp_cloud, yandex, clova, writer, maritaca, watsonx, cerebras, cloudflare, sambanova, databricks, fireworks, openrouter, azure (14 files)
Advanced Providers (2 files, 5 panics fixed)
- `src/providers/chat/deepseek.rs` - 5 panics: text content and thinking content blocks
- `src/providers/chat/openai_compatible.rs` - 2 panics: text and tool use validation
Special APIs & Utilities (2 files, 5 panics fixed)
- `src/providers/specialized/openai_realtime.rs` - 3 panics: SessionCreated, Error, RateLimitUpdated events
- `src/streaming_multiplexer.rs` - 2 panics: text delta and chunk reception
Panic Pattern Standardization: All 46 panics were converted from `if let ... else panic!("message")` to `match` statements with debug output (`panic!("Expected X, got {:?}", other)`), dramatically improving test failure diagnostics.
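A minimal sketch of the before/after pattern described above, using a hypothetical `ContentBlock` enum (the real types live in the provider test files): the `match` form prints the variant that was actually received, so a failing assertion is self-diagnosing.

```rust
// Hypothetical content type for illustration; llmkit's real types differ.
#[derive(Debug)]
enum ContentBlock {
    Text(String),
    ToolUse { name: String },
}

fn expect_text(block: &ContentBlock) -> &str {
    // Before: if let ContentBlock::Text(t) = block { t } else { panic!("expected text") }
    // After: the unexpected variant is included in the panic message via {:?}.
    match block {
        ContentBlock::Text(t) => t,
        other => panic!("Expected ContentBlock::Text, got {:?}", other),
    }
}

fn main() {
    let block = ContentBlock::Text("hello".to_string());
    assert_eq!(expect_text(&block), "hello");
    println!("{}", expect_text(&block));
}
```

On a mismatched variant, the test output now names the actual value (e.g. `Expected ContentBlock::Text, got ToolUse { name: "search" }`) instead of a bare message.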
🔧 Pre-Commit Infrastructure
13 Automated Code Quality Checks
Rust (3 checks):
- `cargo fmt` - automatic code formatting
- `cargo clippy --all-targets --all-features` - linting with warnings as errors
- `cargo check --all` - compilation verification
Python (3 checks):
- `black` - code formatting (100-char line width)
- `ruff` - linting with auto-fix
- `mypy --strict` - type checking with strict mode enabled
TypeScript/JavaScript (1 check):
- `biome` - unified formatter + linter (single quotes, 2-space indent, 100-char width)
General (6 checks):
- Trailing whitespace removal
- End-of-file normalization
- YAML/TOML/JSON validation
- Merge conflict detection
- Line ending normalization (LF)
- Spell checking (codespell with common term exceptions)
Configuration Files:
- `.pre-commit-config.yaml` - 98 lines, 9 hook repositories
- `biome.json` - 46 lines, unified TypeScript/JavaScript configuration
📚 Documentation Enhancements
- CONTRIBUTING.md: 237-line enhanced guide with:
- Pre-commit setup instructions (4 sections)
- Per-language command examples
- Troubleshooting guide (6 common issues)
- Updated PR checklist (7 items)
- CHANGELOG.md: Created with comprehensive v0.1.0, v0.1.1, v0.1.2 documentation
✅ Verification
- All Rust tests pass across stable, beta, nightly toolchains
- All Python tests pass (Python 3.9, 3.10, 3.11, 3.12, 3.13, 3.14)
- All TypeScript tests pass (Node.js 18, 20, 22, 24)
- Tests pass on Ubuntu, macOS, and Windows
- Format checks, clippy, and security audits pass
- Code coverage maintained
🚀 Multi-Language Support
- Rust: Pure Rust implementation with 100+ providers
  - Trait-based architecture with `Provider` enum
  - Zero-copy optimizations where possible
- Python: PyO3-based bindings via Maturin (abi3-py39 for broad compatibility)
- Python 3.9+ with stable ABI support
- Type stubs (.pyi files) for IDE support
- TypeScript/Node.js: NAPI-RS bindings with full type definitions
- Full TypeScript definitions (.d.ts)
- Support for Node.js 18+
📋 Features
- 100+ LLM providers supported
- 11,000+ models with detailed specifications
- Streaming completions with async iterators
- Tool/function calling support
- Extended thinking mode support
- Prompt caching
- Structured output (JSON schema)
- Vision/image input support
- Audio synthesis and processing (STT/TTS)
- Video generation
- Embeddings API
- Batch processing API
- Token counting utilities
- Request multiplexing
- Circuit breaker pattern
- Failover handling
- Health checks and monitoring
- Rate limiting and retry logic
- Smart provider routing
- Multi-tenancy support
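For the "rate limiting and retry logic" item above, a common approach is exponential backoff between retries. This is a generic sketch of that pattern, not llmkit's actual configuration: the function name and delay schedule are illustrative.

```rust
// Compute an exponential-backoff schedule: the delay doubles after each
// failed attempt (base, 2*base, 4*base, ...). Real clients usually also
// cap the delay and add jitter to avoid thundering herds.
fn backoff_delays_ms(base_ms: u64, max_retries: u32) -> Vec<u64> {
    (0..max_retries).map(|attempt| base_ms * 2u64.pow(attempt)).collect()
}

fn main() {
    let delays = backoff_delays_ms(100, 4);
    assert_eq!(delays, vec![100, 200, 400, 800]);
    println!("{:?}", delays);
}
```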
📦 Installation
Rust (crates.io):
```toml
[dependencies]
llmkit = { version = "0.1.2", features = ["anthropic", "openai"] }
```

Python (PyPI):

```sh
pip install llmkit-python==0.1.2
```

Node.js (npm):

```sh
npm install llmkit-node@0.1.2
```

🎯 For Contributors
Enable pre-commit hooks:
```sh
pip install pre-commit
pre-commit install
```

Manually run checks:

```sh
pre-commit run --all-files
```

Release Date: January 14, 2025
Status: Stable
v0.1.1 | Documentation & Stability
v0.1.1 Release
This release focuses on stability improvements, documentation enhancements, and code cleanup following the initial v0.1.0 launch. It covers 100+ providers and 11,000+ models across Rust, Python, and TypeScript.
🐛 Bug Fixes & Improvements
- Comprehensive documentation fixes and clarifications
- Stability improvements across all language bindings
- Code cleanup and refactoring in core provider implementations
- Enhanced error handling and edge case management
- Improved test coverage for edge cases
📚 Documentation
- Expanded README with detailed provider information
- Improved getting-started guides for Rust, Python, and TypeScript
  - `docs/getting-started-rust.md`
  - `docs/getting-started-python.md`
  - `docs/getting-started-nodejs.md`
- Better API documentation with code examples
- Provider-specific usage guides and feature matrices
- Model registry documentation (MODELS_REGISTRY.md)
🔧 Technical Improvements
- Better error messages and diagnostics across all providers
- Performance optimizations in request handling
- Code quality enhancements and consistency improvements
- Improved test coverage and integration testing
- Optimized dependency versions for all language bindings
🏗️ Architecture Overview
Trait-Based Provider System
- All 100+ providers implement a unified `Provider` trait
- Seamless provider switching without code changes
- Consistent API across all language bindings
Multi-Language Support
- Rust Core: 127 chat providers + audio, image, video, embedding providers
- Python Bindings: PyO3-based Maturin bindings with type hints
- TypeScript Bindings: NAPI-RS bindings with full type definitions
Enterprise Features
- Request multiplexing: Call multiple providers simultaneously
- Circuit breaker pattern: Automatic failure handling
- Failover handling: Automatic provider fallback
- Health checks and monitoring capabilities
- Metering and usage tracking
- Observability/tracing support
- Rate limiting and retry logic
- Smart provider routing
- Multi-tenancy support
✅ Verification
- All Rust tests pass (stable, beta, nightly toolchains)
- All Python tests pass (Python 3.9, 3.10, 3.11, 3.12)
- All TypeScript tests pass (Node.js 18, 20, 22)
- Tests pass on Ubuntu, macOS, and Windows
- Integration tests for major providers pass
- Security audit passes (cargo-deny, etc.)
🚀 Multi-Language Support
- Rust: Pure Rust implementation with 100+ providers
- Zero-copy optimizations where applicable
- Async/await throughout
- No runtime dependencies for core functionality
- Python: PyO3-based bindings via Maturin (abi3 for broad compatibility)
- Native performance with Python convenience
- Type stubs (.pyi files) for IDE support
- Wheel support for Python 3.9-3.12
- TypeScript/Node.js: NAPI-RS bindings with full type definitions
- Full TypeScript definitions (.d.ts)
- Support for Node.js 18+
- Native module bindings for performance
📋 Features
- 100+ LLM providers supported
- Commercial: OpenAI, Anthropic, Google, Meta, Mistral, Amazon Bedrock, Azure OpenAI
- Research: DeepSeek, xAI, 01.AI, Qwen
- Inference platforms: Together, Replicate, Baseten, RunPod, Fireworks
- Specialized: Cohere, AI21, HuggingFace, Groq, Ollama, and 85+ more
- 11,000+ models with detailed specifications
- Pricing information per model
- Performance benchmarks where available
- Provider-specific capabilities mapped
- Streaming completions with async iterators
- Tool/function calling support
- Extended thinking mode support (Claude, DeepSeek, etc.)
- Prompt caching for cost reduction
- Structured output (JSON schema mode)
- Vision/image input support across 50+ providers
- Audio synthesis and processing (STT/TTS)
- Video generation
- Embeddings API with 30+ embedding models
- Batch processing API
- Token counting utilities per provider
- Request multiplexing
- Circuit breaker pattern
- Failover handling
- Health checks and monitoring
- Rate limiting and retry logic
- Smart provider routing
- Multi-tenancy support
📦 Installation
Rust (crates.io):
```toml
[dependencies]
llmkit = { version = "0.1.1", features = ["anthropic", "openai"] }
```

Python (PyPI):

```sh
pip install llmkit-python==0.1.1
```

Node.js (npm):

```sh
npm install llmkit-node@0.1.1
```

Release Date: January 13, 2025
Status: Stable
v0.1.0 | Initial Release
v0.1.0 Release - Initial Launch
LLMKit is a production-grade, unified API client for 100+ LLM providers and 11,000+ models. Write once, deploy to any provider - no provider-specific code required.
🎯 Project Vision
LLMKit eliminates provider lock-in by providing a single, consistent API across all major LLM providers (OpenAI, Anthropic, Google, Mistral, Meta, and 95+ others). Switch providers with a single parameter change. Build for the future without rewriting code.
🏗️ Architecture Highlights
Unified Provider Trait System
- All 100+ providers implement a common `Provider` interface
- Zero switching cost: same code works across all providers
- Extensible design for new providers
Enterprise-Grade Features
- Request multiplexing: Call multiple providers simultaneously for comparison
- Circuit breaker pattern: Automatic failure detection and recovery
- Failover handling: Automatic provider fallback on errors
- Health checks and monitoring
- Metering and usage tracking
- Observability/tracing support
- Rate limiting and retry logic
- Smart provider routing
- Multi-tenancy support
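To make the circuit-breaker item above concrete, here is a generic sketch of the pattern: after a threshold of consecutive failures the breaker "opens" and blocks further requests to a provider until a success resets it. The struct and thresholds are illustrative, not llmkit's actual API.

```rust
#[derive(Debug, PartialEq)]
enum State {
    Closed, // requests flow normally
    Open,   // requests are blocked after repeated failures
}

struct CircuitBreaker {
    state: State,
    failures: u32,
    threshold: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { state: State::Closed, failures: 0, threshold }
    }

    // Record a request outcome; trip open after `threshold` consecutive failures.
    fn record(&mut self, success: bool) {
        if success {
            self.failures = 0;
            self.state = State::Closed;
        } else {
            self.failures += 1;
            if self.failures >= self.threshold {
                self.state = State::Open;
            }
        }
    }

    fn allows_request(&self) -> bool {
        self.state == State::Closed
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(3);
    cb.record(false);
    cb.record(false);
    assert!(cb.allows_request()); // two failures: still closed
    cb.record(false); // third consecutive failure trips the breaker
    assert!(!cb.allows_request());
    println!("breaker open after 3 failures");
}
```

Production breakers typically also add a half-open state that probes the provider after a cooldown, which pairs naturally with the failover handling listed above.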
Three-Language Support
- Rust: Pure Rust core, 127 chat providers + audio/image/video/embedding providers
- Python: PyO3-based bindings via Maturin with type hints (stable ABI for broad compatibility)
- TypeScript/Node.js: NAPI-RS bindings with full TypeScript definitions
🚀 Supported Providers (100+)
Commercial LLM Providers (11)
- OpenAI (GPT-4, GPT-4o, o1 family)
- Anthropic (Claude 3 family, extended thinking)
- Google (Gemini models)
- Meta (Llama via API endpoints)
- Mistral (Mistral Large, Codestral)
- Amazon Bedrock (Claude, Llama, Titan)
- Azure OpenAI
- xAI (Grok)
- 01.AI (Yi models)
- Alibaba Qwen
- Cohere
Research & Innovation (5)
- DeepSeek (r1, v3 with extended thinking)
- Together AI
- vLLM
- Anyscale Endpoints
- Text Generation Inference
Inference Platforms (12)
- Replicate
- Baseten
- RunPod
- Fireworks AI
- Cerebras
- SambaNova
- Databricks
- OpenRouter
- HuggingFace Inference API
- AI21
- Groq
- Cloudflare Workers AI
Local & Self-Hosted (4+)
- Ollama
- LM Studio
- vLLM
- Text Generation Inference
Plus 60+ more specialized providers across voice, audio, embedding, and vision APIs
📊 Model Coverage
- 11,000+ unique models catalogued
- Pricing information for each model
- Performance benchmarks where available
- Provider-specific capabilities mapped
- Token limits and context windows documented
- Input/output modalities tracked
🎬 Comprehensive Feature Set
Text Completion
- Standard completions with streaming
- Async/await support throughout
- Token counting per provider
- Batch processing for efficiency
Advanced Reasoning
- Extended thinking mode (Claude 3.5 Sonnet, DeepSeek, etc.)
- Chain-of-thought prompting
- Structured reasoning outputs
Tool/Function Calling
- Native support across 50+ providers
- Automatic tool selection and execution
- Parallel tool calling where supported
- Tool result parsing and handling
Multimodal Input
- Vision: Image understanding across 50+ providers
- PNG, JPEG, WebP, GIF support
- Base64 encoding handled automatically
- Image URL and file upload support
- Audio: Speech processing with STT/TTS
- Multiple audio format support
- Language detection
- Speaker identification where available
Multimodal Output
- Image Generation: DALL-E, Midjourney, Stable Diffusion via APIs
- Video Generation: Multiple provider support
- Audio Generation: TTS with multiple voices and languages
Structured Data
- JSON Schema mode across providers
- Type-safe response parsing
- Validation and error handling
- Type hint generation from schemas
Embeddings
- 30+ embedding models
- Batch embedding support
- Dimensionality tracked
- Cost calculations per provider
Advanced APIs
- OpenAI Realtime: Low-latency voice conversations
- Anthropic Vision: Advanced image understanding
- Specialized provider APIs: Custom protocol support
🏆 Code Quality
Testing
- Comprehensive test suite across all providers
- Integration tests for major features
- Mock tests for reliability without API calls
- Platform-specific tests (Windows, macOS, Linux)
Safety
- No `unwrap()` in production code
- Proper error propagation with context
- Safe error handling throughout
- Audit-ready code structure
Performance
- Async/await throughout
- Zero-copy optimizations where possible
- Connection pooling
- Request batching support
- Lazy initialization of providers
📚 Documentation
Getting Started Guides
- Rust: `docs/getting-started-rust.md`
- Python: `docs/getting-started-python.md`
- TypeScript/Node.js: `docs/getting-started-nodejs.md`
19 Working Examples Per Language
- Simple completion
- Streaming completions
- Tool/function calling
- Vision/image input
- Structured output (JSON schema)
- Embeddings generation
- Audio synthesis (TTS)
- Audio input (STT)
- Extended thinking (where supported)
- Prompt caching
- OpenAI Realtime streaming
- Multiple provider comparison
- Error handling and retry
- Rate limiting and metering
- Batch processing
- Token counting
- Model discovery
- Provider capabilities
- OpenAI-compatible API endpoints
Comprehensive API Documentation
- Provider feature matrices
- Model specifications and pricing
- Capability maps for each provider
- Token limit information
- Input/output modality support
✅ Verification
Test Coverage
- All Rust tests pass (stable, beta, nightly)
- All Python tests pass (Python 3.9+)
- All TypeScript tests pass (Node.js 18+)
- Platform coverage: Ubuntu, macOS, Windows
Quality Gates
- Code passes clippy (strict linting)
- Security audit passes (cargo-deny)
- Dependencies are vetted
- Format checks pass
📦 Installation
Rust (crates.io):
```toml
[dependencies]
llmkit = { version = "0.1.0", features = ["anthropic", "openai"] }
```

Python (PyPI):

```sh
pip install llmkit-python==0.1.0
```

Node.js (npm):

```sh
npm install llmkit-node@0.1.0
```

🎓 Quick Start
Rust Example:
```rust
let client = LlmClient::builder()
    .with_provider(Provider::OpenAI)
    .build()?;
let response = client.complete(request).await?;
```

Python Example:

```python
client = LlmClient()
response = client.complete(
    provider="openai",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

TypeScript Example:

```typescript
const client = new LlmClient();
const response = await client.complete({
  provider: "openai",
  messages: [{ role: "user", content: "Hello!" }]
});
```

🔮 Future Roadmap
- Additional specialized provider integrations
- Enhanced caching strategies
- Performance optimizations
- Extended monitoring capabilities
- Community-contributed providers
- WebAssembly support for browsers
📄 License
Dual-licensed under MIT OR Apache-2.0 for maximum flexibility.
Release Date: January 12, 2025
Status: Initial Release - Production Ready
LLMKit enables building AI applications that are provider-agnostic, future-proof, and incredibly flexible. Start with your preferred provider, switch anytime without rewriting your application.