Your codebase already has an architect. It just needs tools.
Lean AI is an agentic coding assistant that reads your project, plans changes, and executes them — all inside your editor. Give it a task in plain English, review the plan, and watch it work.
Run it fully local with Ollama, or connect to OpenAI and Anthropic when you need heavier reasoning. Switch between models mid-session from the UI. No cloud account required to get started.
- Plan first, then execute — a 6-phase planning pipeline reads your codebase, traces data flow across files, and produces a structured plan before touching any code. You approve (or revise) before anything changes. Thinking and content tokens stream in real time during every planning phase so you can follow the agent's reasoning as it works.
- Learns from your corrections — every rejected plan, successful test fix, and TDD dispute yields a short lesson, offered to you as an inline chip in the chat ("Remember this?"). Confirmed lessons are read back into future planning so the tool stops repeating the same mistakes. See Curated Memory.
- Training archive for future fine-tuning — every workflow decision is captured locally in an append-only archive behind a fail-closed secrets scrubber. Opt in to export via a bearer-token API when you're ready to ship a LoRA adapter for your expert model. See Training Pipeline.
- Multi-provider flexibility — Ollama for free local inference, OpenAI for GPT-4o, Anthropic for Claude. Switch from the dropdown without restarting. Use cheap local models for small fixes, cloud models for hard problems.
- Dual-model pipeline — run a fast local model for codebase exploration and code execution, then automatically hand off to a cloud model (Claude, GPT-4o) for reasoning-heavy planning phases and complex fix attempts. Save cloud tokens for the decisions that matter.
- Local Refiner — when using cloud providers, a local Ollama model pre-processes your prompts: it enriches them with private reference-library context, strips sensitive data, and structures vague requests into detailed specs. Your proprietary docs never leave your machine. See Reference Library & Refiner.
- Zero prompt engineering — chat mode helps you refine ideas into detailed tasks. Project context and framework guides teach the LLM your codebase conventions automatically.
- Reference library — drop your internal docs (PDF, EPUB, Word, Markdown) into `.lean_ai/reference/` and the agent uses them for better plans without leaking content to cloud APIs.
- UI verification — the agent can screenshot web pages or desktop GUIs (Tkinter, Qt, Swing, JavaFX, Electron) and analyse layout with a local vision model. Useful for planning UI changes, discussing layout with the agent, or diagnosing visual regressions that pass a type-check but look broken. See UI Verification.
- Built-in code quality — after every execution, Lean AI runs your project's linter and tests automatically. Failures are fed back to the LLM for self-correction. Lint, test, and format commands are auto-detected from your project files — zero configuration needed. When a test command is available, the agent writes tests alongside code changes.
- Git-native workflow — every task runs on its own branch. Approve to merge, reject to discard. Your main branch stays clean.
- 19 scaffold recipes — bootstrap new projects (FastAPI, Next.js, Laravel, Rails, and more) with a single command.
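The auto-detection of lint and test commands mentioned in the code-quality bullet can be pictured as probing well-known project files. This is a minimal sketch of the idea, not Lean AI's actual detector; the function name and heuristics below are hypothetical:

```python
import json
from pathlib import Path

def detect_commands(root: Path) -> dict[str, str]:
    """Guess lint/test commands from common project files (illustrative heuristics only)."""
    cmds: dict[str, str] = {}
    pyproject = root / "pyproject.toml"
    if pyproject.exists():
        # A pyproject.toml strongly suggests pytest; ruff if it is configured there.
        cmds["test"] = "pytest"
        if "ruff" in pyproject.read_text():
            cmds["lint"] = "ruff check ."
    package = root / "package.json"
    if package.exists():
        # Prefer whatever scripts the project itself declares.
        scripts = json.loads(package.read_text()).get("scripts", {})
        if "test" in scripts:
            cmds["test"] = "npm test"
        if "lint" in scripts:
            cmds["lint"] = "npm run lint"
    return cmds
```

In practice detection like this is why no configuration is needed: the project files already say how the project is linted and tested.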
VS Code / VSCodium: Install from the VS Code Marketplace or OpenVSX.
JetBrains IDEs (IntelliJ, PyCharm, WebStorm, CLion, etc.): See the JetBrains Plugin guide for build and install instructions.
On first activation, the extension automatically creates a Python virtual environment and installs the backend server — no manual setup required.
Download Ollama and pull a model:
```shell
ollama pull qwen3-coder:30b
```

Cloud models: Ollama also supports cloud-hosted models. Pull one with `ollama pull model-name:cloud`, then run `ollama run model-name:cloud` and follow the link to connect your Ollama Cloud account. Once linked, select the cloud model from the Lean AI dropdown — it works the same as a local model.
Type /init in the chat panel to index your workspace and generate project context. Then describe what you want built.
If you prefer to build from source instead of installing from the marketplace:
```shell
cd backend
pip install -e ".[dev]"
```

Need cloud providers or reference library support? See optional extras.

Start the server manually:

```shell
uvicorn lean_ai.main:app --reload --port 8422
```

Build and package the extension:

```shell
cd extension && npm install && npm run build
npx vsce package --no-dependencies
```

Install the generated .vsix file: Extensions sidebar > ... menu > Install from VSIX...
When building from source, set lean-ai.backendDir or lean-ai.pythonPath in the extension settings — the automatic installer is skipped when either setting is explicitly configured.
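For reference, those two settings live in VS Code's settings.json (user or workspace). The paths below are placeholders for your own checkout and interpreter; setting either one disables the automatic installer, per the note above:

```json
{
  "lean-ai.backendDir": "/absolute/path/to/lean-ai/backend",
  "lean-ai.pythonPath": "/absolute/path/to/venv/bin/python"
}
```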
You: "Add user authentication with JWT tokens"
|
[Chat refines the idea]
|
[6-phase planning pipeline]
scope -> files -> design -> risks -> plan -> verify
|
[You review and approve]
|
[Agent executes step-by-step]
creates files, edits code, writes tests
|
[Post-execution validation]
auto-format -> lint fix -> lint check -> test
failures fed back to LLM for self-correction
|
[Changes committed on a branch]
/approve to merge, /reject to discard
Two modes:
- `/agent` — full planning pipeline for features and refactors
- `/fix` — skip planning, let the agent explore and fix directly
See Architecture for the full breakdown.
| Command | What it does |
|---|---|
| `/init` | Index workspace and generate project context |
| `/agent` | Send a task to the planning pipeline |
| `/fix` | Skip planning, fix directly with full tool access |
| `/approve` | Merge the agent's branch |
| `/reject` | Discard the agent's branch |
| `/style` | Generate style guide from CSS/templates |
| `/skill <name> <task>` | Load a local skill from `.lean_ai/skills/<name>/instructions.md` and apply it to the task |
| `/request <task>` | Skip planning, open-ended task with full tool access |
| `/resume [session_id]` | Resume a previous session |
| `/help` | Show this help |
| `/interview-prep` | Convert a .docx resume and tailor it for a specific role |
| `/batch-prep` | Tailor resumes + cover letters for many roles in one run |
| `/ats-check [slug]` | Keyword gap report comparing resume to the job description |
| `/thank-you [slug]` | Draft a post-interview thank-you note |
| `/recruiter-reply` | Draft a reply to a recruiter's cold outreach |
| `/negotiate [slug]` | Research market comp and build a negotiation brief |
| `/analyse-rejection [slug]` | Post-mortem a rejection with concrete takeaways |
| `/log-applied [slug]` | Append a tracker row and commit the application folder to git |
| `/mock-interview [slug]` | Interactive Q&A practice with rubric scoring |
| `/note` | Save a note (auto-categorized by project) |
| `/scaffold` | Bootstrap a new project |
| `/memories` | Manually trigger memory extraction from the last completed workflow session |
| `/reboot` | Restart the backend server |
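A skill is just a markdown file on disk. As a sketch, creating one could look like this; the skill name "commit-style" and its contents are made up, and only the `.lean_ai/skills/<name>/instructions.md` path comes from the command reference:

```shell
# Create a hypothetical skill that enforces a commit-message style.
mkdir -p .lean_ai/skills/commit-style
cat > .lean_ai/skills/commit-style/instructions.md <<'EOF'
Write commit messages in the imperative mood, subject line under 72 characters.
EOF
# Then, in the chat panel:
#   /skill commit-style "tidy up the commit history for the auth module"
```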
Create a backend/config.yaml file (or use the extension's settings panel):

```yaml
# Provider: ollama, openai, or anthropic
llm_provider: ollama

# Local model (default)
ollama_model: "qwen3-coder:30b"
ollama_context_window: 128  # 128k — shorthand for 131072

# Cloud API keys — encrypt with: python -m lean_ai encrypt-key <key>
openai_api_key: "enc:gAAAAABf..."
anthropic_api_key: "enc:gAAAAABf..."
```

Environment variables (LEAN_AI_*) and legacy .env files also work. See configuration priority.
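The LEAN_AI_* form can be handy in CI or containers. The exact variable names are documented in the configuration guide; the upper-cased key mapping below is an assumption, not verified here:

```shell
# Assumed mapping of config keys to LEAN_AI_* variables; check the
# configuration guide for the exact names before relying on these.
export LEAN_AI_LLM_PROVIDER=ollama
export LEAN_AI_OLLAMA_MODEL="qwen3-coder:30b"
```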
Use a local model for the bulk of the work and a cloud model only for planning and complex fixes:
```yaml
llm_provider: ollama
ollama_model: "qwen3-coder:30b"

# Expert: cloud model for planning phases 3-5 and final fix retry
expert_llm_provider: anthropic
anthropic_api_key: "enc:gAAAAABf..."
anthropic_expert_model: "claude-opus-4-6"
```

The expert model only runs for planning phases 3–5 (design + risk synthesis, plan assembly, verification) and the final validation fix retry. All codebase exploration, implementation, and routine tool calls use the primary local model.

Note: the cloud provider's Python SDK must be installed even when the primary provider is Ollama — run `pip install -e ".[dev,anthropic]"` or `pip install -e ".[dev,openai]"` as appropriate, then restart the server.
Use three different local models matched to each role:
```yaml
ollama_model: "qwen3-coder:30b"

# Expert: larger model for planning and complex reasoning
ollama_model_expert: "qwen3-coder-next:80b"

# Request: smaller model for chat conversation and prompt building
request_llm_provider: openai
openai_base_url: "http://localhost:11434/v1"
openai_request_model: "gpt-oss:20b"
```

See the full configuration reference for all options.
| Guide | Description |
|---|---|
| Configuration | All environment variables, extension settings, and model setup |
| Architecture | Planning pipeline, workflow modes, tools, and internals |
| Example Flow | End-to-end walkthrough of a real session with every guardrail called out |
| Reference Library & Refiner | Private docs, RAG enrichment, and cloud privacy |
| UI Verification | Vision-backed screenshot analysis for web pages and desktop GUIs |
| API Reference | REST endpoints and WebSocket protocol |
| Extension Guide | VSCode/VSCodium setup, commands, and settings |
| JetBrains Plugin | IntelliJ, PyCharm, WebStorm, CLion, and all JetBrains IDEs |
| Prompt Customization | Customizing LLM prompts per project |
| Skills Guide | How /skill works and how to author high-quality skills |
| Modelfile Guide | Customizing Ollama models with persistent rules |
| llama-server Guide | Using llama.cpp as an alternative to Ollama |
| Regulated Environments | Self-hosted deployment for HIPAA, SOX, GDPR, ITAR, and air-gapped networks |
- Python 3.10+
- Node.js 18+ (for the extension)
- At least one LLM provider:
  - Ollama with a capable model (e.g., `qwen3-coder:30b`) — free, local, no account needed
  - OpenAI API key (GPT-4o, etc.)
  - Anthropic API key (Claude, etc.)
Ollama is always required for inline predictions and embeddings, even when using cloud providers.
| Layer | Technology |
|---|---|
| Backend | Python, FastAPI, aiosqlite |
| LLM providers | Ollama, OpenAI, Anthropic |
| Code analysis | tree-sitter (13 languages) |
| Search | Whoosh BM25F + embedding RRF |
| VS Code Extension | TypeScript, VSCode API |
| JetBrains Plugin | Kotlin, IntelliJ Platform SDK, JCEF |
MIT
