Background
The current `tycoon ai` stack is hard-wired to LM Studio's local server at `localhost:1234`. LM Studio is a heavy desktop app — fine for development on a high-end machine, but not a realistic dependency for ordinary users.
Research findings
- OpenAI Codex CLI — cloud-only, conflicts with local-first design. Not worth pursuing.
- OpenAI gpt-oss-20b — Apache 2.0 open-weight model, needs ~16GB of RAM, available on Ollama. Viable locally, but adds Ollama as a dependency.
- Ollama — lighter than LM Studio, broader hardware support, CLI-first, no GUI. But still a separate install.
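For scale, Ollama's server is a single background process on `localhost:11434`, and talking to it needs nothing beyond the Python standard library. A minimal probe, assuming a small model has already been pulled with `ollama pull qwen2.5-coder:1.5b`:

```python
import json
import urllib.request

# Minimal probe of Ollama's native generate endpoint on localhost:11434.
# Assumes `ollama pull qwen2.5-coder:1.5b` was run beforehand.
body = json.dumps({
    "model": "qwen2.5-coder:1.5b",
    "prompt": "Reply with the single word: ok",
    "stream": False,  # return one JSON object instead of a token stream
}).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=body,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
    print(json.load(resp)["response"])
```

Ollama also exposes an OpenAI-compatible `/v1/chat/completions` route, which becomes relevant for the client question below.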
Goal
Find the single shortest path to running a capable model for the specific tasks tycoon needs:
- `TestFixer` — fix a failing dbt test
- `ColumnDocumenter` — generate schema.yml descriptions
- `StagingImprover` — refactor a staging model
These are focused, single-turn tasks with small prompts; they do not need a general-purpose chat model or a large context window.
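To make "focused, single-turn" concrete: each task reduces to one system-plus-user exchange. A sketch of what a `TestFixer` prompt could look like, noting that the helper name and prompt wording are illustrative, not tycoon's actual implementation:

```python
# Hypothetical shape of a single-turn TestFixer prompt. The helper name
# and prompt wording are illustrative, not tycoon's actual code.
def build_test_fixer_messages(test_sql: str, failure_output: str) -> list[dict]:
    system = ("You fix failing dbt tests. Reply with only the corrected SQL, "
              "no commentary.")
    user = (f"Failing test:\n{test_sql}\n\n"
            f"dbt failure output:\n{failure_output}")
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

A prompt like this usually fits in well under a few thousand tokens, which is why a small context window is enough.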
Design questions to answer
- What is the lightest runtime that can serve a small model locally without a GUI dependency? (Ollama, llama.cpp server, mlx, transformers pipeline?) See the client sketch after this list.
- Is there a model small enough to run on CPU-only hardware that is still useful for these tasks? (e.g. Qwen2.5-Coder-1.5B, Phi-3-mini)
- Should `tycoon ai` ship with a recommended model + one-line install command, rather than requiring the user to set up LM Studio separately?
- Can we reduce `tycoon.ai.client` to a single, minimal HTTP call with no backend abstraction layer?
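On the last two questions: LM Studio, Ollama, and llama.cpp's `llama-server` all expose the OpenAI-compatible chat endpoint, so the whole client can plausibly collapse to one standard-library function. A sketch, where `base_url` and `model` are caller-supplied configuration rather than anything tycoon currently defines:

```python
import json
import urllib.request

# Sketch: tycoon.ai.client reduced to a single function that works against
# any OpenAI-compatible server, e.g. LM Studio (http://localhost:1234) or
# Ollama (http://localhost:11434). No backend abstraction layer.
def chat(base_url: str, model: str, messages: list[dict],
         timeout: float = 120.0) -> str:
    body = json.dumps({"model": model, "messages": messages}).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Paired with the message builder above, a `TestFixer` run would be `chat("http://localhost:11434", "qwen2.5-coder:1.5b", build_test_fixer_messages(sql, output))`.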
Out of scope
- Multiple LLM backend options / provider abstraction layer
- Cloud API fallbacks
- OpenAI Codex CLI integration