OmegaDesign
This page describes how Omega works under the hood. For the scientific publication, see: Nature Methods (2024). The preprint is available at 10.5281/zenodo.10828225.
```
Browser (Chat UI)
  |
  | WebSocket (ws://127.0.0.1:9000/chat)
  v
NapariChatServer (FastAPI + Uvicorn)
  |
  | async dialog loop
  v
OmegaAgent (LiteMind Agent)
  |
  | tool calls
  v
ToolSet (10 tools)
  |
  | to_napari_queue / from_napari_queue
  v
NapariBridge (@thread_worker)
  |
  | Qt event loop execution
  v
napari Viewer
```
The napari plugin entry point. A QWidget that provides the settings UI:
- Two model dropdowns (main and coding model) populated from all configured LLM providers
- Creativity level, personality selection, and feature checkboxes
- "Start" button that launches the chat server
- "Code Editor" button that opens the MicroPlugin window
Configuration is persisted via `AppConfiguration("omega")`, which stores settings in `~/.omega/config.yaml`.
A FastAPI application served by Uvicorn on a configurable port (default 9000, auto-incremented if busy). It provides:
- HTTP endpoint (`/`) serving the chat web interface (Jinja2 templates)
- WebSocket endpoint (`/chat`) for real-time bidirectional communication
- Chat response types: `start`, `thinking`, `tool_start`, `tool_activity`, `tool_end`, `final`, `error`
The server runs in a daemon thread. When the user sends a message, the server passes it to the OmegaAgent and streams tool progress and the final response back over WebSocket.
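The streaming protocol above can be sketched as a small message envelope. This is a minimal illustration, not the actual wire format: only the response-type names come from the source; the `type`/`content` field names and the `make_message` helper are assumptions.

```python
import json

# Hypothetical envelope for one chat event streamed over the /chat WebSocket.
# The set of response types is from the Omega docs; field names are assumed.
RESPONSE_TYPES = {"start", "thinking", "tool_start", "tool_activity",
                  "tool_end", "final", "error"}

def make_message(kind: str, payload: str) -> str:
    """Serialize a single chat event for the browser."""
    if kind not in RESPONSE_TYPES:
        raise ValueError(f"unknown response type: {kind}")
    return json.dumps({"type": kind, "content": payload})

print(make_message("tool_start", "NapariViewerControlTool"))
```

A typed envelope like this lets the browser dispatch on `type` to update the progress UI without parsing free-form text.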
Extends LiteMind's `Agent` class. Initialized via `initialize_omega_agent()`, which:
- Creates the LLM instances (main model + tool/coding model) via `get_llm()`
- Assembles the `ToolSet` with all available tools
- Configures the system prompt with personality and optional didactic mode
- Returns the configured agent
The agent uses LiteMind's built-in conversation management, tool dispatch, and response generation.
The critical thread-safety component. napari runs on the Qt main thread, while LLM processing happens on async/background threads. Direct cross-thread access to Qt objects would crash the application.
The bridge uses a queue-based architecture:
```
LLM Thread                            Qt Main Thread
    |                                      |
    |-- put(function) ----------> to_napari_queue
    |                                      |
    |                             @thread_worker yields
    |                                      |
    |                             qt_code_executor(function)
    |                                      |
    |                             function(viewer) executes
    |                                      |
    |<-- result ----------------- from_napari_queue
```
- `to_napari_queue` (max size 16): LLM thread puts callable functions
- `from_napari_queue` (max size 16): Qt thread returns results, or an `ExceptionGuard` on error
- `@thread_worker`: napari's decorator that safely yields work to the Qt event loop
- Timeout: 300 seconds by default for napari operations
Global viewer information is protected by a threading.Lock to prevent race conditions.
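The queue hand-off can be demonstrated with plain `queue.Queue` and `threading` objects. This is a simplified sketch: the "Qt main thread" is simulated by an ordinary thread, the viewer is a dict, and the `qt_side_executor` loop stands in for napari's `@thread_worker` machinery.

```python
import queue
import threading
import traceback

# Bounded queues, mirroring the max size 16 described above.
to_napari_queue: queue.Queue = queue.Queue(maxsize=16)
from_napari_queue: queue.Queue = queue.Queue(maxsize=16)

class ExceptionGuard:
    """Carries an exception (plus traceback) back across the queue boundary."""
    def __init__(self, exc: BaseException):
        self.exception = exc
        self.traceback = traceback.format_exc()

def qt_side_executor(viewer):
    """Stand-in for the @thread_worker loop on the Qt main thread:
    take one callable off the queue, run it against the viewer,
    and return the result (or an ExceptionGuard) on the other queue."""
    func = to_napari_queue.get()
    try:
        from_napari_queue.put(func(viewer))
    except Exception as exc:
        from_napari_queue.put(ExceptionGuard(exc))

# Simulate: the "LLM thread" (here, the main thread) submits a callable,
# and the "Qt thread" executes it against the viewer.
fake_viewer = {"layers": []}
worker = threading.Thread(target=qt_side_executor, args=(fake_viewer,))
worker.start()
to_napari_queue.put(lambda v: len(v["layers"]))
result = from_napari_queue.get(timeout=300)  # 300 s default timeout
worker.join()
print(result)  # -> 0
```

Because only opaque callables and results cross the boundary, no Qt object is ever touched from the LLM thread.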
The LLM abstraction layer uses LiteMind's CombinedApi to provide a unified interface to multiple providers:
- On first use, `get_litemind_api()` prompts for API keys for all configured providers (OpenAI, Anthropic, Gemini)
- `get_model_list()` returns all available models with text-generation capability
- `get_llm()` creates an `LLM` instance for a given model name and temperature
- `has_model_support_for()` checks model capabilities (e.g., vision, web search)
```
FunctionTool (LiteMind)
 └── BaseOmegaTool
      └── BaseNapariTool
           ├── NapariViewerControlTool
           ├── NapariViewerExecutionTool
           ├── NapariWidgetMakerTool
           ├── CellNucleiSegmentationTool
           └── ... (other napari tools)
```
- `BaseOmegaTool`: Wraps a `run_omega_tool(query)` method as a LiteMind `FunctionTool`
- `BaseNapariTool`: Adds LLM-based code generation, napari queue communication, and code execution. The workflow is:
  - Receive the request from the agent
  - Generate Python code via the LLM (using the tool-specific prompt template)
  - Package the code as a callable `delegated_function(viewer)`
  - Send it through `to_napari_queue` for Qt-safe execution
  - Wait for the result from `from_napari_queue`
  - Return the result (or an error with traceback) to the agent
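The "package the code as a callable" step can be sketched as a closure over the generated source. This is an illustrative simplification, assuming a dict-based stand-in viewer; the real `BaseNapariTool` adds sanitization, prompt templating, and error wrapping around this idea.

```python
# Stand-in for LLM-generated code that manipulates the viewer.
generated_code = "viewer['layers'].append('labels')"

def make_delegated_function(code: str):
    """Wrap generated source text in a callable that receives the viewer.
    The real implementation sanitizes and instruments the code first."""
    def delegated_function(viewer):
        exec(code, {"viewer": viewer})  # executed later, on the Qt side
        return viewer
    return delegated_function

fake_viewer = {"layers": []}
make_delegated_function(generated_code)(fake_viewer)
print(fake_viewer["layers"])  # -> ['labels']
```

Deferring execution into a callable is what allows the code to run on the Qt main thread via `to_napari_queue`, rather than on the thread that generated it.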
Tools are instantiated in _append_all_napari_tools() with a shared tool_context dict containing the LLM instance, napari queues, notebook reference, and verbosity settings. Conditional tools:
- `NapariViewerVisionTool`: Only added if `is_vision_available()` returns True
- `ImageDenoisingTool`: Excluded on Apple Silicon
- Built-in web search: Added if the main model supports `ModelFeatures.WebSearchTool`
OmegaToolCallbacks notifies the chat server of tool lifecycle events:
- `on_tool_start`: Tool begins execution
- `on_tool_activity`: Tool generates code, runs analysis, etc.
- `on_tool_end`: Tool completes successfully
- `on_tool_error`: Tool encounters an exception
These events are streamed to the browser via WebSocket for real-time progress display.
Encrypted API key storage at ~/.omega_api_keys/:
- Encryption: Fernet (AES-128-CBC) from the `cryptography` library
- Key derivation: PBKDF2-HMAC-SHA256, 390,000 iterations, 16-byte random salt
- Storage format: one JSON file per provider (`OpenAI.json`, `Anthropic.json`, `Gemini.json`) containing the base64-encoded encrypted key and salt
- Dialog: `APIKeyDialog` presents a first-time setup or returning-user interface depending on whether a key is already stored
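The key-derivation step above can be reproduced with the standard library alone (Fernet itself comes from the `cryptography` package, which is not shown here). The function name is illustrative; the PBKDF2-HMAC-SHA256 parameters match those described above.

```python
import base64
import hashlib
import os

def derive_fernet_key(password: str, salt: bytes = None):
    """Derive a Fernet-compatible key from a password.
    PBKDF2-HMAC-SHA256, 390,000 iterations, 16-byte random salt,
    as described above; the name of this helper is an assumption."""
    if salt is None:
        salt = os.urandom(16)
    # SHA-256's 32-byte digest is exactly the key length Fernet expects.
    raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 390_000)
    # Fernet keys are url-safe base64-encoded 32-byte values.
    return base64.urlsafe_b64encode(raw), salt

key, salt = derive_fernet_key("correct horse battery staple")
# Deterministic: same password + same salt always yields the same key,
# which is why the salt is stored alongside the encrypted key.
assert derive_fernet_key("correct horse battery staple", salt)[0] == key
```

Storing the salt next to the ciphertext is safe; the high iteration count is what makes brute-forcing the password expensive.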
`AppConfiguration` is a singleton (per app name) that stores settings in YAML format:
- Location: `~/.omega/config.yaml`
- Thread-safe singleton pattern with `threading.Lock`
- Auto-saves on any modification
- Default values loaded from the bundled config and merged with user overrides
Key configuration values:
- `port`: Default server port (9000)
- `open_browser`: Whether to auto-open the browser on start
- `notebook_path`: Where to save Jupyter notebook transcripts
- UI checkbox states (persisted across sessions)
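The per-app-name singleton pattern can be sketched as follows. This is a minimal illustration only: persistence to `~/.omega/config.yaml` is replaced by an in-memory dict, and the real class's API may differ.

```python
import threading

class AppConfiguration:
    """Sketch of a thread-safe, per-app-name singleton configuration.
    The real class loads/saves YAML on disk; here we keep values in memory."""
    _instances = {}
    _lock = threading.Lock()

    def __new__(cls, app_name: str):
        with cls._lock:  # guard against two threads creating the instance
            if app_name not in cls._instances:
                inst = super().__new__(cls)
                inst._values = {}  # would be loaded from the YAML file
                cls._instances[app_name] = inst
            return cls._instances[app_name]

    def __setitem__(self, key, value):
        self._values[key] = value  # the real class auto-saves here

    def __getitem__(self, key):
        return self._values[key]

a = AppConfiguration("omega")
b = AppConfiguration("omega")
a["port"] = 9000
print(b["port"])  # -> 9000, since a and b are the same instance
```

Keying the instance table by app name lets multiple plugins share the pattern without sharing a config file.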
- Create a class extending `BaseNapariTool` (for napari tools) or `BaseOmegaTool` (for general tools)
- Set `self.name` and `self.description` for LLM tool selection
- Implement the tool logic (code generation prompt or direct execution)
- Register the tool in `_append_all_napari_tools()` in `omega_init.py`
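The steps above might look like the following. This is a hypothetical illustration: the stand-in `BaseOmegaTool` and the `LayerListTool` example (including how the viewer is injected) are assumptions, not Omega's actual interface.

```python
class BaseOmegaTool:
    """Stand-in for Omega's real base class, for illustration only."""
    name = ""
    description = ""
    def run_omega_tool(self, query: str) -> str:
        raise NotImplementedError

class LayerListTool(BaseOmegaTool):
    # name/description are what the LLM sees when choosing a tool
    name = "layer_list_tool"
    description = "Lists the names of all layers open in the napari viewer."

    def __init__(self, viewer):
        # In Omega, dependencies arrive via the shared tool_context dict;
        # here we inject a dict-based stand-in viewer directly.
        self.viewer = viewer

    def run_omega_tool(self, query: str) -> str:
        return ", ".join(self.viewer["layers"]) or "no layers"

tool = LayerListTool({"layers": ["nuclei", "membranes"]})
print(tool.run_omega_tool("what layers are open?"))  # -> nuclei, membranes
```

A short, specific `description` matters: it is the only signal the LLM has when deciding which tool to call.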
LLM providers are managed by the LiteMind library. To add a new provider:
- Add the provider's API key name to `api_key.py`
- Ensure LiteMind supports the provider (or contribute support)
- The `CombinedApi` will automatically discover and expose the provider's models
Add a new entry to the `PERSONALITY` dictionary in `omega_agent/prompts.py`.