This directory contains language-oriented helper services that can plug into the broader OntoBot action pipelines. They provide:
- Deterministic (or model‑based) NL → SPARQL translation for ontology + telemetry joins.
- Local summarization / natural language post‑processing of analytics JSON and SPARQL results.
All runtime components in OntoBot are considered part of a flexible graph of pipelines; none is singled out as inherently “more core” than others. The Action Server can call these services when their corresponding features (translation or summarization) are desired.
Remote deployment note: Both `nl2sparql` and `ollama` can be hosted on separate machines or clusters. Set `NL2SPARQL_URL` and `SUMMARIZATION_URL` (e.g. to `http://translator.host:6005/nl2sparql` or `https://llm.gateway/api`) in the Action Server environment. As long as the remote endpoints are reachable over HTTP and follow the expected request/response contract, the local containers for these services are not required.
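As a minimal sketch of how the Action Server side might pick up these overrides, the snippet below reads the two variables and falls back to local defaults. The default host names and the helper name `resolve_service_urls` are assumptions for illustration; only the variable names and ports come from this README.

```python
import os

# Assumed local defaults: the host names mirror the compose service names,
# and the /nl2sparql path follows the example URL above. Adjust to your setup.
DEFAULT_NL2SPARQL_URL = "http://nl2sparql:6005/nl2sparql"
DEFAULT_SUMMARIZATION_URL = "http://ollama:11434"


def resolve_service_urls(env=None):
    """Return translator and summarizer endpoints, preferring env overrides."""
    env = os.environ if env is None else env
    return {
        # Trailing slashes are stripped so later path joins stay predictable.
        "nl2sparql": env.get("NL2SPARQL_URL", DEFAULT_NL2SPARQL_URL).rstrip("/"),
        "summarization": env.get("SUMMARIZATION_URL", DEFAULT_SUMMARIZATION_URL).rstrip("/"),
    }
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise in unit tests.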
- Purpose: Convert natural language questions into SPARQL queries.
- Service: `nl2sparql` on port 6005 (compose + extras overlay).
- Health: `GET /health`
- Model: `Transformers/t5_base/trained/checkpoint-3` (download the files from Hugging Face `suhasdevmane/nl2sparql-brick-t5-base`, or search for the repository `suhasdevmane/nl2sparql-brick` on Hugging Face)
- Volume: Mount `Transformers/t5_base/trained/checkpoint-3` read‑only into the container (see compose)
- ENV: `MODEL_PATH=/app/checkpoint-3`
If you omit this service, ensure the Action Server guards translation calls and either:
- Falls back to a rule/template set of SPARQL queries, or
- Routes the user toward intents that do not require dynamic SPARQL generation.
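The guard described above could look roughly like the following sketch. The request and response shape (`{"question": ...}` in, `{"sparql": ...}` out), the template set, and the function names are all assumptions, since this README does not spell out the translator contract.

```python
import json
import urllib.request

# Hypothetical template set: intent name -> canned SPARQL query.
TEMPLATE_QUERIES = {
    "list_sensors": (
        "PREFIX brick: <https://brickschema.org/schema/Brick#> "
        "SELECT ?sensor WHERE { ?sensor a brick:Sensor }"
    ),
}


def call_translator(question, url, timeout=5.0):
    """POST the question to the translator (payload and response keys are assumed)."""
    body = json.dumps({"question": question}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["sparql"]


def get_sparql(question, intent, url=None, translate=call_translator):
    """Try the translator when a URL is configured; otherwise use a template.

    Returns None when neither path yields a query, so the caller can route
    the user toward an intent that needs no dynamic SPARQL.
    """
    if url:
        try:
            return translate(question, url)
        except (OSError, KeyError, ValueError):
            pass  # unreachable service or unexpected payload: fall back
    return TEMPLATE_QUERIES.get(intent)
```

Injecting `translate` keeps the fallback logic testable without a running service.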
- Purpose: Summarization of SPARQL / analytics outputs, natural language refinement, and optional explanatory responses.
- Service: `ollama` on port 11434
- Health: root shows model list; use `ollama ps` inside container for process status
- GPU: Supported (comment out GPU section if running on CPU only)
- ENV recommendations:
  - `AUTO_PULL_MODELS=mistral:latest`
  - `WARMUP_MODELS=true` to pre‑generate tokens (reduces first‑response latency)
If you disable this service, set a flag (e.g., `DISABLE_SUMMARIZATION=true`) and ensure `actions.py` skips summarization stages gracefully (displaying raw metrics or structured JSON instead).
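One way such a skip could be wired is sketched below. The accepted truthy values and the helper names are assumptions; `DISABLE_SUMMARIZATION` is only the example flag name suggested above, and the real `actions.py` may structure this differently.

```python
import json
import os


def summarization_disabled(env=None):
    """Treat "1", "true", or "yes" (any case) as disabling summarization."""
    env = os.environ if env is None else env
    return env.get("DISABLE_SUMMARIZATION", "").strip().lower() in {"1", "true", "yes"}


def render_response(metrics, summarize=None, env=None):
    """Return an NL summary when enabled, else the metrics as structured JSON."""
    if summarize is not None and not summarization_disabled(env):
        try:
            return summarize(metrics)
        except Exception:
            pass  # summarizer failed: degrade to raw output instead of erroring
    return json.dumps(metrics, indent=2, sort_keys=True)
```

Falling back to JSON on any summarizer failure means a stopped or slow LLM container degrades the response rather than breaking it.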
- Swap checkpoints: adjust the mounted directory + `MODEL_PATH` env.
- Multiple translators: add additional services (e.g., `nl2sparql_large`) and implement a switching rule in the Action Server.
- Additional LLM runtimes: you can run a second `ollama`-based model or integrate vLLM; keep the summarization contract stable.
- Healthchecks: Always add a `/health` endpoint for orchestrator stability.
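A switching rule between translators can be very small. The sketch below routes long questions to the larger model; the word-count threshold, the URLs, and the function name are purely illustrative assumptions, and any rule that fits your workload would do.

```python
# Hypothetical registry: service name -> endpoint (hosts and paths are assumed).
TRANSLATORS = {
    "nl2sparql": "http://nl2sparql:6005/nl2sparql",
    "nl2sparql_large": "http://nl2sparql_large:6005/nl2sparql",
}


def pick_translator(question, long_question_words=20):
    """Route questions longer than the threshold to the larger translator."""
    is_long = len(question.split()) > long_question_words
    name = "nl2sparql_large" if is_long else "nl2sparql"
    return name, TRANSLATORS[name]
```

Keeping the rule in one function makes it easy to swap the heuristic later without touching the calling actions.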
If you run without one or both services:
| Skipped | Impact | Mitigation |
|---|---|---|
| nl2sparql | No automated SPARQL generation | Provide template SPARQL or restrict intents |
| ollama | No NL summaries; raw JSON/stat numbers shown | Deterministic text assembly in Python |
Document intentional feature toggles in the main README.md if you maintain a minimal deployment profile.
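The "deterministic text assembly" mitigation from the table might be as simple as the sketch below. The stat keys (`min`, `max`, `mean`) and the function name are assumptions about what the analytics JSON contains, not a description of the actual payload.

```python
def assemble_summary(sensor, stats, unit=""):
    """Build a fixed-format sentence from numeric stats, skipping missing keys."""
    labels = [("min", "minimum"), ("max", "maximum"), ("mean", "average")]
    parts = [
        f"{label} {stats[key]:.1f}{unit}"
        for key, label in labels
        if key in stats
    ]
    if not parts:
        return f"No statistics available for {sensor}."
    return f"{sensor}: " + ", ".join(parts) + "."
```

For example, `assemble_summary("CO2 sensor", {"min": 400, "max": 612.3}, " ppm")` yields `CO2 sensor: minimum 400.0 ppm, maximum 612.3 ppm.` without calling any model.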
curl http://localhost:6005/health
curl http://localhost:11434
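The same two probes can be run from Python, for instance at Action Server startup. This is a sketch using only the standard library; the function names and the injectable `opener` parameter are illustrative choices, while the URLs match the curl commands above.

```python
import urllib.request

# Same endpoints as the curl commands above.
SERVICES = {
    "nl2sparql": "http://localhost:6005/health",
    "ollama": "http://localhost:11434",
}


def is_healthy(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True when the endpoint answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


def check_all(services=SERVICES, **kwargs):
    """Probe every configured service and return a name -> bool map."""
    return {name: is_healthy(url, **kwargs) for name, url in services.items()}
```

The result of `check_all()` can drive the feature toggles: a `False` entry means the corresponding translation or summarization stage should be skipped.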