Skip to content

Migrate shell workflows to Click-based Python CLI#1

Merged
se02035 merged 21 commits into
devfrom
cursor/click-cli-migration-2489
Apr 20, 2026
Merged

Migrate shell workflows to Click-based Python CLI#1
se02035 merged 21 commits into
devfrom
cursor/click-cli-migration-2489

Conversation

@se02035

@se02035 se02035 commented Apr 20, 2026

Copy link
Copy Markdown
Owner

Summary

Re-verified each inline finding against current code and applied only changes still needed.

Newly applied from this review pass

  • .pre-commit-config.yaml
    • Updated Ruff hook rev to v0.15.10.
    • Added ruff-format hook next to ruff-check.
  • pyproject.toml
    • Pinned runtime/dev dependencies to deterministic versions.
  • .env.example
    • Restored functional defaults for operational vars (OLLAMA host/model values) and reordered keys within logical sections.
  • src/local_llm_proxy/config.py
    • Implemented marker-based repo root discovery with fallback.
    • Kept env value normalization (redacted placeholders + whitespace handling).
  • src/local_llm_proxy/cli.py
    • setup stop now emits explicit Failed to stop proxy: ... ClickException messaging.
  • src/local_llm_proxy/services/proxy.py
    • Upgraded _write_virtual_key to atomic write with restrictive perms and fsync.
  • tests/test_config.py
    • Added coverage for repo root marker discovery.
  • tests/test_validate.py
    • Minor formatting updates from ruff-format.

Findings already satisfied before this pass (no additional changes required)

  • config/litellm-config.yaml model alias is env-driven (os.environ/LITELLM_MODEL_NAME), not hardcoded redacted literal.
  • README.md ngrok prerequisite already marked optional; Cursor virtual-key guidance present.
  • services/process_utils.py already normalized subprocess failures and routed output() via run_command.
  • services/validate.py already had guarded choices parsing and URL-based host normalization.
  • services/proxy.py already wrapped virtual-key seeding request/JSON errors.

Verification

  • pre-commit run --all-files
  • python3 -m pytest -q
  • docker compose --env-file .env.example -f config/docker-compose.yml config -q
Open in Web Open in Cursor 

Co-authored-by: Oliver L <se02035@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Apr 20, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

The pull request replaces a shell script-based local LLM proxy setup system with a Python CLI package architecture. A new local-llm-proxy command-line tool provides subcommands for proxy lifecycle management, Ollama model operations, and end-to-end validation. Configuration is centralized under config/ with redacted secrets, and a Python-based test suite replaces shell validation scripts. Legacy shell helpers and workflows are removed.

Changes

Cohort / File(s) Summary
Removed Shell Infrastructure
src/lib.sh, src/load-model.sh, src/proxy.sh, test/test-setup.sh
Deleted legacy Bash scripts including shared logging helpers, model loading, proxy lifecycle management, and end-to-end validation. Functionality migrated to Python CLI and services.
CI/CD Workflow Transition
.github/workflows/local-llm-proxy-lint.yml, .github/workflows/python-cli-quality.yml
Removed shell-focused linting workflow; added new Python-based quality workflow running Ruff, pytest, Docker Compose validation, and yamllint on Python 3.11.
Configuration & Secrets Management
config/.env.example, config/docker-compose.yml, config/litellm-config.yaml, config/.yamllint, .gitignore, src/config/.yamllint
Updated environment defaults to use [REDACTED] placeholders for sensitive values; consolidated YAML lint config to root config/ directory; adjusted .gitignore paths from src/config/ to config/ and added Python artifact patterns.
Python CLI & Core Architecture
src/local_llm_proxy/__init__.py, src/local_llm_proxy/cli.py, src/local_llm_proxy/config.py, src/local_llm_proxy/logging_utils.py
Introduced Python package with Click-based CLI entry point, settings dataclass/loader, and logging utilities for standardized timestamp formatting.
Service Layer Implementation
src/local_llm_proxy/services/__init__.py, src/local_llm_proxy/services/process_utils.py, src/local_llm_proxy/services/proxy.py, src/local_llm_proxy/services/models.py, src/local_llm_proxy/services/validate.py
Added modular services for subprocess execution, Docker Compose proxy lifecycle (start/stop/restart), Ollama model management (add/remove/list), and end-to-end validation with Ollama and LiteLLM connectivity checks.
Test Suite
tests/test_cli.py, tests/test_models.py, tests/test_proxy.py, tests/test_validate.py
Introduced comprehensive pytest coverage for CLI command handling, model operations, proxy startup/shutdown, virtual key seeding, and validation workflow.
Project Configuration & Documentation
pyproject.toml, .pre-commit-config.yaml, .coderabbit.yaml, README.md
Added setuptools-based project metadata with local-llm-proxy console script entry point, dev dependencies (pytest, ruff, pre-commit); configured pre-commit hook for Ruff; updated README from shell-script instructions to Python CLI usage patterns.

Sequence Diagrams

sequenceDiagram
    participant User
    participant CLI as CLI<br/>(cli.py)
    participant Config as Config<br/>(config.py)
    participant Proxy as Proxy Service<br/>(proxy.py)
    participant Docker as Docker Compose
    participant LiteLLM as LiteLLM<br/>Readiness
    participant Ngrok as Ngrok API

    User->>CLI: local-llm-proxy setup start
    CLI->>Config: load_settings()
    Config-->>CLI: Settings object
    CLI->>Proxy: start_proxy(settings)
    Proxy->>Docker: run docker compose up -d
    Docker-->>Proxy: container started
    Proxy->>LiteLLM: poll /health/readiness
    LiteLLM-->>Proxy: HTTP 200
    Proxy->>Proxy: _seed_virtual_key()
    Note over Proxy: Check cache or generate key
    Proxy->>LiteLLM: GET /v1/models
    LiteLLM-->>Proxy: model list
    Proxy->>LiteLLM: POST /key/generate
    LiteLLM-->>Proxy: virtual key
    Proxy->>Ngrok: GET /api/tunnels
    Ngrok-->>Proxy: public_url
    Proxy-->>CLI: {public_url, virtual_key}
    CLI-->>User: Display ngrok URL & credentials
Loading
sequenceDiagram
    participant User
    participant CLI as CLI<br/>(cli.py)
    participant Config as Config<br/>(config.py)
    participant Validate as Validate Service<br/>(validate.py)
    participant Ollama as Ollama API
    participant LiteLLM as LiteLLM API

    User->>CLI: local-llm-proxy validate
    CLI->>Config: load_settings()
    Config-->>CLI: Settings object
    CLI->>Validate: validate_setup(settings)
    Validate->>Validate: _normalize_ping_host()
    Validate->>Ollama: GET /api/tags
    Ollama-->>Validate: tags response (HTTP 200)
    Validate->>LiteLLM: POST /v1/chat/completions
    Note over LiteLLM: Auth: Bearer token<br/>Model: {model}.local
    LiteLLM-->>Validate: JSON response
    Validate->>Validate: Extract choices[0].message.content
    Validate-->>CLI: ValidationResult
    CLI-->>User: Display validation response content
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


🐰 Hop along, the scripts are gone,
A Python CLI takes the dawn,
With Click commands and services neat,
The proxy setup is now complete!
Secrets redacted, config aligned—
A cleaner way we've left behind! 🚀

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 35.16% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Migrate shell workflows to Click-based Python CLI' is concise, clear, and directly describes the primary change: replacing shell scripts with a Python CLI interface.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cursor/click-cli-migration-2489

Comment @coderabbitai help to get the list of available commands and usage tips.

cursoragent and others added 3 commits April 20, 2026 08:54
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
@se02035 se02035 marked this pull request as ready for review April 20, 2026 09:46
@se02035

se02035 commented Apr 20, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@se02035

se02035 commented Apr 20, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

Copy link
Copy Markdown
✅ Actions performed

Full review triggered.

@se02035

se02035 commented Apr 20, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai help

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

Copy link
Copy Markdown

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

CodeRabbit commands

These commands are invoked using PR/Issue comments.

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai evaluate custom pre-merge check --instructions <custom-checks-instructions> --name <custom-checks-title> [--mode <error|warning>] to validate the custom pre-merge checks instructions. Defaults to error when --mode is omitted.
  • @coderabbitai ignore pre-merge checks to override pre-merge checks and get an approval on PR.
  • @coderabbitai run pre-merge checks to run pre-merge checks on the pull request.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve merge conflict to automatically resolve merge conflicts.
  • @coderabbitai autofix to automatically fix issues identified in unresolved review comments.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai generate configuration to create a PR that adds the current resolved configuration as .coderabbit.yaml (or show it if already present).
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit configuration file (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, support, documentation and community

  • Visit our status page to check the current availability of CodeRabbit.
  • Create a ticket on our support page for assistance with any issues or questions.
  • Visit our documentation site for detailed information on how to use CodeRabbit.
  • Join our Discord community to connect with other users and get help from the community.
  • Follow us on X/Twitter for updates and announcements.

@se02035

se02035 commented Apr 20, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai please review from scratch the entire change.

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

Copy link
Copy Markdown

@se02035 Sure! I'll kick off a full review from scratch of all the changes in this PR right away.

✅ Actions performed

Full review triggered.

@se02035 se02035 self-assigned this Apr 20, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/.env.example`:
- Around line 9-12: The example env file replaces required non-secret runtime
defaults with “[REDACTED]”, making validation/runtime fail; restore sensible,
non-secret defaults for OLLAMA_HOST, OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL in
config/.env.example so the file is functional (e.g., localhost or a generic
model name) and keep secrets redacted elsewhere; update the lines containing
OLLAMA_HOST, OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL to use safe defaults and add
a short comment that these are non-secret operational defaults that should be
overridden in production.

In `@config/.yamllint`:
- Around line 3-7: The global yamllint rules block currently disables core
checks ("indentation", "line-length", "document-start", "truthy"); change it to
enable the essential checks (remove the global "disable" values or set them to
enabled) and instead add targeted per-file or per-pattern overrides for files
that truly need exceptions by using per-file rule entries (pattern keys with
their own "rules" section) so that "indentation" and "line-length" remain
enforced by default while only specific files use "indentation: disable" or
"line-length: disable" as needed.

In `@config/litellm-config.yaml`:
- Line 2: The config hardcodes model_name to "[REDACTED].local" which breaks
runtime lookup used by validate_setup; restore a functional alias that aligns
with the runtime variable OLLAMA_MODEL by changing model_name to use that alias
(e.g., "{OLLAMA_MODEL}.local" or "OLLAMA_MODEL.local") so requests built by
validate_setup and references to OLLAMA_MODEL resolve correctly; update the
model_name entry in config/litellm-config.yaml to that placeholder and ensure
any doc/comment explains it should not be redacted.

In `@local_llm_proxy/cli.py`:
- Around line 35-40: The setup_stop command currently calls load_settings() and
stop_proxy(settings) directly and can let RuntimeError bubble up; wrap the
stop_proxy call in the same try/except pattern used by setup
start/restart/validate so operational failures are converted to
click.ClickException: call load_settings(), then try: stop_proxy(settings)
except RuntimeError as e: raise click.ClickException(str(e)) so the CLI returns
a proper Click error instead of a raw exception.

In `@local_llm_proxy/config.py`:
- Around line 40-43: The config loader is treating placeholder tokens like
"[REDACTED]" as real values; update load_settings() so env-derived strings
(e.g., ollama_model, ollama_host, litellm_port, litellm_master_key) are
normalized by trimming whitespace and treating common redaction placeholders
(e.g., "[REDACTED]", "<REDACTED>", "REDACTED") as unset (convert them to empty
string or None) before returning settings; this ensures downstream code in
_seed_virtual_key and validate_setup sees a missing key rather than a truthy
placeholder and will behave correctly.

In `@local_llm_proxy/services/process_utils.py`:
- Around line 7-35: run_command currently only catches CalledProcessError and
leaves FileNotFoundError and TimeoutExpired to leak raw exceptions, and output
bypasses run_command; update run_command (function run_command) to also catch
FileNotFoundError and subprocess.TimeoutExpired and convert them into
RuntimeError with a clear message (honoring error_prefix when provided), and
refactor output (function output) to call run_command with capture_output=True
and return result.stdout so all subprocess failures are normalized through the
same error handling path.

In `@local_llm_proxy/services/proxy.py`:
- Around line 124-126: The virtual key is written with default permissions which
can be too permissive; in _write_virtual_key ensure the parent directory is
created with restrictive permissions (e.g., mode 0o700) and the file itself is
created or hardened to owner-read/write only (e.g., mode 0o600) after writing
the secret; use Path.mkdir(..., mode=0o700, exist_ok=True) for the directory and
call path.chmod(0o600) (or create the file with os.open and mode 0o600) after
writing to guarantee the key is not world-readable.
- Around line 100-117: Wrap the entire virtual-key seeding block (the
requests.get to "{base}/v1/models", parsing models_response.json(), the
requests.post to "{base}/key/generate", and extracting
key_response.json().get("key") into a try/except that catches
requests.RequestException and JSON/parse errors (ValueError, TypeError,
KeyError) and re-raises a RuntimeError with a clear message that includes the
original exception; use the existing local symbols (models_response,
key_response, virtual_key, settings.litellm_master_key, settings.litellm_port)
so callers still get a RuntimeError boundary instead of raw exceptions.

In `@local_llm_proxy/services/validate.py`:
- Around line 65-68: The code currently assumes payload["choices"][0] exists and
can raise IndexError; change the guard around payload parsing by explicitly
checking that payload.get("choices") is a list with at least one element and
that the first element contains a "message" with "content" before accessing it
(i.e., validate payload.get("choices") is instance of list and len(...)>0 and
payload["choices"][0].get("message") is not None), then set content from that
safe access; if any of those checks fail, raise the same RuntimeError with the
full payload to preserve the original error path (referencing the variables
payload, response.json(), and content in validate.py).
- Around line 17-22: The _normalize_ping_host function currently does broad
substring replacements of "host.docker.internal" and "ollama" anywhere in the
input string, which can corrupt full URLs; change it to parse the input as a URL
(or extract the hostname) and only replace the hostname portion — not the entire
string — so that only hostnames "host.docker.internal" or a hostname equal to
"ollama" are mapped to "localhost"; update _normalize_ping_host to reconstruct
the URL after modifying only the parsed netloc/hostname (or handle bare host
strings by checking equality), preserving schemes, ports, paths and query
strings.

In `@README.md`:
- Line 21: The README currently lists "Ngrok account and NGROK_AUTHTOKEN" as a
required prerequisite but tunneling is described as optional; update the
prerequisites line in README.md to mark ngrok as optional by changing the
wording to something like "Ngrok account and NGROK_AUTHTOKEN (optional — only
required for public tunneling)" so readers know it isn't needed for local-only
usage and preserve the existing phrasing for clarity.
- Line 87: Update the README guidance for Cursor auth to explicitly tell users
to use the virtual key printed by the CLI when running "local-llm-proxy setup
start": after instructing to set "Override OpenAI Base URL" to the ngrok URL
with "/cursor" appended, add a sentence that the CLI outputs a generated virtual
API key (the "virtual key") on setup start and that users must paste that
virtual key into Cursor's API key field (not their personal OpenAI key); mention
where in the CLI output the virtual key appears and preserve the existing note
about supported Cursor modes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 222e15bc-1acc-4058-97bd-918398d1fe43

📥 Commits

Reviewing files that changed from the base of the PR and between 6ae8ae6 and 37f5dc3.

📒 Files selected for processing (28)
  • .github/workflows/local-llm-proxy-lint.yml
  • .github/workflows/python-cli-quality.yml
  • .gitignore
  • .pre-commit-config.yaml
  • README.md
  • config/.env.example
  • config/.yamllint
  • config/docker-compose.yml
  • config/litellm-config.yaml
  • local_llm_proxy/__init__.py
  • local_llm_proxy/cli.py
  • local_llm_proxy/config.py
  • local_llm_proxy/logging_utils.py
  • local_llm_proxy/services/__init__.py
  • local_llm_proxy/services/models.py
  • local_llm_proxy/services/process_utils.py
  • local_llm_proxy/services/proxy.py
  • local_llm_proxy/services/validate.py
  • pyproject.toml
  • src/config/.yamllint
  • src/lib.sh
  • src/load-model.sh
  • src/proxy.sh
  • test/test-setup.sh
  • tests/test_cli.py
  • tests/test_models.py
  • tests/test_proxy.py
  • tests/test_validate.py
💤 Files with no reviewable changes (6)
  • src/lib.sh
  • src/config/.yamllint
  • src/load-model.sh
  • .github/workflows/local-llm-proxy-lint.yml
  • src/proxy.sh
  • test/test-setup.sh

Comment thread config/.env.example Outdated
Comment thread config/.yamllint Outdated
Comment thread config/litellm-config.yaml Outdated
Comment thread src/local_llm_proxy/cli.py
Comment thread src/local_llm_proxy/config.py Outdated
Comment thread src/local_llm_proxy/services/proxy.py Outdated
Comment thread src/local_llm_proxy/services/validate.py Outdated
Comment thread src/local_llm_proxy/services/validate.py
Comment thread README.md Outdated
Comment thread README.md Outdated
cursoragent and others added 2 commits April 20, 2026 10:00
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (6)
config/litellm-config.yaml (1)

2-2: ⚠️ Potential issue | 🟠 Major

Use a functional model alias instead of a redaction token.

Line 2 sets model_name to "[REDACTED].local", which is not a usable route and can break model resolution at runtime. This field needs a real alias that matches your expected request path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/litellm-config.yaml` at line 2, Replace the placeholder redaction
token in the config by setting the model_name field to a real, functional alias
(for example "my-model.local" or the actual route your runtime expects) so model
resolution can succeed; update the model_name entry in the configuration (the
model_name key in litellm-config.yaml) to a valid route/alias that matches your
service's request path.
config/.yamllint (1)

3-7: 🛠️ Refactor suggestion | 🟠 Major

Re-enable core YAML checks by default.

Disabling indentation and line-length globally (Lines 4-5) weakens lint signal for every YAML file. Keep these enabled by default and scope exceptions per-file only where needed.

Proposed tightening
 rules:
-  indentation: disable
-  line-length: disable
+  indentation:
+    spaces: consistent
+  line-length:
+    max: 120
   document-start: disable
   truthy: disable
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/.yamllint` around lines 3 - 7, The rules file currently disables core
checks "indentation" and "line-length" globally; update config/.yamllint so that
the "indentation" and "line-length" keys are not set to "disable" (remove or set
them to their defaults) and only keep or document exceptions per-file when
necessary; leave other entries like "document-start" and "truthy" as-is if you
want them intentionally disabled, but ensure "indentation" and "line-length" are
re-enabled at the global level so YAML linting enforces those checks by default.
config/.env.example (1)

9-12: ⚠️ Potential issue | 🟠 Major

Keep non-secret runtime defaults functional in .env.example.

Using [REDACTED] for OLLAMA_HOST, OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL (Lines 9-12) makes the sample config unusable for setup/validation. These should be safe operational defaults, not redacted secrets.

Proposed fix
-OLLAMA_HOST=[REDACTED]
-OLLAMA_MODEL=[REDACTED]
+OLLAMA_HOST=http://localhost:11434
+OLLAMA_MODEL=gemma3:4b
 # LiteLLM route (litellm-config.yaml). Compose sets this from OLLAMA_MODEL; set explicitly if not using Docker Compose.
-LITELLM_OLLAMA_MODEL=[REDACTED]
+LITELLM_OLLAMA_MODEL=ollama/gemma3:4b
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/.env.example` around lines 9 - 12, The .env.example currently uses
non-functional placeholders; replace the three variables OLLAMA_HOST,
OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL with safe, non-secret runtime defaults so
the sample config can be used for local setup/validation (e.g., set OLLAMA_HOST
to a local URL like http://localhost:11434, set OLLAMA_MODEL to a common local
model name used by Ollama, and set LITELLM_OLLAMA_MODEL to the same model name
or a sensible default) and ensure comments explain these are local/test defaults
and should be changed for production; update the lines for OLLAMA_HOST,
OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL accordingly.
local_llm_proxy/services/process_utils.py (1)

7-35: ⚠️ Potential issue | 🟠 Major

Normalize subprocess failures in one path (run_command) and reuse it from output().

At Line 17, only subprocess.CalledProcessError is normalized; FileNotFoundError and timeout failures can surface raw exceptions.
At Line 29, output() bypasses run_command, so error behavior diverges.

🔧 Proposed fix
 def run_command(
-    command: list[str], *, capture_output: bool = False, error_prefix: str | None = None
+    command: list[str],
+    *,
+    capture_output: bool = False,
+    error_prefix: str | None = None,
+    timeout: float | None = None,
 ) -> subprocess.CompletedProcess[str]:
     try:
         result = subprocess.run(
             command,
             check=True,
             text=True,
             capture_output=capture_output,
+            timeout=timeout,
         )
+    except FileNotFoundError as exc:
+        cmd = command[0] if command else "<empty command>"
+        prefix = f"{error_prefix}: " if error_prefix else ""
+        raise RuntimeError(f"{prefix}Command not found: {cmd}") from exc
+    except subprocess.TimeoutExpired as exc:
+        prefix = f"{error_prefix}: " if error_prefix else ""
+        raise RuntimeError(f"{prefix}Command timed out: {' '.join(command)}") from exc
     except subprocess.CalledProcessError as exc:
         detail = exc.stderr or exc.stdout or str(exc)
         if error_prefix:
             raise RuntimeError(f"{error_prefix}: {detail}".strip()) from exc
         raise RuntimeError(detail.strip()) from exc
     return result
@@
 def output(command: list[str]) -> str:
-    return subprocess.run(
-        command,
-        check=True,
-        text=True,
-        capture_output=True,
-    ).stdout
+    return run_command(command, capture_output=True).stdout
#!/bin/bash
set -euo pipefail

# Verify current exception handling + wrapper usage in process utility.
nl -ba local_llm_proxy/services/process_utils.py | sed -n '1,140p'

python - <<'PY'
from pathlib import Path
src = Path("local_llm_proxy/services/process_utils.py").read_text()
print("Has FileNotFoundError handling:", "FileNotFoundError" in src)
print("Has TimeoutExpired handling:", "TimeoutExpired" in src)
print("output() delegates to run_command:", "return run_command(command, capture_output=True).stdout" in src)
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@local_llm_proxy/services/process_utils.py` around lines 7 - 35, run_command
currently only normalizes subprocess.CalledProcessError, letting
FileNotFoundError and subprocess.TimeoutExpired escape; update run_command to
catch FileNotFoundError and subprocess.TimeoutExpired alongside
CalledProcessError and normalize them into RuntimeError with the same
error_prefix/detail behavior, and then change output() to delegate to
run_command(command, capture_output=True) and return the .stdout so both
functions share the same normalized error path; reference run_command and output
to find where to apply these changes.
README.md (2)

87-87: ⚠️ Potential issue | 🟡 Minor

Cursor auth guidance should use the CLI-printed virtual key.

Line 87 still points users to a generic configured key; this increases auth misconfiguration risk after setup start.

✏️ Proposed edit
-After `local-llm-proxy setup start`, use the printed ngrok URL. In Cursor: set **Override OpenAI Base URL** to your ngrok URL with `/cursor` appended (not `/v1`), and use your configured API key as documented in your LiteLLM setup. Supported Cursor modes may be limited (e.g. Ask/Plan); see Cursor docs for custom endpoints.
+After `local-llm-proxy setup start`, use the printed ngrok URL and printed **Virtual key**. In Cursor: set **Override OpenAI Base URL** to your ngrok URL with `/cursor` appended (not `/v1`), and use that virtual key for authentication. Supported Cursor modes may be limited (e.g. Ask/Plan); see Cursor docs for custom endpoints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` at line 87, The README currently tells users to use the
"configured API key" for Cursor which is wrong; update the guidance after
"local-llm-proxy setup start" to instruct users to use the CLI-printed virtual
key (not the generic configured key) when setting Cursor's "Override OpenAI Base
URL" to the printed ngrok URL with "/cursor" appended (not "/v1") and to paste
the CLI-printed virtual key into Cursor's API key field; change the text around
the "local-llm-proxy setup start" and "Override OpenAI Base URL" instructions to
explicitly say "use the CLI-printed virtual key" to avoid misconfiguration.

21-21: ⚠️ Potential issue | 🟡 Minor

Mark ngrok as optional in prerequisites.

Line 21 currently reads as mandatory, but tunneling is optional for local-only usage.

✏️ Proposed edit
-4. **Ngrok account** and `NGROK_AUTHTOKEN` for public tunneling.
+4. **Optional:** Ngrok account and `NGROK_AUTHTOKEN` (only if you need public tunneling).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` at line 21, Update the README prerequisite item that currently
reads "Ngrok account and NGROK_AUTHTOKEN for public tunneling" to indicate ngrok
is optional; change the text (the list item mentioning "Ngrok" and
"NGROK_AUTHTOKEN") to explicitly mark it as optional and note that the auth
token is only required for public tunneling while local-only usage does not need
ngrok.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@local_llm_proxy/cli.py`:
- Around line 61-78: The three CLI subcommands models_add, models_remove, and
models_list currently call add_model, remove_model, and list_models directly and
can surface raw RuntimeError tracebacks; wrap each command body in a try/except
that catches RuntimeError and re-raises as click.ClickException with the error
message (consistent with proxy_setup and cli_validate), i.e., update functions
models_add, models_remove, and models_list to call their respective service
functions inside try/except RuntimeError as e: raise
click.ClickException(str(e)).

In `@tests/test_proxy.py`:
- Around line 99-111: The test manually reconstructs a Settings instance from
_settings(tmp_path) which is verbose and brittle; replace that manual
reconstruction by using dataclasses.replace to copy the existing settings and
set litellm_port="5000" (and any other modified fields) so new fields are
preserved. Locate the test_seed_virtual_key_urls_use_litellm_port function,
import or reference dataclasses.replace, and call replace(settings,
litellm_port="5000") instead of re-instantiating Settings by hand.

---

Duplicate comments:
In `@config/.env.example`:
- Around line 9-12: The .env.example currently uses non-functional placeholders;
replace the three variables OLLAMA_HOST, OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL
with safe, non-secret runtime defaults so the sample config can be used for
local setup/validation (e.g., set OLLAMA_HOST to a local URL like
http://localhost:11434, set OLLAMA_MODEL to a common local model name used by
Ollama, and set LITELLM_OLLAMA_MODEL to the same model name or a sensible
default) and ensure comments explain these are local/test defaults and should be
changed for production; update the lines for OLLAMA_HOST, OLLAMA_MODEL, and
LITELLM_OLLAMA_MODEL accordingly.

In `@config/.yamllint`:
- Around line 3-7: The rules file currently disables core checks "indentation"
and "line-length" globally; update config/.yamllint so that the "indentation"
and "line-length" keys are not set to "disable" (remove or set them to their
defaults) and only keep or document exceptions per-file when necessary; leave
other entries like "document-start" and "truthy" as-is if you want them
intentionally disabled, but ensure "indentation" and "line-length" are
re-enabled at the global level so YAML linting enforces those checks by default.

In `@config/litellm-config.yaml`:
- Line 2: Replace the placeholder redaction token in the config by setting the
model_name field to a real, functional alias (for example "my-model.local" or
the actual route your runtime expects) so model resolution can succeed; update
the model_name entry in the configuration (the model_name key in
litellm-config.yaml) to a valid route/alias that matches your service's request
path.

In `@local_llm_proxy/services/process_utils.py`:
- Around line 7-35: run_command currently only normalizes
subprocess.CalledProcessError, letting FileNotFoundError and
subprocess.TimeoutExpired escape; update run_command to catch FileNotFoundError
and subprocess.TimeoutExpired alongside CalledProcessError and normalize them
into RuntimeError with the same error_prefix/detail behavior, and then change
output() to delegate to run_command(command, capture_output=True) and return the
.stdout so both functions share the same normalized error path; reference
run_command and output to find where to apply these changes.

In `@README.md`:
- Line 87: The README currently tells users to use the "configured API key" for
Cursor which is wrong; update the guidance after "local-llm-proxy setup start"
to instruct users to use the CLI-printed virtual key (not the generic configured
key) when setting Cursor's "Override OpenAI Base URL" to the printed ngrok URL
with "/cursor" appended (not "/v1") and to paste the CLI-printed virtual key
into Cursor's API key field; change the text around the "local-llm-proxy setup
start" and "Override OpenAI Base URL" instructions to explicitly say "use the
CLI-printed virtual key" to avoid misconfiguration.
- Line 21: Update the README prerequisite item that currently reads "Ngrok
account and NGROK_AUTHTOKEN for public tunneling" to indicate ngrok is optional;
change the text (the list item mentioning "Ngrok" and "NGROK_AUTHTOKEN") to
explicitly mark it as optional and note that the auth token is only required for
public tunneling while local-only usage does not need ngrok.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 27be6dbe-f210-4a30-94ac-9e7f3432c819

📥 Commits

Reviewing files that changed from the base of the PR and between 6ae8ae6 and 37f5dc3.

📒 Files selected for processing (28)
  • .github/workflows/local-llm-proxy-lint.yml
  • .github/workflows/python-cli-quality.yml
  • .gitignore
  • .pre-commit-config.yaml
  • README.md
  • config/.env.example
  • config/.yamllint
  • config/docker-compose.yml
  • config/litellm-config.yaml
  • local_llm_proxy/__init__.py
  • local_llm_proxy/cli.py
  • local_llm_proxy/config.py
  • local_llm_proxy/logging_utils.py
  • local_llm_proxy/services/__init__.py
  • local_llm_proxy/services/models.py
  • local_llm_proxy/services/process_utils.py
  • local_llm_proxy/services/proxy.py
  • local_llm_proxy/services/validate.py
  • pyproject.toml
  • src/config/.yamllint
  • src/lib.sh
  • src/load-model.sh
  • src/proxy.sh
  • test/test-setup.sh
  • tests/test_cli.py
  • tests/test_models.py
  • tests/test_proxy.py
  • tests/test_validate.py
💤 Files with no reviewable changes (6)
  • src/config/.yamllint
  • src/lib.sh
  • src/load-model.sh
  • .github/workflows/local-llm-proxy-lint.yml
  • src/proxy.sh
  • test/test-setup.sh

Comment thread src/local_llm_proxy/cli.py
Comment thread tests/test_proxy.py
Co-authored-by: Oliver L <se02035@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (11)
config/.yamllint (1)

3-7: 🧹 Nitpick | 🔵 Trivial

Globally disabled rules reduce linting effectiveness.

Disabling indentation and line-length at the global scope removes much of yamllint's value. These rules help maintain consistent YAML formatting. Consider enabling them with reasonable limits and using per-file overrides only where necessary.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/.yamllint` around lines 3 - 7, The global .yamllint rules currently
disable critical checks (indentation, line-length) which weakens linting; update
the rules block by re-enabling at least indentation and line-length with
sensible defaults (e.g., indentation: spaces/2 or 4, line-length: max 80 or 120)
and keep document-start/truthy as needed, and move any file-specific exceptions
into per-file overrides using the "extends" or "ignore" mechanism so the main
rules enforce consistent YAML formatting while only exempting specific files.
config/.env.example (1)

9-12: ⚠️ Potential issue | 🟠 Major

Redacted operational settings will cause validation failures.

OLLAMA_HOST, OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL (Lines 9-12) are operational configuration values, not secrets. Using [REDACTED] as a placeholder makes this example non-functional. Consider providing working defaults (e.g., http://localhost:11434, gemma3:12b) so users can copy this file and have a functional starting configuration, while keeping actual secrets like LITELLM_MASTER_KEY redacted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/.env.example` around lines 9 - 12, The example .env uses “[REDACTED]”
for operational (non-secret) settings (OLLAMA_HOST, OLLAMA_MODEL,
LITELLM_OLLAMA_MODEL), which breaks validation; update the .env.example to
provide sensible, functional defaults (e.g., a local host URL for OLLAMA_HOST
and a default model name for OLLAMA_MODEL and LITELLM_OLLAMA_MODEL) so users can
copy it and run locally, while still leaving true secrets (like
LITELLM_MASTER_KEY) redacted—modify the entries for OLLAMA_HOST, OLLAMA_MODEL,
and LITELLM_OLLAMA_MODEL accordingly.
config/litellm-config.yaml (1)

2-2: ⚠️ Potential issue | 🟠 Major

Model alias mismatch will cause validation failures.

The hardcoded alias "[REDACTED].local" on Line 2 won't match the dynamic alias constructed by validate_setup, which builds f"{OLLAMA_MODEL}.local" from the environment variable. Unless OLLAMA_MODEL is literally set to "[REDACTED]", runtime validation will fail to resolve this model.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/litellm-config.yaml` at line 2, The model_name in
config/litellm-config.yaml is hardcoded to "[REDACTED].local" which will not
match the dynamic alias built by validate_setup using f"{OLLAMA_MODEL}.local";
update model_name to use the same dynamic value (i.e., set it to the
environment-driven alias or make it configurable) so it matches the alias
constructed from the OLLAMA_MODEL env var used in validate_setup, and verify
code that references OLLAMA_MODEL and validate_setup aligns with that config
key.
README.md (1)

21-21: ⚠️ Potential issue | 🟡 Minor

Prerequisites still conflict with optional ngrok positioning.

Line 21 lists ngrok as required, but Line 3 frames tunneling as optional. Please mark ngrok/token as optional for local-only usage.

Suggested edit
-4. **Ngrok account** and `NGROK_AUTHTOKEN` for public tunneling.
+4. **Optional:** Ngrok account and `NGROK_AUTHTOKEN` (only required for public tunneling).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` at line 21, The README currently lists "Ngrok account" and
NGROK_AUTHTOKEN as required while earlier text indicates tunneling is optional;
update the prerequisites to mark ngrok and NGROK_AUTHTOKEN as optional for
local-only usage by changing the item "Ngrok account and NGROK_AUTHTOKEN" to
explicitly state "(optional, only required for public tunneling)" and/or move it
into an "Optional prerequisites" subsection; ensure references to tunneling
elsewhere (the earlier line that frames tunneling as optional) remain consistent
with this change and mention NGROK_AUTHTOKEN by name so readers know when it's
needed.
src/local_llm_proxy/services/process_utils.py (1)

7-35: ⚠️ Potential issue | 🟠 Major

Subprocess error normalization is still incomplete and inconsistent.

run_command (Line 7–22) still misses FileNotFoundError/TimeoutExpired, and output (Line 29–35) bypasses run_command, so callers can still receive raw subprocess exceptions.

Suggested patch
 def run_command(
-    command: list[str], *, capture_output: bool = False, error_prefix: str | None = None
+    command: list[str],
+    *,
+    capture_output: bool = False,
+    error_prefix: str | None = None,
+    timeout: float | None = None,
 ) -> subprocess.CompletedProcess[str]:
     try:
         result = subprocess.run(
             command,
             check=True,
             text=True,
             capture_output=capture_output,
+            timeout=timeout,
         )
+    except FileNotFoundError as exc:
+        cmd = command[0] if command else "<empty command>"
+        prefix = f"{error_prefix}: " if error_prefix else ""
+        raise RuntimeError(f"{prefix}Command not found: {cmd}") from exc
+    except subprocess.TimeoutExpired as exc:
+        prefix = f"{error_prefix}: " if error_prefix else ""
+        raise RuntimeError(f"{prefix}Command timed out: {' '.join(command)}") from exc
     except subprocess.CalledProcessError as exc:
         detail = exc.stderr or exc.stdout or str(exc)
         if error_prefix:
             raise RuntimeError(f"{error_prefix}: {detail}".strip()) from exc
         raise RuntimeError(detail.strip()) from exc
     return result
@@
 def output(command: list[str]) -> str:
-    return subprocess.run(
-        command,
-        check=True,
-        text=True,
-        capture_output=True,
-    ).stdout
+    return run_command(command, capture_output=True).stdout
#!/bin/bash
rg -n 'def run_command|except .*CalledProcessError|FileNotFoundError|TimeoutExpired|def output|subprocess\.run\(' src/local_llm_proxy/services/process_utils.py

Expected result after fix: catches for FileNotFoundError and TimeoutExpired present, and output() no longer calls subprocess.run directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/services/process_utils.py` around lines 7 - 35,
run_command currently only handles CalledProcessError and output calls
subprocess.run directly; update run_command to also catch FileNotFoundError and
subprocess.TimeoutExpired, normalize them into RuntimeError with the same
error_prefix behavior (include exception message and preserve cause), and ensure
output() delegates to run_command(command, capture_output=True) and returns the
.stdout from the CompletedProcess returned by run_command; reference run_command
and output to locate and modify the logic and remove direct subprocess.run usage
in output.
src/local_llm_proxy/services/validate.py (2)

17-22: ⚠️ Potential issue | 🟠 Major

Avoid broad URL string replacement in host normalization.

Line 18–Line 21 can rewrite unrelated URL portions (path/query/credentials). Only the hostname should be remapped.

💡 Proposed fix
+from urllib.parse import urlsplit, urlunsplit
@@
 def _normalize_ping_host(ollama_host: str) -> str:
-    if "host.docker.internal" in ollama_host:
-        return ollama_host.replace("host.docker.internal", "localhost")
-    if "ollama" in ollama_host:
-        return ollama_host.replace("ollama", "localhost")
-    return ollama_host
+    parts = urlsplit(ollama_host)
+    host = (parts.hostname or "").lower()
+    if host not in {"host.docker.internal", "ollama"}:
+        return ollama_host
+
+    auth = ""
+    if parts.username:
+        auth = parts.username
+        if parts.password:
+            auth += f":{parts.password}"
+        auth += "@"
+    port = f":{parts.port}" if parts.port else ""
+    netloc = f"{auth}localhost{port}"
+    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/services/validate.py` around lines 17 - 22, The
_normalize_ping_host function is performing blind string replacements on the
entire ollama_host which can change paths, queries or credentials; instead parse
the input as a URL, replace only the hostname portion when it equals
"host.docker.internal" or contains "ollama", then reconstruct the URL preserving
scheme, port, path, query and userinfo. Update _normalize_ping_host to use a URL
parser (e.g., urllib.parse.urlparse/urlunparse) to extract and modify the
netloc/hostname only, handle inputs that omit a scheme by defaulting to "http"
for parsing, and return the rebuilt URL with only the host portion remapped.

65-68: ⚠️ Potential issue | 🟠 Major

Guard completion payload shape before indexing choices[0].

Line 66 can raise IndexError on empty/malformed responses, bypassing your explicit RuntimeError contract.

💡 Proposed fix
     payload = response.json()
-    content = payload.get("choices", [{}])[0].get("message", {}).get("content")
+    choices = payload.get("choices")
+    if not isinstance(choices, list) or not choices:
+        raise RuntimeError(f"Failed to get valid response from proxy. Full response: {payload}")
+    first = choices[0] if isinstance(choices[0], dict) else {}
+    message = first.get("message", {}) if isinstance(first, dict) else {}
+    content = message.get("content") if isinstance(message, dict) else None
     if not content:
         raise RuntimeError(f"Failed to get valid response from proxy. Full response: {payload}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/services/validate.py` around lines 65 - 68, The current
code indexes choices[0] directly which can raise IndexError on empty/malformed
responses; instead, after payload = response.json() validate that
payload.get("choices") is a list with at least one element (e.g., choices =
payload.get("choices"); if not isinstance(choices, list) or not choices: raise
RuntimeError(...)), then extract content = choices[0].get("message",
{}).get("content") and keep the existing RuntimeError that includes the full
payload; update the block around response.json(), payload, choices and content
to perform this guarded check before indexing.
src/local_llm_proxy/services/proxy.py (2)

100-117: ⚠️ Potential issue | 🟠 Major

Wrap virtual-key seeding request/JSON failures in a RuntimeError boundary.

Line 102–Line 117 can raise raw request/parse exceptions, which bypasses the service-level error contract and degrades CLI error UX.

💡 Proposed fix
     headers = {"Authorization": f"Bearer {settings.litellm_master_key}"}
     base = f"http://localhost:{settings.litellm_port}"
-    models_response = requests.get(f"{base}/v1/models", headers=headers, timeout=5)
-    models_response.raise_for_status()
-    model_ids = [
-        item.get("id")
-        for item in models_response.json().get("data", [])
-        if item.get("id")
-    ]
-    payload = {"models": model_ids, "key_alias": "local-proxy-key"}
-    key_response = requests.post(
-        f"{base}/key/generate",
-        headers={**headers, "Content-Type": "application/json"},
-        json=payload,
-        timeout=5,
-    )
-    key_response.raise_for_status()
-    virtual_key = key_response.json().get("key")
+    try:
+        models_response = requests.get(f"{base}/v1/models", headers=headers, timeout=5)
+        models_response.raise_for_status()
+        model_ids = [
+            item.get("id")
+            for item in models_response.json().get("data", [])
+            if item.get("id")
+        ]
+        payload = {"models": model_ids, "key_alias": "local-proxy-key"}
+        key_response = requests.post(
+            f"{base}/key/generate",
+            headers={**headers, "Content-Type": "application/json"},
+            json=payload,
+            timeout=5,
+        )
+        key_response.raise_for_status()
+        virtual_key = key_response.json().get("key")
+    except (requests.RequestException, ValueError, TypeError) as exc:
+        raise RuntimeError("Failed to seed LiteLLM virtual key.") from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/services/proxy.py` around lines 100 - 117, Wrap the HTTP
calls and JSON parsing that seed the virtual key (the block that builds
headers/base, calls requests.get to fetch models into models_response,
constructs payload, posts to /key/generate into key_response, and reads
virtual_key) in a try/except that catches request exceptions
(requests.RequestException) and JSON/Key errors (ValueError, KeyError,
TypeError), and re-raise a RuntimeError with a clear message including context
(e.g., "failed to seed virtual key") and the original exception message; ensure
you reference models_response, key_response and virtual_key in the error context
so callers receive a consistent service-level RuntimeError instead of raw
parsing/request exceptions.

124-126: ⚠️ Potential issue | 🟠 Major

Persist virtual key with restrictive filesystem permissions.

Line 124–Line 126 writes credential material without explicit hardening; on permissive umask this can leak secrets.

💡 Proposed fix
 def _write_virtual_key(path: Path, value: str) -> None:
-    path.parent.mkdir(parents=True, exist_ok=True)
+    path.parent.mkdir(parents=True, exist_ok=True, mode=0o700)
     path.write_text(f"{value}\n", encoding="utf-8")
+    path.chmod(0o600)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/services/proxy.py` around lines 124 - 126, The
_write_virtual_key function currently writes sensitive credential material
without setting filesystem permissions; update it to create the parent directory
with restrictive mode (e.g., 0o700) and write the file with restrictive
permissions (e.g., 0o600) using an atomic write (write to a temp file then
os.replace) or an os.open+os.write call that sets mode atomically, and ensure
you call os.fsync before replace; adjust references to Path and
_write_virtual_key accordingly and handle any exceptions to avoid leaving
loose-permission artifacts.
src/local_llm_proxy/cli.py (1)

35-40: ⚠️ Potential issue | 🟠 Major

Apply the same RuntimeError→ClickException handling in setup stop.

Line 39 bypasses the CLI error contract used by start/restart/validate, so operational failures may surface as raw exceptions.

💡 Proposed fix
 `@setup.command`("stop")
 def setup_stop() -> None:
     """Stop proxy services."""
     settings = load_settings()
-    stop_proxy(settings)
+    try:
+        stop_proxy(settings)
+    except RuntimeError as exc:
+        err(str(exc))
+        raise click.ClickException(str(exc)) from exc
     click.echo("Teardown complete.")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/cli.py` around lines 35 - 40, The setup_stop function
currently calls stop_proxy(settings) and lets RuntimeError surface directly;
update setup_stop to mirror start/restart/validate by wrapping stop_proxy in a
try/except that catches RuntimeError (and any other operational exceptions you
want standardized) and raises click.ClickException with a clear message
(including the original error text) so CLI consumers get a ClickException
instead of raw exceptions; ensure click is imported and rethrow like: except
RuntimeError as e: raise click.ClickException(f"Failed to stop proxy: {e}").
src/local_llm_proxy/config.py (1)

40-43: ⚠️ Potential issue | 🟠 Major

Normalize redaction placeholders before building Settings.

Line 40–Line 43 currently accept placeholder tokens as real runtime values (e.g. [REDACTED]), which can pass truthiness checks and trigger avoidable auth/network failures downstream.

💡 Proposed fix
 def load_settings() -> Settings:
@@
-    return Settings(
+    def _clean(value: object, default: str = "") -> str:
+        text = str(value or "").strip()
+        if not text or text in {"[REDACTED]", "<REDACTED>", "REDACTED"}:
+            return default
+        return text
+
+    return Settings(
@@
-        ollama_model=str(env_values.get("OLLAMA_MODEL", "gemma3:12b")),
-        ollama_host=str(env_values.get("OLLAMA_HOST", "http://localhost:11434")),
-        litellm_port=str(env_values.get("LITELLM_PORT", "4000")),
-        litellm_master_key=str(env_values.get("LITELLM_MASTER_KEY", "")),
+        ollama_model=_clean(env_values.get("OLLAMA_MODEL"), "gemma3:12b"),
+        ollama_host=_clean(env_values.get("OLLAMA_HOST"), "http://localhost:11434"),
+        litellm_port=_clean(env_values.get("LITELLM_PORT"), "4000"),
+        litellm_master_key=_clean(env_values.get("LITELLM_MASTER_KEY"), ""),
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/local_llm_proxy/config.py` around lines 40 - 43, The Settings
construction currently uses raw env values (ollama_model, ollama_host,
litellm_port, litellm_master_key) which may contain redaction placeholders like
"[REDACTED]" and thus pass truthy checks; add a small normalizer (e.g.,
normalize_env_value) that treats common redaction tokens ("[REDACTED]",
"REDACTED", "<REDACTED>", "redacted") and empty/whitespace-only strings as
None/"" and apply it to each env_values.get(...) call before passing into the
Settings/constructor (references: ollama_model, ollama_host, litellm_port,
litellm_master_key) so downstream auth/network logic sees a clean missing value
rather than a placeholder.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.pre-commit-config.yaml:
- Around line 4-5: Add the ruff-format pre-commit hook next to the existing
ruff-check entry so formatting runs automatically; update the hooks list to
include a new item with id: ruff-format (alongside the current id: ruff-check)
and ensure any required args or stages mirror your ruff-check configuration so
formatting behavior is consistent across pre-commit runs.
- Line 3: The pre-commit hook for Ruff is pinned to rev: v0.14.0; update that
rev value to v0.15.10 in .pre-commit-config.yaml (the ruff-pre-commit hook
entry) or run pre-commit autoupdate --repo
https://github.com/charliermarsh/ruff-pre-commit --hook ruff to set the new
release; after updating, run pre-commit install and pre-commit run --all-files
and run your project's test suite to ensure the newer ruff version is compatible
with your Python version and existing rules.

In `@config/.env.example`:
- Around line 6-19: Reorder environment variable keys alphabetically within each
logical section (e.g., under "Ollama configuration" sort OLLAMA_HOST,
OLLAMA_MODEL; under the LiteLLM section sort LITELLM_MASTER_KEY,
LITELLM_OLLAMA_MODEL; under "LiteLLM UI Credentials" sort UI_PASSWORD,
UI_USERNAME; under "Database Credentials" sort POSTGRES_PASSWORD, POSTGRES_USER)
while preserving existing comments and section groupings so the .env.example
remains functionally identical but avoids UnorderedKey warnings from static
analysis.

In `@pyproject.toml`:
- Around line 7-22: The dependencies listed in pyproject.toml (the dependency
entries "click", "python-dotenv", "requests" and the dev entries "pytest",
"ruff", "pre-commit") are unpinned, causing non-deterministic CI installs;
update the dependencies and [project.optional-dependencies] dev list to include
explicit version specifiers (either exact pins like == or bounded ranges like >=
and <) for each package name referenced so CI installs are deterministic and
reproducible (e.g., replace the unpinned "click", "python-dotenv", "requests",
"pytest", "ruff", and "pre-commit" entries with pinned/versioned specifiers).

In `@src/local_llm_proxy/config.py`:
- Around line 24-31: _repo_root() currently returns
Path(__file__).resolve().parent.parent which yields the src/ directory in
src-layout repos, causing load_settings() to use src/config; update _repo_root()
to reliably locate the repository root (used by load_settings(), repo_root,
config_dir, env_file) by walking up parent directories until a repository marker
is found (e.g., .git, pyproject.toml, setup.cfg) and returning that directory,
with a fallback of Path(__file__).resolve().parents[2] if no marker is found.

---

Duplicate comments:
In `@config/.env.example`:
- Around line 9-12: The example .env uses “[REDACTED]” for operational
(non-secret) settings (OLLAMA_HOST, OLLAMA_MODEL, LITELLM_OLLAMA_MODEL), which
breaks validation; update the .env.example to provide sensible, functional
defaults (e.g., a local host URL for OLLAMA_HOST and a default model name for
OLLAMA_MODEL and LITELLM_OLLAMA_MODEL) so users can copy it and run locally,
while still leaving true secrets (like LITELLM_MASTER_KEY) redacted—modify the
entries for OLLAMA_HOST, OLLAMA_MODEL, and LITELLM_OLLAMA_MODEL accordingly.

In `@config/.yamllint`:
- Around line 3-7: The global .yamllint rules currently disable critical checks
(indentation, line-length) which weakens linting; update the rules block by
re-enabling at least indentation and line-length with sensible defaults (e.g.,
indentation: spaces/2 or 4, line-length: max 80 or 120) and keep
document-start/truthy as needed, and move any file-specific exceptions into
per-file overrides using the "extends" or "ignore" mechanism so the main rules
enforce consistent YAML formatting while only exempting specific files.

In `@config/litellm-config.yaml`:
- Line 2: The model_name in config/litellm-config.yaml is hardcoded to
"[REDACTED].local" which will not match the dynamic alias built by
validate_setup using f"{OLLAMA_MODEL}.local"; update model_name to use the same
dynamic value (i.e., set it to the environment-driven alias or make it
configurable) so it matches the alias constructed from the OLLAMA_MODEL env var
used in validate_setup, and verify code that references OLLAMA_MODEL and
validate_setup aligns with that config key.

In `@README.md`:
- Line 21: The README currently lists "Ngrok account" and NGROK_AUTHTOKEN as
required while earlier text indicates tunneling is optional; update the
prerequisites to mark ngrok and NGROK_AUTHTOKEN as optional for local-only usage
by changing the item "Ngrok account and NGROK_AUTHTOKEN" to explicitly state
"(optional, only required for public tunneling)" and/or move it into an
"Optional prerequisites" subsection; ensure references to tunneling elsewhere
(the earlier line that frames tunneling as optional) remain consistent with this
change and mention NGROK_AUTHTOKEN by name so readers know when it's needed.

In `@src/local_llm_proxy/cli.py`:
- Around line 35-40: The setup_stop function currently calls
stop_proxy(settings) and lets RuntimeError surface directly; update setup_stop
to mirror start/restart/validate by wrapping stop_proxy in a try/except that
catches RuntimeError (and any other operational exceptions you want
standardized) and raises click.ClickException with a clear message (including
the original error text) so CLI consumers get a ClickException instead of raw
exceptions; ensure click is imported and rethrow like: except RuntimeError as e:
raise click.ClickException(f"Failed to stop proxy: {e}").

In `@src/local_llm_proxy/config.py`:
- Around line 40-43: The Settings construction currently uses raw env values
(ollama_model, ollama_host, litellm_port, litellm_master_key) which may contain
redaction placeholders like "[REDACTED]" and thus pass truthy checks; add a
small normalizer (e.g., normalize_env_value) that treats common redaction tokens
("[REDACTED]", "REDACTED", "<REDACTED>", "redacted") and empty/whitespace-only
strings as None/"" and apply it to each env_values.get(...) call before passing
into the Settings/constructor (references: ollama_model, ollama_host,
litellm_port, litellm_master_key) so downstream auth/network logic sees a clean
missing value rather than a placeholder.

In `@src/local_llm_proxy/services/process_utils.py`:
- Around line 7-35: run_command currently only handles CalledProcessError and
output calls subprocess.run directly; update run_command to also catch
FileNotFoundError and subprocess.TimeoutExpired, normalize them into
RuntimeError with the same error_prefix behavior (include exception message and
preserve cause), and ensure output() delegates to run_command(command,
capture_output=True) and returns the .stdout from the CompletedProcess returned
by run_command; reference run_command and output to locate and modify the logic
and remove direct subprocess.run usage in output.

In `@src/local_llm_proxy/services/proxy.py`:
- Around line 100-117: Wrap the HTTP calls and JSON parsing that seed the
virtual key (the block that builds headers/base, calls requests.get to fetch
models into models_response, constructs payload, posts to /key/generate into
key_response, and reads virtual_key) in a try/except that catches request
exceptions (requests.RequestException) and JSON/Key errors (ValueError,
KeyError, TypeError), and re-raise a RuntimeError with a clear message including
context (e.g., "failed to seed virtual key") and the original exception message;
ensure you reference models_response, key_response and virtual_key in the error
context so callers receive a consistent service-level RuntimeError instead of
raw parsing/request exceptions.
- Around line 124-126: The _write_virtual_key function currently writes
sensitive credential material without setting filesystem permissions; update it
to create the parent directory with restrictive mode (e.g., 0o700) and write the
file with restrictive permissions (e.g., 0o600) using an atomic write (write to
a temp file then os.replace) or an os.open+os.write call that sets mode
atomically, and ensure you call os.fsync before replace; adjust references to
Path and _write_virtual_key accordingly and handle any exceptions to avoid
leaving loose-permission artifacts.

In `@src/local_llm_proxy/services/validate.py`:
- Around line 17-22: The _normalize_ping_host function is performing blind
string replacements on the entire ollama_host which can change paths, queries or
credentials; instead parse the input as a URL, replace only the hostname portion
when it equals "host.docker.internal" or contains "ollama", then reconstruct the
URL preserving scheme, port, path, query and userinfo. Update
_normalize_ping_host to use a URL parser (e.g.,
urllib.parse.urlparse/urlunparse) to extract and modify the netloc/hostname
only, handle inputs that omit a scheme by defaulting to "http" for parsing, and
return the rebuilt URL with only the host portion remapped.
- Around line 65-68: The current code indexes choices[0] directly which can
raise IndexError on empty/malformed responses; instead, after payload =
response.json() validate that payload.get("choices") is a list with at least one
element (e.g., choices = payload.get("choices"); if not isinstance(choices,
list) or not choices: raise RuntimeError(...)), then extract content =
choices[0].get("message", {}).get("content") and keep the existing RuntimeError
that includes the full payload; update the block around response.json(),
payload, choices and content to perform this guarded check before indexing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d6639344-5949-4576-bdb8-5ca2decc57b6

📥 Commits

Reviewing files that changed from the base of the PR and between 6ae8ae6 and 913e94c.

📒 Files selected for processing (29)
  • .coderabbit.yaml
  • .github/workflows/local-llm-proxy-lint.yml
  • .github/workflows/python-cli-quality.yml
  • .gitignore
  • .pre-commit-config.yaml
  • README.md
  • config/.env.example
  • config/.yamllint
  • config/docker-compose.yml
  • config/litellm-config.yaml
  • pyproject.toml
  • src/config/.yamllint
  • src/lib.sh
  • src/load-model.sh
  • src/local_llm_proxy/__init__.py
  • src/local_llm_proxy/cli.py
  • src/local_llm_proxy/config.py
  • src/local_llm_proxy/logging_utils.py
  • src/local_llm_proxy/services/__init__.py
  • src/local_llm_proxy/services/models.py
  • src/local_llm_proxy/services/process_utils.py
  • src/local_llm_proxy/services/proxy.py
  • src/local_llm_proxy/services/validate.py
  • src/proxy.sh
  • test/test-setup.sh
  • tests/test_cli.py
  • tests/test_models.py
  • tests/test_proxy.py
  • tests/test_validate.py
💤 Files with no reviewable changes (6)
  • src/load-model.sh
  • .github/workflows/local-llm-proxy-lint.yml
  • test/test-setup.sh
  • src/proxy.sh
  • src/lib.sh
  • src/config/.yamllint

Comment thread .pre-commit-config.yaml Outdated
Comment thread .pre-commit-config.yaml
Comment thread config/.env.example Outdated
Comment thread pyproject.toml
Comment thread src/local_llm_proxy/config.py
cursoragent and others added 9 commits April 20, 2026 10:40
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
cursoragent and others added 5 commits April 20, 2026 11:06
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
Co-authored-by: Oliver L <se02035@users.noreply.github.com>
@se02035 se02035 merged commit c7df307 into dev Apr 20, 2026
4 checks passed
@coderabbitai coderabbitai Bot mentioned this pull request Apr 20, 2026
Merged
@se02035 se02035 deleted the cursor/click-cli-migration-2489 branch May 18, 2026 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants