[MISC] Decommission prompt-service, old tools, SDK1 prompt module #1978

Merged
harini-venkataraman merged 20 commits into main from feat/phase5-decommission-old-components
Jun 29, 2026

Conversation

@harini-venkataraman

@harini-venkataraman harini-venkataraman commented May 20, 2026

Contributor

What

Phase 5 of the pluggable executor migration — decommission prompt-service, old tools (classifier, structure, text_extractor), and SDK1 prompt module from the OSS repo.

Why

These components have been fully replaced by the executor-based architecture (Phases 1–4). The prompt-service Flask app, old tool containers, and SDK1 prompt module are dead code that adds maintenance burden and CI cost.

How

  • prompt-service/: Entire Flask service removed (controllers, retrievers, indexing, plugins). All functionality now lives in `workers/executor/`.
  • tools/classifier, tools/structure, tools/text_extractor: Old tool containers removed. Structure tool routing preserved via `STRUCTURE_TOOL_IMAGE_*` env vars.
  • unstract/sdk1/prompt.py: Dead module removed (executor uses `answer_prompt.py` directly).
  • tox.ini: Removed `prompt-service` test environment (directory no longer exists).
  • Docker: Removed `prompt.Dockerfile`, compose service blocks, debug ports.
  • CI: Removed prompt-service from `production-build.yaml` matrix and old tools from `docker-tools-build-push.yaml`.
  • Backend/config: Removed `PROMPT_HOST`/`PROMPT_PORT` from settings, sample envs, workflow-execution constants.

Safety — preserved items

| Item | Why |
|---|---|
| `STRUCTURE_TOOL_IMAGE_*` (3 keys) | Used for structure tool routing |
| `REMOTE_PROMPT_STUDIO_FILE_PATH` | Prompt Studio data path |
| `workers/plugins/` | Active executor plugins |
| `PromptIdeBaseTool` | Still used for IDE indexing |

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. All removed components are dead code: prompt-service was replaced by executor workers in Phases 1–4, the old tool containers are unused (structure tool routing uses image env vars, not source), and SDK1 `prompt.py` had no remaining callers. Verified: zero import references to deleted modules, and 263 workers tests pass with no regressions.

Relevant Docs

  • `architecture-migration-phases.md` (repo root) — Phase 5 plan

Related Issues or PRs

Dependencies Versions / Env Variables

Removed env vars: `PROMPT_HOST`, `PROMPT_PORT`

No new dependencies or env vars added.
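
For anyone carrying a local env file across this change, a scan along these lines can flag leftover entries. This is only a minimal sketch and not part of the PR: `find_removed_vars` is a hypothetical helper, and the sample values are made up. The only names taken from the PR are `PROMPT_HOST` and `PROMPT_PORT`.

```python
REMOVED_VARS = ("PROMPT_HOST", "PROMPT_PORT")  # variables removed by this PR


def find_removed_vars(env_text: str, removed: tuple[str, ...] = REMOVED_VARS) -> list[str]:
    """Return removed variable names that are still actively assigned in env_text."""
    found: list[str] = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments; commented-out overrides are not active.
        if not line or line.startswith("#"):
            continue
        key = line.split("=", 1)[0].strip()
        # Tolerate the optional "export " prefix used in some env files.
        if key.startswith("export "):
            key = key[len("export "):].strip()
        if key in removed and key not in found:
            found.append(key)
    return found


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "PLATFORM_HOST=http://platform\nPROMPT_HOST=http://prompt\n# PROMPT_PORT=3003\n"
    print(find_removed_vars(sample))
```

Commented-out lines are ignored on purpose, so a stale commented override does not count as a leftover.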

Notes on Testing

  • Workers tests: 263 pass (no regressions)
  • Dangling reference scan: 0 matches for deleted modules
  • tox.ini updated to remove prompt-service test env
  • Docker compose validated (removed services, updated depends_on)
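
The dangling-reference scan above could be approximated with a short script like the following. This is a sketch under stated assumptions, not the scan that was actually run: the dotted module names in `DELETED_MODULES` are inferred from the deleted paths, and `imports_deleted_module` and `scan_tree` are hypothetical helpers.

```python
import ast
from pathlib import Path

# Assumed dotted names for the deleted modules, inferred from the removed paths.
DELETED_MODULES = ("unstract.prompt_service", "unstract.sdk1.prompt")


def imports_deleted_module(source: str, deleted: tuple[str, ...] = DELETED_MODULES) -> list[str]:
    """Return imported names in source that fall under a deleted module."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Also cover "from pkg import submodule" by checking pkg.submodule.
            names = [node.module] + [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if any(name == d or name.startswith(d + ".") for d in deleted):
                hits.append(name)
    return hits


def scan_tree(root: Path) -> dict[str, list[str]]:
    """Map each .py file under root to the deleted-module imports it contains."""
    report: dict[str, list[str]] = {}
    for path in root.rglob("*.py"):
        try:
            hits = imports_deleted_module(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed as Python source
        if hits:
            report[str(path)] = hits
    return report
```

An empty dictionary from `scan_tree(Path("."))` would correspond to the "0 matches" result. Parsing with `ast` instead of grepping avoids false positives from comments and strings, at the cost of missing dynamic imports.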

Screenshots

N/A — no UI changes.

Checklist

I have read and understood the Contribution Guidelines.

harini-venkataraman and others added 2 commits May 19, 2026 16:39
… (Phase 5)

Remove prompt-service source, Dockerfiles, and docker-compose entries.
Remove tools/classifier, tools/structure, tools/text_extractor directories.
Remove SDK1 prompt.py module and its tests.
Clean up PROMPT_HOST/PROMPT_PORT from backend settings, sample envs,
docker configs, and CI workflows. Remove prompt-service from uv-lock
scripts and production build workflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The prompt-service directory was deleted in the prior commit but tox.ini
still referenced it, which would break CI test runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 20, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR removes the prompt-service microservice and tools/structure, updates CI/CD workflows and Docker composition, and replaces PROMPT_* environment wiring with PLATFORM_* across backend, workers, SDK, and tooling scripts.

Changes

Service Removal and Redirection

| Layer / File(s) | Summary |
|---|---|
| **CI/CD, uv-lock, and tox updates**<br>`.github/workflows/docker-tools-build-push.yaml`, `.github/workflows/production-build.yaml`, `docker/scripts/uv-lock-gen/*`, `tox.ini` | Workflow dispatch choices and build-matrix updated (prompt-service removed, runner added); build TOTAL_SERVICES adjusted; uv-lock default directories/README updated; tox prompt-service test env removed. |
| **Docker compose and build ordering**<br>`docker/compose.debug.yaml`, `docker/docker-compose.yaml`, `docker/sample.compose.override.yaml`, `docker/docker-compose.build.yaml` | Removed prompt-service debug port and service definitions; backend depends_on replaced with x2text-service; platform-service and x2text-service reordered earlier in build composition; compose override cleaned. |
| **Backend and worker environment changes**<br>`backend/backend/settings/base.py`, `backend/sample.env`, `workers/sample.env`, `unstract/workflow-execution/src/unstract/workflow_execution/constants.py`, `unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py` | Replaced PROMPT_HOST/PORT with PLATFORM_HOST/PORT in settings and env samples; ToolRuntimeVariable and ToolsUtils stop declaring/injecting PROMPT_* into tool envs. |
| **SDK1 retry and tests**<br>`unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`, `unstract/sdk1/tests/*` | Preconfigured retry decorator switched from PROMPT_SERVICE to PLATFORM_SERVICE; related tests and fixture env cleanup updated/removed. |
| **Prompt-service removal (app code, utils, tests)**<br>`prompt-service/src/...`, `prompt-service/pyproject.toml`, `prompt-service/entrypoint.sh`, `prompt-service/*` | Controllers, services, retrievers, DTOs, helpers, extensions, utils, tests, and packaging/entrypoint files for prompt-service were removed or emptied. |
| **Tools/structure removal**<br>`tools/structure/*` | Structure tool sources, Dockerfile, configs, and docs removed/cleared; associated dockerignore/gitignore entries adjusted. |

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: decommissioning prompt-service, old tools, and SDK1 prompt module as part of Phase 5 of the executor migration. |
| Description check | ✅ Passed | The description is comprehensive and well-structured, covering what, why, how, safety considerations, testing, and breaking changes with thorough detail. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@greptile-apps

greptile-apps Bot commented May 20, 2026

Contributor

Greptile Summary

Phase 5 of the executor-migration decommissions the prompt-service Flask app, the tools/structure container, and the sdk1/prompt.py module. All related Docker, CI, env-var, and test scaffolding is cleaned up across 96 files.

  • prompt-service removed: Docker service, Dockerfile, compose blocks, debug ports, CI build-matrix entry, tox env, and all PROMPT_HOST/PROMPT_PORT references are consistently gone from backend settings, worker env, and ToolsUtils.
  • tools/structure removed: Dockerfile, source, and all version-bump / uv-lock-gen references removed; STRUCTURE_TOOL_IMAGE_* env vars are intentionally preserved for runtime routing.
  • sdk1/prompt.py and retry decorator removed: PromptTool, retry_prompt_service_call, and their tests are deleted; retry_utils.py docstrings updated accordingly.

Confidence Score: 5/5

This is a clean decommission of dead code — no runtime paths are affected because the executor-based replacements were already in production before this PR.

All removed components (prompt-service, tools/structure, sdk1/prompt.py) had no remaining callers per the PR's dangling-reference scan, and the env-var removals are matched consistently across backend settings, worker env, ToolsUtils, and constants. The only minor surprise is an undocumented image-name normalization for the text-extractor build artifact.

docker/docker-compose.build.yaml — the image name for tool-text_extractor changed from underscore to hyphen, which is undocumented and could affect anyone referencing the old image name locally or in downstream scripts.
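
Downstream scripts that still reference the old image name could be adapted with a small mapping like the one below. This is a hedged sketch: `normalize_image_name` and `RENAMED_IMAGES` are hypothetical names, and the single mapping entry is the rename described above. References with a registry prefix are deliberately left untouched.

```python
# Old -> new image repository names; the single entry is the rename noted above.
RENAMED_IMAGES = {"unstract/tool-text_extractor": "unstract/tool-text-extractor"}


def normalize_image_name(image: str) -> str:
    """Rewrite a renamed image reference, preserving any tag."""
    # Treat the text after the last colon as a tag only if it contains no slash,
    # so registry ports such as "localhost:5000/repo" are not mistaken for tags.
    repo, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        repo, sep, tag = image, "", ""
    return RENAMED_IMAGES.get(repo, repo) + sep + tag
```

For example, `normalize_image_name("unstract/tool-text_extractor:latest")` yields the hyphenated name with the `latest` tag kept, while unrelated image names pass through unchanged.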

Important Files Changed

| Filename | Overview |
|---|---|
| `docker/docker-compose.build.yaml` | Removes prompt-service and tool-structure build entries; renames tool-text_extractor image from `unstract/tool-text_extractor` to `unstract/tool-text-extractor` (underscore→hyphen). The image rename may silently break local build scripts referencing the old name. |
| `unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py` | Removes PROMPT_HOST/PROMPT_PORT reads (raise_exception=True) and their injection into tool container env vars. Clean removal matching the decommissioned prompt-service. |
| `unstract/workflow-execution/src/unstract/workflow_execution/constants.py` | Removes PROMPT_HOST and PROMPT_PORT string constants from ToolRuntimeVariable. Consistent with tools_utils.py changes. |
| `.github/workflows/docker-tools-build-push.yaml` | Removes tool-structure from build options, promotes tool-sidecar as default, and reorders the build-config branches. All remaining options (tool-sidecar, tool-classifier, tool-text-extractor) have valid Dockerfile paths. |
| `.github/workflows/production-build.yaml` | Removes prompt-service from the build matrix and decrements TOTAL_SERVICES from 7 to 6. The summary loop and service list are updated consistently. |
| `unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` | Removes retry_prompt_service_call decorator and cleans up its docstring reference in is_retryable_error. All callers (prompt.py) were also removed. |
| `docker/scripts/bump_sdk_v0_version.sh` | Removes prompt-service and structure tool from the version-bump script. Removes the update_structure_tool_version function and all references to STRUCTURE_DIR/PROMPT_SERVICE_DIR. |
| `docker/docker-compose.yaml` | Removes prompt-service container definition and its depends_on entry from the backend service. Clean removal with no dangling references left. |
| `workers/sample.env` | Removes PROMPT_HOST and PROMPT_PORT from both the active config section and the commented-out localhost overrides section. |
| `tox.ini` | Removes the [testenv:prompt-service] section and updates the comment listing remaining service envs. Consistent with tests/groups.yaml cleanup. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph REMOVED["Removed (Phase 5)"]
        PS[prompt-service Flask app]
        TS[tools/structure container]
        SP[sdk1/prompt.py PromptTool]
        RD[retry_prompt_service_call decorator]
        ENV[PROMPT_HOST / PROMPT_PORT env vars]
    end

    subgraph PRESERVED["Preserved"]
        EX[workers/executor]
        STR[STRUCTURE_TOOL_IMAGE_* env vars]
        PIB[PromptIdeBaseTool]
        PLG[workers/plugins/]
    end

    subgraph CI["CI / Build"]
        PB[production-build.yaml matrix: 7→6 services]
        TB[docker-tools-build-push.yaml default: tool-sidecar]
    end

    PS -.->|replaced by| EX
    TS -.->|routing via| STR
    SP -.->|functionality in| EX
    ENV -.->|removed from| PB
    TB -.->|tool-structure option removed| TB

Reviews (11): Last reviewed commit: "Trigger CI re-run" | Re-trigger Greptile

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docker/scripts/uv-lock-gen/README.md`:
- Line 5: The example sentence describing "transitive dependency changes"
references the service "workers" which is not in the enumerated list; update
that sentence to reference an existing listed service (e.g., replace "workers"
with "runner") so the example matches the enumerated services, or alternatively
add "workers" into the enumerated list; locate the sentence containing
"transitive dependency changes" and the example "unstract/sdk1" and make the
replacement/addition accordingly to keep the README consistent.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f274dae7-8d0e-4a6b-935a-8beec240f62e

📥 Commits

Reviewing files that changed from the base of the PR and between 0559057 and 7bdff5a.

⛔ Files ignored due to path filters (5)
  • prompt-service/src/unstract/prompt_service/tests/integration/input/sample1.pdf is excluded by !**/*.pdf
  • prompt-service/uv.lock is excluded by !**/*.lock
  • tools/classifier/src/config/icon.svg is excluded by !**/*.svg
  • tools/structure/src/config/icon.svg is excluded by !**/*.svg
  • tools/text_extractor/src/config/icon.svg is excluded by !**/*.svg
📒 Files selected for processing (114)
  • .github/workflows/docker-tools-build-push.yaml
  • .github/workflows/production-build.yaml
  • backend/backend/settings/base.py
  • backend/sample.env
  • docker/compose.debug.yaml
  • docker/docker-compose.build.yaml
  • docker/docker-compose.yaml
  • docker/dockerfiles/prompt.Dockerfile
  • docker/dockerfiles/prompt.Dockerfile.dockerignore
  • docker/sample.compose.override.yaml
  • docker/scripts/uv-lock-gen/README.md
  • docker/scripts/uv-lock-gen/uv-lock.sh
  • prompt-service/.gitignore
  • prompt-service/.python-version
  • prompt-service/README.md
  • prompt-service/entrypoint.sh
  • prompt-service/pyproject.toml
  • prompt-service/sample.env
  • prompt-service/src/unstract/prompt_service/__init__.py
  • prompt-service/src/unstract/prompt_service/config.py
  • prompt-service/src/unstract/prompt_service/constants.py
  • prompt-service/src/unstract/prompt_service/controllers/__init__.py
  • prompt-service/src/unstract/prompt_service/controllers/answer_prompt.py
  • prompt-service/src/unstract/prompt_service/controllers/extraction.py
  • prompt-service/src/unstract/prompt_service/controllers/health.py
  • prompt-service/src/unstract/prompt_service/controllers/indexing.py
  • prompt-service/src/unstract/prompt_service/core/index_v2.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/automerging.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/base_retriever.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/fusion.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/keyword_table.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/recursive.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/router.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/simple.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/subquestion.py
  • prompt-service/src/unstract/prompt_service/dto.py
  • prompt-service/src/unstract/prompt_service/exceptions.py
  • prompt-service/src/unstract/prompt_service/extensions.py
  • prompt-service/src/unstract/prompt_service/helpers/__init__.py
  • prompt-service/src/unstract/prompt_service/helpers/auth.py
  • prompt-service/src/unstract/prompt_service/helpers/postprocessor.py
  • prompt-service/src/unstract/prompt_service/helpers/prompt_ide_base_tool.py
  • prompt-service/src/unstract/prompt_service/helpers/usage.py
  • prompt-service/src/unstract/prompt_service/helpers/variable_replacement.py
  • prompt-service/src/unstract/prompt_service/run.py
  • prompt-service/src/unstract/prompt_service/services/__init__.py
  • prompt-service/src/unstract/prompt_service/services/answer_prompt.py
  • prompt-service/src/unstract/prompt_service/services/extraction.py
  • prompt-service/src/unstract/prompt_service/services/indexing.py
  • prompt-service/src/unstract/prompt_service/services/rentrolls_extractor/interface.py
  • prompt-service/src/unstract/prompt_service/services/retrieval.py
  • prompt-service/src/unstract/prompt_service/services/variable_replacement.py
  • prompt-service/src/unstract/prompt_service/tests/conftest.py
  • prompt-service/src/unstract/prompt_service/tests/integration/test_api_endpoints.py
  • prompt-service/src/unstract/prompt_service/tests/sample.env.test
  • prompt-service/src/unstract/prompt_service/tests/unit/__init__.py
  • prompt-service/src/unstract/prompt_service/tests/unit/conftest.py
  • prompt-service/src/unstract/prompt_service/tests/unit/test_retriever_llm.py
  • prompt-service/src/unstract/prompt_service/utils/__init__.py
  • prompt-service/src/unstract/prompt_service/utils/db_utils.py
  • prompt-service/src/unstract/prompt_service/utils/env_loader.py
  • prompt-service/src/unstract/prompt_service/utils/file_utils.py
  • prompt-service/src/unstract/prompt_service/utils/json_repair_helper.py
  • prompt-service/src/unstract/prompt_service/utils/log.py
  • prompt-service/src/unstract/prompt_service/utils/metrics.py
  • prompt-service/src/unstract/prompt_service/utils/request.py
  • tools/classifier/.dockerignore
  • tools/classifier/Dockerfile
  • tools/classifier/README.md
  • tools/classifier/__init__.py
  • tools/classifier/requirements.txt
  • tools/classifier/sample.env
  • tools/classifier/src/config/properties.json
  • tools/classifier/src/config/runtime_variables.json
  • tools/classifier/src/config/spec.json
  • tools/classifier/src/helper.py
  • tools/classifier/src/main.py
  • tools/structure/.dockerignore
  • tools/structure/.gitignore
  • tools/structure/Dockerfile
  • tools/structure/README.md
  • tools/structure/__init__.py
  • tools/structure/requirements.txt
  • tools/structure/sample.env
  • tools/structure/src/config/properties.json
  • tools/structure/src/config/runtime_variables.json
  • tools/structure/src/config/spec.json
  • tools/structure/src/constants.py
  • tools/structure/src/helpers.py
  • tools/structure/src/main.py
  • tools/structure/src/utils.py
  • tools/text_extractor/.dockerignore
  • tools/text_extractor/.gitignore
  • tools/text_extractor/Dockerfile
  • tools/text_extractor/README.md
  • tools/text_extractor/__init__.py
  • tools/text_extractor/requirements.txt
  • tools/text_extractor/sample.env
  • tools/text_extractor/src/config/properties.json
  • tools/text_extractor/src/config/runtime_variables.json
  • tools/text_extractor/src/config/spec.json
  • tools/text_extractor/src/example_package/__init__.py
  • tools/text_extractor/src/main.py
  • tools/text_extractor/tests/__init__.py
  • tox.ini
  • unstract/sdk1/src/unstract/sdk1/prompt.py
  • unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
  • unstract/sdk1/tests/conftest.py
  • unstract/sdk1/tests/test_prompt.py
  • unstract/sdk1/tests/utils/test_retry_utils.py
  • unstract/workflow-execution/src/unstract/workflow_execution/constants.py
  • unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py
  • workers/sample.env
💤 Files with no reviewable changes (100)
  • tools/classifier/src/config/properties.json
  • tools/text_extractor/.gitignore
  • tools/classifier/.dockerignore
  • prompt-service/.gitignore
  • prompt-service/sample.env
  • docker/dockerfiles/prompt.Dockerfile.dockerignore
  • tools/structure/sample.env
  • tools/structure/src/config/properties.json
  • prompt-service/README.md
  • docker/dockerfiles/prompt.Dockerfile
  • tools/text_extractor/src/config/properties.json
  • tools/text_extractor/README.md
  • tools/structure/requirements.txt
  • prompt-service/src/unstract/prompt_service/tests/sample.env.test
  • tools/classifier/src/config/runtime_variables.json
  • prompt-service/entrypoint.sh
  • docker/scripts/uv-lock-gen/uv-lock.sh
  • tools/classifier/README.md
  • tools/classifier/sample.env
  • workers/sample.env
  • prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/base_retriever.py
  • prompt-service/src/unstract/prompt_service/utils/db_utils.py
  • tools/structure/src/constants.py
  • tools/structure/Dockerfile
  • tools/structure/README.md
  • prompt-service/src/unstract/prompt_service/services/rentrolls_extractor/interface.py
  • tools/text_extractor/src/main.py
  • prompt-service/src/unstract/prompt_service/controllers/health.py
  • tools/text_extractor/.dockerignore
  • tools/text_extractor/src/config/runtime_variables.json
  • tools/text_extractor/requirements.txt
  • prompt-service/src/unstract/prompt_service/services/indexing.py
  • tools/text_extractor/Dockerfile
  • prompt-service/src/unstract/prompt_service/tests/unit/test_retriever_llm.py
  • unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
  • prompt-service/src/unstract/prompt_service/controllers/__init__.py
  • tools/classifier/src/config/spec.json
  • prompt-service/src/unstract/prompt_service/utils/file_utils.py
  • prompt-service/src/unstract/prompt_service/tests/integration/test_api_endpoints.py
  • prompt-service/src/unstract/prompt_service/utils/log.py
  • prompt-service/src/unstract/prompt_service/utils/metrics.py
  • prompt-service/src/unstract/prompt_service/extensions.py
  • unstract/sdk1/src/unstract/sdk1/prompt.py
  • unstract/workflow-execution/src/unstract/workflow_execution/constants.py
  • prompt-service/src/unstract/prompt_service/controllers/answer_prompt.py
  • tools/structure/src/config/runtime_variables.json
  • prompt-service/src/unstract/prompt_service/constants.py
  • prompt-service/src/unstract/prompt_service/tests/unit/conftest.py
  • tools/structure/.gitignore
  • prompt-service/src/unstract/prompt_service/controllers/indexing.py
  • tools/classifier/Dockerfile
  • prompt-service/src/unstract/prompt_service/services/variable_replacement.py
  • tools/structure/src/helpers.py
  • tools/structure/src/config/spec.json
  • prompt-service/src/unstract/prompt_service/helpers/auth.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/keyword_table.py
  • tools/text_extractor/src/config/spec.json
  • prompt-service/src/unstract/prompt_service/tests/conftest.py
  • unstract/sdk1/tests/utils/test_retry_utils.py
  • prompt-service/src/unstract/prompt_service/dto.py
  • prompt-service/.python-version
  • prompt-service/src/unstract/prompt_service/helpers/postprocessor.py
  • prompt-service/src/unstract/prompt_service/helpers/usage.py
  • docker/docker-compose.build.yaml
  • tools/classifier/src/main.py
  • backend/sample.env
  • prompt-service/src/unstract/prompt_service/exceptions.py
  • prompt-service/src/unstract/prompt_service/utils/json_repair_helper.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/automerging.py
  • unstract/sdk1/tests/conftest.py
  • prompt-service/src/unstract/prompt_service/core/index_v2.py
  • prompt-service/src/unstract/prompt_service/helpers/prompt_ide_base_tool.py
  • tools/text_extractor/sample.env
  • prompt-service/src/unstract/prompt_service/services/answer_prompt.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/recursive.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/subquestion.py
  • tools/structure/src/utils.py
  • prompt-service/src/unstract/prompt_service/utils/request.py
  • docker/compose.debug.yaml
  • prompt-service/src/unstract/prompt_service/services/extraction.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/router.py
  • prompt-service/src/unstract/prompt_service/config.py
  • tools/structure/.dockerignore
  • prompt-service/src/unstract/prompt_service/controllers/extraction.py
  • prompt-service/pyproject.toml
  • tools/classifier/src/helper.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/simple.py
  • tools/classifier/requirements.txt
  • docker/docker-compose.yaml
  • prompt-service/src/unstract/prompt_service/services/retrieval.py
  • prompt-service/src/unstract/prompt_service/run.py
  • prompt-service/src/unstract/prompt_service/helpers/variable_replacement.py
  • unstract/sdk1/tests/test_prompt.py
  • tools/structure/src/main.py
  • prompt-service/src/unstract/prompt_service/utils/env_loader.py
  • backend/backend/settings/base.py
  • prompt-service/src/unstract/prompt_service/core/retrievers/fusion.py
  • unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py
  • docker/sample.compose.override.yaml

Comment thread docker/scripts/uv-lock-gen/README.md
@harini-venkataraman harini-venkataraman changed the title from "[MISC] Phase 5: Decommission prompt-service, old tools, SDK1 prompt module" to "[MISC] Decommission prompt-service, old tools, SDK1 prompt module" May 20, 2026
pk-zipstack and others added 7 commits May 21, 2026 21:46
…1877)

* [FIX] Add hook for setting default adapters for invited users

Add setup_default_adapters_for_user() hook to AuthenticationService
and call it from set_user_organization() when an invited user joins
an existing organization. This allows the cloud plugin to set up
default triad adapters (LLM, embedding, vector DB, x2text) for
invited users, fixing silent failures in API deployment creation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update backend/account_v2/authentication_controller.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Praveen Kumar <praveen@zipstack.com>

* [FIX] Improve log message for setup_default_adapters_for_user

Address review comment: log user email and explain that default
adapters will not be set when the method is not implemented.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [MISC] Rename Default Triad to Default LLM Profile in UI

Update display label from "Default Triad" to "Default LLM Profile"
in the page heading and side navigation menu.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: Praveen Kumar <praveen@zipstack.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com>
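
The `setup_default_adapters_for_user()` hook described in the commit above might look roughly like this. Treat it as an illustrative sketch: the class bodies, the `user_email` and `is_new_org` parameters, the boolean return value, and `CloudAuthenticationService` are all assumptions made for the example. Only the hook name, the `AuthenticationService` and `set_user_organization()` names, and the logging intent come from the commit messages.

```python
import logging

logger = logging.getLogger(__name__)


class AuthenticationService:
    """Stand-in for the OSS service; the real class has many more methods."""

    def setup_default_adapters_for_user(self, user_email: str) -> bool:
        # OSS default: nothing to set up. A cloud plugin would override this.
        logger.info(
            "setup_default_adapters_for_user is not implemented; "
            "default adapters will not be set for %s",
            user_email,
        )
        return False


class CloudAuthenticationService(AuthenticationService):
    """Hypothetical plugin subclass that provisions the default adapters."""

    def setup_default_adapters_for_user(self, user_email: str) -> bool:
        logger.info("Setting default adapters for %s", user_email)
        return True


def set_user_organization(
    service: AuthenticationService, user_email: str, is_new_org: bool
) -> bool:
    """Call the hook only when an invited user joins an existing organization."""
    if is_new_org:
        return False  # new organizations follow a separate onboarding path
    return service.setup_default_adapters_for_user(user_email)
```

Keeping a no-op default on the base class lets the caller invoke the hook unconditionally, with the OSS build logging that nothing was configured.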
* [FIX] Wrap set_user_organization in transaction.atomic

The new-org branch creates the org row, then calls frictionless onboarding
and the initial platform key. Failures mid-flow leave an orphan org with no
adapters or key, and subsequent logins skip onboarding entirely (gated on
new_organization). Atomic ensures the org rolls back on any failure so
retries get a clean fresh-org path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [MISC] Worktree skill — use --no-track to prevent accidental main pushes

Without --no-track, a later `git push -u origin <branch>` can be reported
by the server as also fast-forwarding main, landing commits on main.

* [FIX] Use logger.exception in authorization_callback

Preserves the traceback when the OAuth callback hits the safety-net
catch. Behaviour unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com>
Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
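
The rollback behavior that the `transaction.atomic` commit above relies on can be illustrated without Django. The sketch below swaps in the standard-library `sqlite3` module, whose connection object commits on success and rolls back on an exception when used as a context manager. The table names and the `create_org` helper are hypothetical and do not reflect the real schema.

```python
import sqlite3


def create_org(conn: sqlite3.Connection, name: str, fail_onboarding: bool) -> None:
    """Insert an org row and run onboarding as one all-or-nothing unit."""
    # As a context manager, the connection commits on success and rolls back
    # if the block raises, which mirrors the intent of transaction.atomic.
    with conn:
        conn.execute("INSERT INTO org (name) VALUES (?)", (name,))
        if fail_onboarding:
            raise RuntimeError("onboarding failed")  # simulated mid-flow failure
        conn.execute("INSERT INTO platform_key (org_name) VALUES (?)", (name,))


def org_count(conn: sqlite3.Connection) -> int:
    """Count committed or pending rows in the org table."""
    return conn.execute("SELECT COUNT(*) FROM org").fetchone()[0]


def demo() -> tuple[int, int]:
    """Return the org count after a failed attempt and after a clean retry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE org (name TEXT)")
    conn.execute("CREATE TABLE platform_key (org_name TEXT)")
    conn.commit()
    try:
        create_org(conn, "acme", fail_onboarding=True)
    except RuntimeError:
        pass
    after_failure = org_count(conn)  # the orphan org row was rolled back
    create_org(conn, "acme", fail_onboarding=False)  # retry takes the fresh-org path
    after_retry = org_count(conn)
    conn.close()
    return after_failure, after_retry
```

Here `demo()` returns `(0, 1)`: the failed attempt leaves no orphan row, so the retry starts from a clean state.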
…1930)

* UN-3386 [FEAT] Add Prompt Studio HITL change indicator plugin slot

Wires up the host-side hooks for the prompt-change-indicator plugin
(implementation lives in unstract-cloud): a dynamic-import slot in
the prompt card Header for the indicator button, and a route at
:orgName/review/readonly/:documentId for the read-only audit view.
Both gates fall through gracefully when the plugin is absent (OSS).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3386 [FIX] Warn when ReadOnlyReviewPage loads without ReviewLayout

Addresses review feedback: the readonly route nests inside ReviewLayout
(manual-review plugin), so a deployment that ships prompt-change-indicator
without manual-review would silently fail to register the route. Log a
console.warn in that case to make the misconfiguration discoverable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3386 [FIX] Surface real plugin import errors in route loader

Bare catch in the prompt-change-indicator dynamic import was swallowing
syntax/runtime errors in the plugin file alongside the expected
"plugin missing in OSS" case. Detect the missing-module messages
explicitly and console.error anything else so a broken cloud plugin
no longer disables the readonly route silently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add OpenAI-compatible LLM adapter

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review feedback for custom OpenAI adapter

* Fix import formatting after rebase

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address follow-up review comments for OpenAI-compatible adapter

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refine OpenAI compatible adapter schema naming

* Reject empty model string in OpenAICompatibleLLMParameters

validate_model previously produced "custom_openai/" for an empty model,
surfacing as a confusing LiteLLM error at call time. Match the existing
GeminiLLMParameters.validate_model pattern: strip whitespace, raise
ValueError on empty input.
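
A minimal sketch of the check described above, assuming a plain function rather than the adapter's actual validator. The name `validate_model_sketch` and the skip-if-already-prefixed behaviour are illustrative assumptions; only the strip-then-raise rule and the `custom_openai/` prefix come from the commit text.

```python
def validate_model_sketch(model: str, prefix: str = "custom_openai/") -> str:
    """Hypothetical stand-in for the validator described in the commit."""
    cleaned = model.strip()
    if not cleaned:
        # Fail at validation time instead of emitting the bare prefix
        # "custom_openai/", which would only fail later at call time.
        raise ValueError("model must be a non-empty string")
    # Assumption of this sketch: do not prefix a name twice.
    return cleaned if cleaned.startswith(prefix) else prefix + cleaned
```

Raising early keeps the error next to the misconfigured field rather than deep inside the LLM call.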

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert SCHEMA_PATH plumbing; rename schema to custom_openai.json

Addresses Ritwik's review feedback. The new BaseAdapter.SCHEMA_PATH
class variable and the conditional branch in get_json_schema() are
unnecessary: OpenAICompatibleLLMAdapter.get_provider() returns
"custom_openai", and the default path resolution already builds
…/llm1/static/{get_provider()}.json. Renaming the schema file lets
the default lookup find it and keeps the base class untouched, which
is the convention every other adapter follows.

- Rename openai_compatible.json -> custom_openai.json
- Drop SCHEMA_PATH class var and the if-None branch from BaseAdapter
- Drop SCHEMA_PATH override (and unused os/ClassVar imports) from
  OpenAICompatibleLLMAdapter
- Update test_openai_compatible_schema_is_loadable to read schema via
  get_json_schema() instead of touching SCHEMA_PATH directly

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Hari John Kuriakose <hari@zipstack.com>
Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Athul <athul@zipstack.com>
Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com>
Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
* [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918)

* [FIX] Use importlib.util.find_spec for pluggable worker discovery

_verify_pluggable_worker_exists() previously checked for the literal file
`pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin
has been compiled to a .so (Nuitka, Cython, or any C extension) — the
module is perfectly importable but the pre-check rejects it because only
the .py extension is considered.

Replace the filesystem check with importlib.util.find_spec(), which is
Python's standard way to ask "is this module resolvable by the import
system?". It honors every registered finder — source .py, compiled .so,
bytecode .pyc, namespace packages, zipimports — so the function now
matches what its docstring claims: verifying the module can be loaded,
not that a specific file extension is present.

Behavior is preserved for existing deployments:
- Images with no `pluggable_worker/<name>/` subpackage → find_spec
  raises ModuleNotFoundError (ImportError subclass) → returns False.
- Images with source .py → find_spec resolves the .py → returns True.
- Images with compiled .so → find_spec resolves the .so → returns True.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Handle ValueError from find_spec in pluggable worker verification

Greptile-flagged edge case: importlib.util.find_spec() can raise
ValueError (not just ImportError) when sys.modules has a partially
initialised module entry with __spec__ = None from a prior failed import.
Broaden the except to catch both.
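
The check described in these two commits can be sketched roughly as follows. The function name `module_is_resolvable` is a placeholder, not the repository's `_verify_pluggable_worker_exists()`; only `importlib.util.find_spec` and the two exception types come from the commit text.

```python
import importlib.util


def module_is_resolvable(module_name: str) -> bool:
    """Return True if the import system can locate module_name.

    Placeholder helper for illustration only.
    """
    try:
        # find_spec consults every registered finder, so source files,
        # compiled extensions and zip imports are all treated alike.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError (incl. ModuleNotFoundError): a parent package is missing.
        # ValueError: sys.modules holds an entry whose __spec__ is None.
        return False
```

For a dotted name, `find_spec` imports the parent package first, which is why a missing parent surfaces as an exception rather than a `None` result.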

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Resolve api-deployment worker directory from enum import path

worker.py:452 did worker_type.value.replace("-", "_") to derive the
on-disk dir name. All WorkerType enum values already use underscores,
so the replace was a no-op; for API_DEPLOYMENT whose dir is
"api-deployment" (hyphen), it resolved to "api_deployment" and the
os.path.exists() check failed. Boot then logged a spurious
"❌ Worker directory not found: /app/api_deployment" at ERROR level.

The task registration path (builder + celery autodiscover via
to_import_path) is unaffected, so this was purely log noise — but
noise at ERROR level that masks real failures in log scans.

Fix: derive the directory from the authoritative to_import_path()
which already handles the hyphen case (api_deployment -> api-deployment).
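
To illustrate why the old derivation could never produce the hyphenated name, here is a small sketch. `WorkerTypeSketch`, `directory_name()` and the override table are hypothetical stand-ins; the real `WorkerType` enum and the return shape of `to_import_path()` are not shown in this PR.

```python
from enum import Enum

# Hypothetical override table: enum value -> on-disk directory name.
_DIR_OVERRIDES = {"api_deployment": "api-deployment"}


class WorkerTypeSketch(Enum):
    API_DEPLOYMENT = "api_deployment"
    CALLBACK = "callback"

    def directory_name(self) -> str:
        # Stand-in for the authoritative mapping the commit refers to.
        return _DIR_OVERRIDES.get(self.value, self.value)


def old_directory_name(worker_type: WorkerTypeSketch) -> str:
    # No-op: the enum values never contain a hyphen to replace.
    return worker_type.value.replace("-", "_")
```

Under these assumptions, `old_directory_name(WorkerTypeSketch.API_DEPLOYMENT)` yields the underscore form, while `directory_name()` yields the hyphenated directory that exists on disk.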

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944)

* [FEAT] Allow Bedrock to fall through to boto3's default credential chain

Match the S3/MinIO connector pattern: when AWS access keys are left blank
on the Bedrock LLM and embedding adapter forms, drop them from the kwargs
dict so boto3's default credential chain handles authentication. This
unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on
hosts that already have ambient AWS credentials (e.g. EKS workers with
IRSA, EC2 with an instance profile).

- llm1/static/bedrock.json: clarify access-key descriptions to mention
  IRSA and instance profile (already non-required at v0.163.2 base).
- embedding1/static/bedrock.json: drop aws_access_key_id and
  aws_secret_access_key from top-level required; same description fix;
  expose aws_profile_name for parity with the LLM form.
- base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters
  now strip empty access-key values from the validated kwargs before
  returning, so empty strings don't override boto3's default chain.
  AWSBedrockEmbeddingParameters fields gain explicit None defaults
  and an aws_profile_name field.

Backward-compatible: existing adapters with access keys filled in
continue to work unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FEAT] Add Authentication Type selector to Bedrock adapter form

Add an explicit `auth_type` selector with two options, making the auth
choice clear to users:

- "Access Keys" (default): existing flow, keys required
- "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on
  boto3's default credential chain (IRSA on EKS, task role on ECS,
  instance profile on EC2). Description on the selector explicitly notes
  this option is only for AWS-hosted Unstract deployments.

The form-only auth_type field is stripped before LiteLLM validation in
both AWSBedrockLLMParameters.validate() and
AWSBedrockEmbeddingParameters.validate(). Empty access keys continue to be
stripped so boto3 falls through to the default chain even when the
access_keys arm is selected without values (matches the S3/MinIO connector
pattern).

Backward-compatible: legacy adapters without auth_type behave as
"Access Keys" mode (the default), and existing keys are forwarded
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [REVIEW] Address Bedrock auth_type review feedback

Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on
PR #1944.

Behaviour fixes:
- Stale-key leak in IAM Role mode: switching an existing adapter from
  Access Keys to IAM Role would carry truthy stored access keys through
  the strip-empty-only loop, so boto3 silently authenticated with the
  old long-lived credentials instead of falling through to the host's
  IRSA / instance-profile identity. Both LLM and embedding paths were
  affected.
- Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or
  a malformed payload from a non-UI client passed through the dict
  comprehension untouched, with no enum guard.
- Cross-field validation gap: explicit Access Keys mode with blank or
  whitespace-only values silently fell through to the default
  credential chain instead of surfacing the misconfiguration.

Implementation:
- Add a module-level _resolve_bedrock_aws_credentials helper used by
  both AWSBedrockLLMParameters.validate() and
  AWSBedrockEmbeddingParameters.validate(), so the auth-type contract is
  expressed once.
  - Validates auth_type against an allowlist (None | "access_keys" |
    "iam_role"); raises ValueError on anything else.
  - iam_role: unconditionally drops aws_access_key_id and
    aws_secret_access_key.
  - access_keys (explicit): requires non-blank values; raises ValueError
    if either is empty or whitespace-only.
  - Legacy (auth_type absent): retains the lenient strip behaviour so
    pre-PR adapter configurations continue to deserialise unchanged.
- Restore aws_region_name as required (no `= None` default) on
  AWSBedrockEmbeddingParameters; only credentials may legitimately be
  absent.
- Drop the orphan aws_profile_name field from
  embedding1/static/bedrock.json: it was added for parity with the LLM
  form but lives outside the auth_type oneOf and contradicts the
  selector's "no further input" semantics. The LLM form already had
  aws_profile_name pre-PR and is left alone for backwards compatibility.
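
The auth-type contract listed above could look roughly like this. It is a sketch under stated assumptions, not the helper from base1.py: the function name, the error messages, and the decision to copy the input dict are all choices made here for illustration.

```python
from typing import Any

_ALLOWED_AUTH_TYPES = (None, "access_keys", "iam_role")
_KEY_FIELDS = ("aws_access_key_id", "aws_secret_access_key")


def _is_blank(value: Any) -> bool:
    return value is None or not str(value).strip()


def resolve_bedrock_credentials_sketch(params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical illustration of the three auth modes."""
    result = dict(params)  # work on a copy (a choice of this sketch)
    auth_type = result.pop("auth_type", None)  # form-only field
    if auth_type not in _ALLOWED_AUTH_TYPES:
        raise ValueError(f"Unknown auth_type: {auth_type!r}")

    if auth_type == "iam_role":
        # Drop keys unconditionally so stale stored values cannot leak.
        for field in _KEY_FIELDS:
            result.pop(field, None)
    elif auth_type == "access_keys":
        missing = [f for f in _KEY_FIELDS if _is_blank(result.get(f))]
        if missing:
            raise ValueError(f"Access Keys mode requires: {', '.join(missing)}")
    else:
        # Legacy payloads: lenient behaviour, strip only blank keys.
        for field in _KEY_FIELDS:
            if field in result and _is_blank(result[field]):
                del result[field]
    return result
```

Expressing the rules once keeps the LLM and embedding paths from drifting apart, which is the motivation the commit gives for the shared helper.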

Tests:
- New tests/test_bedrock_adapter.py covers 15 cases across LLM and
  embedding adapters: legacy-no-auth-type, explicit access_keys with
  valid/blank/whitespace keys, iam_role with stale/no keys, unknown
  auth_type rejection, cross-field validation, and preservation of
  unrelated params (model_id, aws_profile_name, region, thinking).

Skipped (P2 nice-to-have):
- Comment-scope clarification, MinIO reference rewording,
  validate-mutates-caller's-dict, and the LLM form description nit
  about aws_profile_name visibility. These don't change behaviour
  and can be addressed in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [HOTFIX] Bump litellm to 1.83.10 from PyPI to clear CVE-2026-42208 (#1976)

Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged
litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth
path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but
vulnerability scanners flag the installed package regardless of which
code path is reachable.

Bump to 1.83.10 — the exact version recommended by the upstream advisory
(v1.83.10-stable) and the smallest jump that clears the CVE range while
keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping
python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to
move 0.9 -> 0.12 to satisfy litellm's pin.

Switch source back to PyPI now that the PyPI quarantine is over,
reversing the temporary fork in #1873.

Cohere embed timeout patch: verified that
litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3,
v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug
fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains
OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on
the new version, confirming the monkey-patch still binds correctly.

Other cleanup from #1873:
- Drop git apt-install from worker-unified and tool Dockerfiles (no
  git-sourced deps remain in any uv.lock)
- Bump tool versions: structure 0.0.100 -> 0.0.101,
  classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76

Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing
corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that
blocked incremental resolution. Regenerated from scratch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Align cohere patch docstring with version-guard semantics

Reviewer flagged that the docstring claimed the patch is "confirmed in
every release between 1.82.3 and 1.83.14-stable", but the guard at
_PATCHED_LITELLM_VERSION activates only on the exact pinned version. A
future maintainer reading the old text could reasonably expect bumping
to e.g. 1.83.11 to keep the fix active; in reality it silently turns
off.

Rewritten to reference _PATCHED_LITELLM_VERSION as the single source of
truth and to drop the rot-prone "as of 2026-05-20" calendar date.
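
The exact-version semantics the docstring fix refers to can be shown with a tiny sketch. `_PATCHED_LITELLM_VERSION` is named in the commit; the function `patch_is_active` and the way the installed version is passed in are assumptions made for illustration.

```python
_PATCHED_LITELLM_VERSION = "1.83.10"


def patch_is_active(installed_version: str) -> bool:
    """Hypothetical guard: the patch applies only on an exact version match."""
    # No range comparison: any other version, newer or older, disables the patch.
    return installed_version == _PATCHED_LITELLM_VERSION
```

With this shape, bumping the dependency to 1.83.11 without also updating the constant makes `patch_is_active` return `False`, which is the silent turn-off the reviewer warned about.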

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
The atomic wrap from #1954 leaves the new org row uncommitted while
frictionless_onboarding HTTP-calls the LLMW portal mid-transaction.
The portal runs on a separate DB session and under READ COMMITTED
cannot see the uncommitted row, so the call returns 400 and the
caller silently persists an adapter with an empty unstract_key.
Every new signup since 2026-05-19 09:47 UTC ships a broken
free-trial X2Text adapter (401 on first OCR).

Hotfix only — Phase 2 (UN-3476) restructures the function so the
atomic guarantee is reapplied around just the pure-DB writes, with
HTTP and non-DB side effects moved outside the transaction.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ion-old-components

# Conflicts:
#	prompt-service/uv.lock
#	tools/classifier/Dockerfile
#	tools/classifier/src/config/properties.json
#	tools/structure/Dockerfile
#	tools/structure/src/config/properties.json
#	tools/text_extractor/Dockerfile
#	tools/text_extractor/src/config/properties.json
Comment thread .github/workflows/docker-tools-build-push.yaml
Comment thread .github/workflows/production-build.yaml
The Phase 5 decommission commit removed classifier, structure,
text_extractor, and prompt-service. However, text_extractor is still
in active use by customers. This surgically restores only the
text_extractor tool while keeping the other decommissions in place.

- Restore tools/text_extractor/ directory (14 files from origin/main)
- Add tool-text_extractor back to docker-compose.build.yaml
- Add tool-text-extractor back to docker-tools-build-push.yaml workflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/docker-tools-build-push.yaml:
- Around line 59-65: Replace direct inline checks of ${{
github.event.inputs.service_name }} with a single env variable (e.g.,
SERVICE_NAME) and a case whitelist that sets GITHUB_OUTPUT keys (context and
dockerfile) for known services ("tool-sidecar", "tool-text-extractor") and
otherwise prints an error and exits non‑zero to fail closed; update the branch
that currently echoes "context" and "dockerfile" to use the values chosen in the
case for SERVICE_NAME, and ensure unknown values trigger an explicit exit 1 so
$GITHUB_OUTPUT is never left unset for invalid inputs.

In `@docker/docker-compose.build.yaml`:
- Around line 33-37: The docker-compose service key tool-text_extractor
currently has an image name with an underscore
(unstract/tool-text_extractor:${VERSION}) which mismatches other places
expecting the hyphenated repo name; update the image: value for the
tool-text_extractor service to unstract/tool-text-extractor:${VERSION} (keep the
service key tool-text_extractor unchanged) so the locally built image name
matches the registry and CI naming used by run-platform.sh,
.github/workflows/docker-tools-build-push.yaml and public_tools.json.


📥 Commits

Reviewing files that changed from the base of the PR and between 0619756 and 093a6b4.

📒 Files selected for processing (2)
  • .github/workflows/docker-tools-build-push.yaml
  • docker/docker-compose.build.yaml

Comment thread .github/workflows/docker-tools-build-push.yaml
Comment thread docker/docker-compose.build.yaml
@jaseemjaskp jaseemjaskp self-requested a review May 27, 2026 04:54
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@jaseemjaskp jaseemjaskp left a comment

Automated review — Phase 5 decommission

Reviewed via the PR Review Toolkit (code review, silent-failure, comment/doc rot, test-coverage). Verdict: clean, internally-consistent decommission. All in-PR removals (PROMPT_HOST/PROMPT_PORT, retry_prompt_service_call, prompt-service/tools-structure build targets, tox env, compose blocks) are symmetric — definition, usages, and tests dropped together, no dangling references in changed files, no silent-failure risk, no real test-coverage gap (the deleted tests covered code that is also deleted or migrated to workers/ with equal-or-better coverage).

The findings below are residual references the decommission missed. Almost all live in files this PR does not touch, so they can't be anchored as inline comments — listing them here so the decommission is actually complete. (Cross-checked against existing CodeRabbit/maintainer comments — no duplicates.)

🔴 Functional — will break

  • docker/scripts/bump_sdk_v0_version.sh (lines 13, 17, 21, 24, 419–420): still defines PROMPT_SERVICE_DIR=.../prompt-service and STRUCTURE_DIR=.../tools/structure, includes PROMPT_SERVICE_DIR in SERVICE_DIRS, and calls reset_file "$PROMPT_SERVICE_DIR/pyproject.toml" / uv.lock. Both dirs are deleted by this PR, so the next SDK version bump will error / silently skip. Remove these entries.

🟡 Self-contradicting doc

  • docs/local-dev-setup-executor-migration.md:223-224: sample env still sets PROMPT_HOST=http://localhost / PROMPT_PORT=3003, while line 550 of the same doc has a checklist item "No dangling references to prompt-service, PromptTool, PROMPT_HOST, PROMPT_PORT." Remove lines 223–224.

🔵 Stale docs/comments (harmless but misleading)

  • unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py:280 — docstring still lists prompt-service as a retry target (this file is edited by the PR; see inline note).
  • unstract/sdk1/src/unstract/sdk1/llm.py:887 — class docstring references "The prompt-service's RetrieverLLM" (that class lived in the deleted prompt-service; it now lives in workers/executor/.../retrievers/retriever_llm.py).
  • docker/README.md:105 — debug-port table still has | prompt-service | 5681 | (the matching compose.debug.yaml entry was removed in this PR).
  • workers/executor/README.md:25 — service table still lists prompt-service.
  • workers/ARCHITECTURE.md:37 — cites prompt-service/ as an existing pattern to align with.
  • backend/migrating/v2/README.md:64 — instructs setting DB_SCHEMA in prompt-service .env.
  • unstract/core/src/unstract/core/plugins/README.md:88 — enumerates prompt-service as a current service.
  • workers/file_processing/structure_tool_task.py:42,197,675 — comments "mirrored from / replicates tools/structure/src/..." now point at deleted source; reword to mark this file as the canonical implementation.
  • docker/dockerfiles/*.Dockerfile.dockerignore (backend/frontend/runner/platform/tool-sidecar/x2text) still ignore prompt-service — dead no-op entries.

ℹ️ Optional test note

  • The deleted prompt-service .../tests/unit/test_retriever_llm.py::TestRetrieverLLM unit-tested RetrieverLLM method delegation + llama-index↔sdk1 type conversion. The migrated workers/.../retrievers/retriever_llm.py retains those methods, but workers/tests/test_retrieval.py only asserts isinstance(...) with __init__ mocked — the conversion/delegation logic is now unit-test-uncovered. Low severity; consider porting TestRetrieverLLM into workers/tests/.

✅ Confirmed consistent

production-build.yaml (matrix = 6, TOTAL_SERVICES=6, summary loop = same 6); sample.env STRUCTURE_TOOL bump 0.0.100→0.0.101 (URL + TAG both updated); tools_utils.py/constants.py/base.py/retry_utils.py removals fully symmetric.

Comment thread unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
Resolve 3 conflicts:
- prompt-service/pyproject.toml: delete (decommissioned)
- prompt-service/uv.lock: delete (decommissioned)
- tox.ini: adopt main's test rig system, remove [testenv:prompt-service]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tox.ini`:
- Around line 2-76: The failing rig validation is caused by a stale group
"unit-prompt-service" in tests/groups.yaml that references a removed workdir;
open tests/groups.yaml, locate the group named unit-prompt-service and either
delete that entire group entry or add optional: true beneath it so the rig will
skip the missing path, then run python -m tests.rig validate (or tox -e unit) to
confirm the error is resolved.


📥 Commits

Reviewing files that changed from the base of the PR and between 13cfe5b and db834ba.

📒 Files selected for processing (4)
  • backend/backend/settings/base.py
  • backend/sample.env
  • tox.ini
  • workers/sample.env
💤 Files with no reviewable changes (1)
  • workers/sample.env
✅ Files skipped from review due to trivial changes (1)
  • backend/sample.env
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/backend/settings/base.py

@coderabbitai coderabbitai Bot left a comment

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

🛑 Comments failed to post (1)
tox.ini (1)

2-76: ⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify groups.yaml handling of unit-prompt-service after prompt-service removal

# Check if unit-prompt-service is still defined in groups.yaml
echo "=== Checking for unit-prompt-service in groups.yaml ==="
rg -n "unit-prompt-service" tests/groups.yaml

# Check if the paths referenced by unit-prompt-service exist
echo -e "\n=== Checking if prompt-service test paths exist ==="
if [ -d "prompt-service/src/unstract/prompt_service/tests/unit" ]; then
  echo "❌ prompt-service test directory still exists"
else
  echo "✓ prompt-service test directory removed as expected"
fi

# Validate that tox -e unit would work after changes
echo -e "\n=== Validating test manifest ==="
python -m tests.rig validate 2>&1 | head -20

Repository: Zipstack/unstract

Length of output: 376


Fix stale unit-prompt-service group in tests/groups.yaml after prompt-service/ removal

  • tests/groups.yaml still defines unit-prompt-service: (~line 50), but the referenced prompt-service workdir is gone; python -m tests.rig validate errors: group 'unit-prompt-service': workdir does not exist: .../prompt-service.
  • Remove the unit-prompt-service group entirely or mark it optional: true so the rig skips it when paths are missing; this is necessary for tox -e unit to succeed.
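
A rough sketch of the workdir check and the `optional: true` escape hatch mentioned above. The schema (a `workdir` key and an `optional` flag per group) and the function name are assumptions; the actual tests.rig implementation is not shown in this PR. Groups are passed as an already-parsed dict so the sketch needs no YAML library.

```python
from pathlib import Path


def validate_groups_sketch(groups: dict[str, dict], repo_root: Path) -> list[str]:
    """Hypothetical validator: report groups whose workdir is missing."""
    errors = []
    for name, spec in groups.items():
        workdir = repo_root / spec["workdir"]
        if workdir.is_dir():
            continue
        if spec.get("optional", False):
            # Optional groups are skipped when their path is absent.
            continue
        errors.append(f"group '{name}': workdir does not exist: {workdir}")
    return errors
```

Deleting the stale group removes the error outright, while marking it optional only suppresses it, so deletion is the cleaner fit once a directory is gone for good.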

harini-venkataraman and others added 2 commits June 12, 2026 14:42
The prompt-service directory was deleted in the decommission PR, but
the test rig groups.yaml still referenced it, causing CI to fail with
"workdir does not exist" during validate and integration steps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve conflicts:
- docker/docker-compose.yaml: keep prompt-service removal (PR intent)
- prompt-service/uv.lock: delete (entire prompt-service is decommissioned)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
harini-venkataraman and others added 4 commits June 29, 2026 18:45
…Zipstack/unstract into feat/phase5-decommission-old-components
prompt-service/ and tools/structure/ are deleted by this PR, so
remove their variables, reset_file calls, and the entire
update_structure_tool_version function from bump_sdk_v0_version.sh.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix tool-text_extractor image name to tool-text-extractor in
  docker-compose.build.yaml to match CI, registry, and cloud naming
- Remove stale tool-structure from run-platform.sh ignore list
- Drop prompt-service from is_retryable_error docstring in retry_utils.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Unstract test results

Per-group results

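| Group | Tier | Passed | Failed | Errors | Skipped | Duration (s) |
|---|---|---|---|---|---|---|
| unit-connectors | unit | 64 | 12 | 0 | 3 | 16.9 |
| unit-core | unit | 0 | 0 | 4 | 0 | 1.2 |
| unit-platform-service | unit | 9 | 0 | 1 | 0 | 1.4 |
| unit-rig | unit | 53 | 0 | 0 | 0 | 3.4 |
| unit-runner | unit | 11 | 0 | 0 | 0 | 3.1 |
| unit-sdk1 | unit | 381 | 0 | 0 | 0 | 18.8 |
| unit-tool-registry | unit | 0 | 0 | 1 | 0 | 1.3 |
| unit-workers | unit | 0 | 0 | 0 | 0 | 18.0 |
| TOTAL | | 518 | 12 | 6 | 3 | 64.1 |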

Critical paths

⚠️ Critical paths not yet covered

  • auth-login — User can log in and obtain a session cookie. (entry: POST /api/v1/auth/login; declared coverage: no groups declared)
  • adapter-register-llm — Register and validate an LLM adapter. (entry: POST /api/v1/adapter/; declared coverage: no groups declared)
  • workflow-create-execute — Create a workflow, configure source+destination, execute, poll, fetch result. (entry: POST /api/v1/workflow/{id}/execute/; declared coverage: e2e-workflow)
  • api-deployment-run — Deploy a workflow as an API, POST a document, receive structured JSON. (entry: POST /deployment/api/{org}/{name}/; declared coverage: e2e-api-deployment)
  • prompt-studio-fetch-response — Prompt Studio: create project, add prompt, run single-pass, get response. (entry: POST /api/v1/prompt-studio/prompt-studio-tool/{id}/fetch_response/; declared coverage: e2e-prompt-studio)
  • pipeline-etl-execute — Run an ETL pipeline from source connector to destination. (entry: POST /api/v1/pipeline/{id}/execute/; declared coverage: no groups declared)
  • usage-token-tracking — Per-execution token usage is recorded and retrievable. (entry: GET /api/v1/usage/get_token_usage/; declared coverage: no groups declared)
  • workflow-execution-fan-out — Multi-file workflow execution fans out to file-processing workers and rejoins. (entry: internal: backend → rabbitmq → workers/file_processing; declared coverage: no groups declared)
  • callback-result-delivery — Async results are posted back via the callback worker. (entry: internal: workers/callback → backend /internal endpoints; declared coverage: no groups declared)
✅ Covered critical paths
  • tool-sandbox-exec — covered by unit-runner

@harini-venkataraman harini-venkataraman merged commit c830e10 into main Jun 29, 2026
10 checks passed
@harini-venkataraman harini-venkataraman deleted the feat/phase5-decommission-old-components branch June 29, 2026 14:16
7 participants