feat: Offload large llm response#234

Draft
andrii-novikov wants to merge 8 commits into development from feat/large-llm-response

Conversation

@andrii-novikov
Contributor

Applicable issues

  • fixes #

Description of changes

Checklist

  • Title of the pull request follows Conventional Commits specification
  • Design document is updated/created and approved by the team (if applicable)
  • Documentation is updated/created (if applicable)
  • Changes are tested on review environment
  • App schema changes are backward compatible, or breaking changes are documented with a migration guide
  • Integration tests pass

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@andrii-novikov andrii-novikov force-pushed the feat/large-llm-response branch from c7ad55b to 1cf1a29 Compare April 21, 2026 07:48
Introduces CompletionResultProcessor extension point and
LargeTextResponseProcessor as the first consumer, with accompanying
read_file_lines / read_file_chars / search_in_file internal tools.
Drop content-type allow-list for v1. Any response whose content exceeds
the size threshold is offloaded regardless of type; content-type-based
routing is explicitly deferred to future work.
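A minimal sketch of the extension point and its first consumer as described above. All names and the threshold value are illustrative assumptions (the actual PR code is not shown here); `_offload_to_file` stands in for whatever file service the real processor uses.

```python
from abc import ABC, abstractmethod

# Assumed configurable size threshold; the real value lives in config.
OFFLOAD_THRESHOLD_CHARS = 16_000


class CompletionResultProcessor(ABC):
    """Extension point applied to each completion result before it is returned."""

    order: int = 0  # processors run in ascending order

    @abstractmethod
    def process(self, content: str) -> str: ...


class LargeResponseProcessor(CompletionResultProcessor):
    """Offloads any response above the threshold, regardless of content type."""

    def process(self, content: str) -> str:
        if len(content) <= OFFLOAD_THRESHOLD_CHARS:
            return content
        file_ref = self._offload_to_file(content)
        return (
            f"Response ({len(content)} chars) was offloaded to {file_ref}. "
            "Use read_file_lines / search_in_file to inspect it."
        )

    def _offload_to_file(self, content: str) -> str:
        # Stub: in the real feature this would persist via the file service.
        return "files/offloaded-response.txt"
```

The key v1 decision is visible in `process`: the only routing criterion is size, with no content-type check.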

Also renames LargeTextResponseProcessor to LargeResponseProcessor to
reflect the broader scope.
- Rename module to completion_result_offload (env COMPLETION_RESULT_OFFLOAD__)
- Move text-file tools to internal_tooling/text_file_tooling/ per project convention
- Preview-gate the whole feature behind ENABLE_PREVIEW_FEATURES
- Sort processors once in ToolExecutor constructor (not per execute call)
- Add Mermaid diagrams: pipeline overview, algorithm flowchart, offload and read-back sequences
- Add UC-5 for request-scoped DialFileService cache reuse
- Note hard-limit/truncation/pagination alternatives under UC-3
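The "sort processors once in the `ToolExecutor` constructor" item above might look like the following sketch (class shape and `order` attribute are assumptions, not the actual PR code):

```python
class ToolExecutor:
    def __init__(self, processors):
        # Sort by ascending `order` once at construction time,
        # instead of re-sorting on every execute() call.
        self._processors = sorted(processors, key=lambda p: p.order)

    def execute(self, content: str) -> str:
        # Run each processor over the result in its fixed order.
        for processor in self._processors:
            content = processor.process(content)
        return content
```

Since the processor set is fixed for the executor's lifetime, sorting per call would be pure waste; hoisting it into the constructor makes the ordering a one-time cost.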
Tools are tightly coupled to LargeResponseProcessor (they only exist to
consume files it produces), so they live alongside the processor rather
than under internal_tooling/ — keeps the whole feature in one place.
Component 4 still said tools were registered via InternalToolModule,
which contradicted the decision to keep them co-located inside
completion_result_offload/. Fix the wording to match.


- Replace all CompletionResult* with ToolCallResult* to match code
- Move text-file tools to separate text_file_tooling module
- Drop read_file_chars (lines-only reading; char offsets unusable for LLMs)
- Add Design Decisions section: two-tool rationale (grep + read_file_lines)
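The two-tool split above (grep-style search plus line-based reading) could be sketched as below. These functions operate on a plain string for simplicity; in the real feature they would read the offloaded file, and all signatures here are illustrative assumptions.

```python
import re


def read_file_lines(text: str, start_line: int, end_line: int) -> str:
    """Return lines start_line..end_line (1-based, inclusive)."""
    lines = text.splitlines()
    return "\n".join(lines[start_line - 1 : end_line])


def search_in_file(text: str, pattern: str) -> list[tuple[int, str]]:
    """Grep-style search: (line_number, line) pairs whose line matches the regex."""
    regex = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]
```

The pairing matters: `search_in_file` returns line numbers that feed directly into `read_file_lines`, which is why line offsets (not character offsets) are the shared addressing scheme.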
@andrii-novikov andrii-novikov force-pushed the feat/large-llm-response branch from 1cf1a29 to a880098 Compare April 21, 2026 07:49
