feat: Offload large LLM response #234
Draft
andrii-novikov wants to merge 8 commits into development
c7ad55b to 1cf1a29
Introduces CompletionResultProcessor extension point and LargeTextResponseProcessor as the first consumer, with accompanying read_file_lines / read_file_chars / search_in_file internal tools.
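The three read-back tools named above could behave roughly as follows. This is a minimal sketch only: the parameter names, the 1-based line numbering, and the choice to operate on an in-memory string instead of a stored file are assumptions for illustration, not the signatures used in this PR.

```python
import re
from typing import List, Tuple


def read_file_lines(text: str, start_line: int, line_count: int) -> List[str]:
    """Return up to `line_count` lines starting at 1-based `start_line` (assumed convention)."""
    lines = text.splitlines()
    begin = max(start_line - 1, 0)
    return lines[begin:begin + max(line_count, 0)]


def read_file_chars(text: str, offset: int, length: int) -> str:
    """Return up to `length` characters starting at 0-based `offset` (assumed convention)."""
    begin = max(offset, 0)
    return text[begin:begin + max(length, 0)]


def search_in_file(text: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for every line matching the regex `pattern`."""
    regex = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]
```

All three return bounded slices, so a model can page through an offloaded response without pulling the whole file back into context.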
Drop content-type allow-list for v1. Any response whose content exceeds the size threshold is offloaded regardless of type; content-type-based routing is now explicit future work. Also renames LargeTextResponseProcessor to LargeResponseProcessor to reflect the broader scope.
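A sketch of the size-only rule described above, assuming the threshold is measured in characters and that offloaded content goes to a simple key-value store. `CompletionResult`, `max_chars`, `store`, and the file-id format are hypothetical names for illustration; the real processor may measure size differently and writes through the project's file service.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


@dataclass
class CompletionResult:
    # Hypothetical result shape: just the content and its declared type.
    content: str
    content_type: str = "text/plain"


class CompletionResultProcessor(Protocol):
    # Assumed shape of the extension point: take a result, return a result.
    def process(self, result: CompletionResult) -> CompletionResult: ...


@dataclass
class LargeResponseProcessor:
    max_chars: int  # assumed unit: characters
    store: Dict[str, str] = field(default_factory=dict)  # stand-in for file storage

    def process(self, result: CompletionResult) -> CompletionResult:
        # Content type is deliberately ignored: only size decides.
        if len(result.content) <= self.max_chars:
            return result
        file_id = f"offloaded-{len(self.store) + 1}"
        self.store[file_id] = result.content
        notice = (
            f"Response of {len(result.content)} characters "
            f"was offloaded to '{file_id}'."
        )
        return CompletionResult(content=notice, content_type="text/plain")
```

Because the check never looks at `content_type`, a large JSON or binary-encoded payload is offloaded the same way as large plain text, which is the v1 behaviour this commit describes.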
- Rename module to completion_result_offload (env COMPLETION_RESULT_OFFLOAD__)
- Move text-file tools to internal_tooling/text_file_tooling/ per project convention
- Preview-gate the whole feature behind ENABLE_PREVIEW_FEATURES
- Sort processors once in ToolExecutor constructor (not per execute call)
- Add Mermaid diagrams: pipeline overview, algorithm flowchart, offload and read-back sequences
- Add UC-5 for request-scoped DialFileService cache reuse
- Note hard-limit/truncation/pagination alternatives under UC-3
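The preview gate and the sort-once change from the list above can be sketched together. This is an illustration under assumptions: the `(order, transform)` processor shape, the accepted truthy values for `ENABLE_PREVIEW_FEATURES`, and the `env` constructor parameter are hypothetical and not taken from the PR.

```python
import os
from typing import Callable, Mapping, Sequence, Tuple

# Hypothetical processor shape: (order, transform applied to the raw result).
Processor = Tuple[int, Callable[[str], str]]


def preview_enabled(env: Mapping[str, str]) -> bool:
    # Assumed parsing of the flag; the project may read it through its settings layer.
    return env.get("ENABLE_PREVIEW_FEATURES", "").strip().lower() in {"1", "true", "yes"}


class ToolExecutor:
    def __init__(
        self,
        processors: Sequence[Processor],
        env: Mapping[str, str] = os.environ,
    ) -> None:
        # Gate the whole feature: with the flag off, no processor is registered.
        active = processors if preview_enabled(env) else ()
        # Sort once here so execute() never pays for sorting.
        self._processors = sorted(active, key=lambda p: p[0])

    def execute(self, raw: str) -> str:
        result = raw
        for _, transform in self._processors:
            result = transform(result)
        return result
```

Sorting in the constructor keeps `execute` a plain loop over an already ordered list, and the gate means a deployment without the flag behaves exactly as it did before the feature existed.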
Tools are tightly coupled to LargeResponseProcessor (they only exist to consume files it produces), so they live alongside the processor rather than under internal_tooling/ — keeps the whole feature in one place.
Component 4 still said tools were registered via InternalToolModule, which contradicted the decision to keep them co-located inside completion_result_offload/. Fix the wording to match.
1cf1a29 to a880098
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.