fix(antigravity): preserve finish_reason tool_calls across streaming chunks #874
Conversation
When streaming responses with tool calls, the finish_reason was being overwritten. The upstream sends functionCall in chunk 1, then finishReason: STOP in chunk 2. The old code set finish_reason from every chunk, so "tool_calls" was overwritten by "stop". This broke clients like Claude Code that rely on finish_reason to detect when tool calls are complete.

Changes:
- Add SawToolCall bool to track tool calls across the entire stream
- Add UpstreamFinishReason to cache the finish reason
- Only emit finish_reason on the final chunk (which has both finishReason and usage)
- Priority: tool_calls > max_tokens > stop

Includes 5 unit tests covering:
- Tool calls not overwritten by a subsequent STOP
- Normal text gets "stop"
- MAX_TOKENS without tool calls gets "max_tokens"
- Tool calls take priority over MAX_TOKENS
- Intermediate chunks have no finish_reason

Fixes streaming tool call detection for Claude Code + Gemini models.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
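As a rough illustration of the final-chunk detection described above: a minimal sketch assuming the gjson-based parsing the translator already uses; the helper name `isFinalChunk` and its exact placement are illustrative, not taken verbatim from this PR.

```go
// Assumes github.com/tidwall/gjson, which the translator already imports.
// isFinalChunk (illustrative name) reports whether an upstream chunk is the
// last one of the stream: per the description above, only the final chunk
// carries both a finishReason and usageMetadata.
func isFinalChunk(rawJSON []byte) bool {
	finish := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason")
	usage := gjson.GetBytes(rawJSON, "response.usageMetadata")
	return finish.Exists() && usage.Exists()
}
```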
Code Review
This pull request correctly fixes a bug where the finish_reason for tool calls was being overwritten in streaming responses. The approach of using state variables (SawToolCall, UpstreamFinishReason) to persist information across chunks is sound. The logic to determine the final finish_reason is only executed on the final chunk and correctly prioritizes tool_calls. The addition of comprehensive unit tests is excellent and thoroughly covers the fix and related edge cases. I've included a couple of suggestions to enhance code readability and maintainability.
if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
	template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
	template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
	(*param).(*convertCliResponseToOpenAIChatParams).UpstreamFinishReason = strings.ToUpper(finishReasonResult.String())
}
The repeated type assertion (*param).(*convertCliResponseToOpenAIChatParams) is used in several places (here, line 141, etc.), which can make the code harder to read. You've already extracted this into a params variable on line 194. It would be cleaner to perform the type assertion once at the beginning of the function, right after the *param nil check, and reuse the resulting variable throughout the function. This would improve readability and reduce redundancy.
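A minimal sketch of that refactor, reusing the identifiers visible in the diff above; the not-ok fallback is hypothetical and stands in for whatever the function's existing nil/initialization handling does:

```go
// Assert the concrete params type once, right after the existing *param nil check,
// and reuse the typed pointer for the rest of the function.
params, ok := (*param).(*convertCliResponseToOpenAIChatParams)
if !ok {
	// Hypothetical guard: bail out (or initialize fresh state) if the param
	// has not been populated by an earlier chunk.
	return nil
}

if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
	template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
	template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
	params.UpstreamFinishReason = strings.ToUpper(finishReasonResult.String())
}
```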
func TestFinishReasonToolCallsNotOverwritten(t *testing.T) {
	ctx := context.Background()
	var param any

	// Chunk 1: Contains functionCall - should set SawToolCall = true
	chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"functionCall":{"name":"list_files","args":{"path":"."}}}]}}]}}`)
	result1 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

	// Verify chunk1 has no finish_reason (null)
	if len(result1) != 1 {
		t.Fatalf("Expected 1 result from chunk1, got %d", len(result1))
	}
	fr1 := gjson.Get(result1[0], "choices.0.finish_reason")
	if fr1.Exists() && fr1.String() != "" && fr1.Type.String() != "Null" {
		t.Errorf("Expected finish_reason to be null in chunk1, got: %v", fr1.String())
	}

	// Chunk 2: Contains finishReason STOP + usage (final chunk, no functionCall)
	// This simulates what the upstream sends AFTER the tool call chunk
	chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":20,"totalTokenCount":30}}}`)
	result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

	// Verify chunk2 has finish_reason: "tool_calls" (not "stop")
	if len(result2) != 1 {
		t.Fatalf("Expected 1 result from chunk2, got %d", len(result2))
	}
	fr2 := gjson.Get(result2[0], "choices.0.finish_reason").String()
	if fr2 != "tool_calls" {
		t.Errorf("Expected finish_reason 'tool_calls', got: %s", fr2)
	}

	// Verify native_finish_reason is lowercase upstream value
	nfr2 := gjson.Get(result2[0], "choices.0.native_finish_reason").String()
	if nfr2 != "stop" {
		t.Errorf("Expected native_finish_reason 'stop', got: %s", nfr2)
	}
}

func TestFinishReasonStopForNormalText(t *testing.T) {
	ctx := context.Background()
	var param any

	// Chunk 1: Text content only
	chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"text":"Hello world"}]}}]}}`)
	ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

	// Chunk 2: Final chunk with STOP
	chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":5,"totalTokenCount":15}}}`)
	result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

	// Verify finish_reason is "stop" (no tool calls were made)
	fr := gjson.Get(result2[0], "choices.0.finish_reason").String()
	if fr != "stop" {
		t.Errorf("Expected finish_reason 'stop', got: %s", fr)
	}
}

func TestFinishReasonMaxTokens(t *testing.T) {
	ctx := context.Background()
	var param any

	// Chunk 1: Text content
	chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"text":"Hello"}]}}]}}`)
	ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

	// Chunk 2: Final chunk with MAX_TOKENS
	chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"MAX_TOKENS"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":100,"totalTokenCount":110}}}`)
	result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

	// Verify finish_reason is "max_tokens"
	fr := gjson.Get(result2[0], "choices.0.finish_reason").String()
	if fr != "max_tokens" {
		t.Errorf("Expected finish_reason 'max_tokens', got: %s", fr)
	}
}

func TestToolCallTakesPriorityOverMaxTokens(t *testing.T) {
	ctx := context.Background()
	var param any

	// Chunk 1: Contains functionCall
	chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"functionCall":{"name":"test","args":{}}}]}}]}}`)
	ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

	// Chunk 2: Final chunk with MAX_TOKENS (but we had a tool call, so tool_calls should win)
	chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"MAX_TOKENS"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":100,"totalTokenCount":110}}}`)
	result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

	// Verify finish_reason is "tool_calls" (takes priority over max_tokens)
	fr := gjson.Get(result2[0], "choices.0.finish_reason").String()
	if fr != "tool_calls" {
		t.Errorf("Expected finish_reason 'tool_calls', got: %s", fr)
	}
}
These tests are great and cover all the necessary scenarios. To improve maintainability and reduce code duplication, consider refactoring them into a single table-driven test. Each entry in the table could represent a test case with a name, a sequence of input chunks, and the expected final finish_reason and native_finish_reason. This would make the test file more concise and make it easier to add new test cases in the future.
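A minimal sketch of that refactor, calling ConvertAntigravityResponseToOpenAI the same way the existing tests do; the package name and the particular table entries are illustrative:

```go
package chatcompletions // hypothetical name; use the package of the existing test file

import (
	"context"
	"testing"

	"github.com/tidwall/gjson"
)

func TestFinishReasonStreaming(t *testing.T) {
	cases := []struct {
		name             string
		chunks           []string
		wantFinishReason string
	}{
		{
			name: "tool call not overwritten by STOP",
			chunks: []string{
				`{"response":{"candidates":[{"content":{"parts":[{"functionCall":{"name":"list_files","args":{"path":"."}}}]}}]}}`,
				`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":20,"totalTokenCount":30}}}`,
			},
			wantFinishReason: "tool_calls",
		},
		{
			name: "normal text gets stop",
			chunks: []string{
				`{"response":{"candidates":[{"content":{"parts":[{"text":"Hello world"}]}}]}}`,
				`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":5,"totalTokenCount":15}}}`,
			},
			wantFinishReason: "stop",
		},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			ctx := context.Background()
			var param any
			var last []string
			// Feed every chunk through the converter, keeping only the final output.
			for _, chunk := range tc.chunks {
				last = ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, []byte(chunk), &param)
			}
			got := gjson.Get(last[0], "choices.0.finish_reason").String()
			if got != tc.wantFinishReason {
				t.Errorf("finish_reason = %q, want %q", got, tc.wantFinishReason)
			}
		})
	}
}
```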
Cherry-picked from upstream PR router-for-me#874. Fixes incorrect finish_reason reporting during streaming for tool calls with Gemini.
Merged content:
- router-for-me#874: fix finish_reason tool_calls being overwritten in antigravity streaming responses
- feat/usage-statistics-persistence: add Kiro and GitHub Copilot support and usage-statistics persistence

Additional fixes:
- Add the missing 'encoding/hex' import in auth_files.go
- Pre-create usage_stats.json in the Dockerfile and set write permissions
- Add a usage_stats.json volume mount to docker-compose.yml

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Summary
Fixes streaming tool call detection for clients like Claude Code when using Gemini models via the antigravity translator.
The Problem:
When streaming responses with tool calls, the finish_reason was being incorrectly overwritten:
- Chunk with functionCall → should set finish_reason: "tool_calls"
- Chunk with finishReason: "STOP" + usage → overwrites it to finish_reason: "stop" ❌
Clients like Claude Code see the final "stop" and think the conversation ended normally, breaking the tool call flow.
Root Cause:
The hasFunctionCall variable was local to each chunk's processing - no memory across chunks. When chunk 2 has finishReason: "STOP" but no functionCall, it overwrites the previous correct value.
The Fix:
- Add SawToolCall bool to track tool calls across the entire stream
- Add UpstreamFinishReason string to cache the upstream finish reason
- Only emit finish_reason on the final chunk (detected by the presence of both finishReason AND usageMetadata)
- Priority: tool_calls > max_tokens > stop
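A minimal sketch of how that state and priority order could look, assuming the SawToolCall and UpstreamFinishReason fields named above; the struct's other fields, the helper name, and the surrounding streaming loop are omitted or illustrative.

```go
// Streaming state carried across chunks (other fields of the real struct omitted).
type convertCliResponseToOpenAIChatParams struct {
	SawToolCall          bool   // set once any chunk contains a functionCall
	UpstreamFinishReason string // cached upstream finishReason, e.g. "STOP" or "MAX_TOKENS"
}

// finalFinishReason (illustrative name) applies the priority
// tool_calls > max_tokens > stop once the final chunk has been seen.
func finalFinishReason(p *convertCliResponseToOpenAIChatParams) string {
	switch {
	case p.SawToolCall:
		return "tool_calls"
	case p.UpstreamFinishReason == "MAX_TOKENS":
		return "max_tokens"
	default:
		return "stop"
	}
}
```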
Changes
- internal/translator/antigravity/openai/chat-completions/antigravity_openai_response.go: state tracking and final-chunk logic
- internal/translator/antigravity/openai/chat-completions/antigravity_openai_response_test.go: 5 unit tests (new file)
Test Plan
- New unit tests pass (go test ./internal/translator/antigravity/openai/chat-completions/...)
- Normal text responses still get finish_reason: "stop"
- MAX_TOKENS responses get finish_reason: "max_tokens"
Test Results

| Case | Expected finish_reason | Actual |
| --- | --- | --- |
| Tool calls | "tool_calls" | "tool_calls" |
| Normal text | "stop" | "stop" |
| MAX_TOKENS | "max_tokens" | "max_tokens" |
| Intermediate chunks | null | null |

Related Issues
This addresses the streaming tool call issues reported when using Claude Code with Gemini models through CLIProxyAPI.
🤖 Generated with Claude Code