
Conversation

@MohammadErfan-Jabbari

Summary

Fixes streaming tool call detection for clients like Claude Code when using Gemini models via the antigravity translator.

The Problem:
When streaming responses with tool calls, the finish_reason was being incorrectly overwritten:

  1. Chunk 1 contains functionCall → should set finish_reason: "tool_calls"
  2. Chunk 2 contains finishReason: "STOP" + usage → overwrites to finish_reason: "stop"

Clients like Claude Code see the final "stop" and think the conversation ended normally, breaking the tool call flow.

Root Cause:
The hasFunctionCall variable was local to each chunk's processing, so no state was carried across chunks. When chunk 2 arrived with finishReason: "STOP" but no functionCall, it overwrote the previously correct value.

The Fix:

  • Add SawToolCall bool to track tool calls across the entire stream
  • Add UpstreamFinishReason string to cache the finish reason
  • Only emit finish_reason on the final chunk (detected by presence of both finishReason AND usageMetadata)
  • Apply priority: tool_calls > max_tokens > stop
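Sketched out, the end-of-stream logic is roughly the following (field names match the bullets above; the helper function is illustrative only, and the actual translator works on gjson/sjson JSON strings, so this is simplified):

// Simplified sketch of the new state and priority logic described above.
// Other existing fields of the params struct are omitted.
type convertCliResponseToOpenAIChatParams struct {
    SawToolCall          bool   // set once any chunk contains a functionCall
    UpstreamFinishReason string // cached upstream finishReason, e.g. "STOP" or "MAX_TOKENS"
}

// finalFinishReason (hypothetical helper) is only evaluated on the final chunk,
// i.e. the one carrying both finishReason and usageMetadata, and applies the
// priority tool_calls > max_tokens > stop.
func finalFinishReason(p *convertCliResponseToOpenAIChatParams) string {
    switch {
    case p.SawToolCall:
        return "tool_calls"
    case p.UpstreamFinishReason == "MAX_TOKENS":
        return "max_tokens"
    default:
        return "stop"
    }
}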

Changes

  • internal/translator/antigravity/openai/chat-completions/antigravity_openai_response.go: State tracking and final chunk logic
  • internal/translator/antigravity/openai/chat-completions/antigravity_openai_response_test.go: 5 unit tests (new file)

Test Plan

  • Unit tests pass (go test ./internal/translator/antigravity/openai/chat-completions/...)
  • Manual testing with Claude Code + Gemini models via local server
  • Verified tool calls complete successfully
  • Verified normal text responses still get finish_reason: "stop"
  • Verified MAX_TOKENS responses get finish_reason: "max_tokens"

Test Results

| Scenario | Expected | Actual |
| --- | --- | --- |
| Tool call stream | "tool_calls" | "tool_calls" |
| Normal text stream | "stop" | "stop" |
| MAX_TOKENS | "max_tokens" | "max_tokens" |
| Intermediate chunks | null | null |

Related Issues

This addresses the streaming tool call issues reported when using Claude Code with Gemini models through CLIProxyAPI.

🤖 Generated with Claude Code

…chunks

When streaming responses with tool calls, the finish_reason was being
overwritten. The upstream sends functionCall in chunk 1, then
finishReason: STOP in chunk 2. The old code would set finish_reason
from every chunk, causing "tool_calls" to be overwritten by "stop".

This broke clients like Claude Code that rely on finish_reason to
detect when tool calls are complete.

Changes:
- Add SawToolCall bool to track tool calls across entire stream
- Add UpstreamFinishReason to cache the finish reason
- Only emit finish_reason on final chunk (has both finishReason + usage)
- Priority: tool_calls > max_tokens > stop

Includes 5 unit tests covering:
- Tool calls not overwritten by subsequent STOP
- Normal text gets "stop"
- MAX_TOKENS without tool calls gets "max_tokens"
- Tool calls take priority over MAX_TOKENS
- Intermediate chunks have no finish_reason

Fixes streaming tool call detection for Claude Code + Gemini models.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@gemini-code-assist
Contributor

Summary of Changes

Hello @MohammadErfan-Jabbari, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical bug in the antigravity translator that led to incorrect finish_reason reporting during streaming responses, particularly impacting tool call detection for clients like Claude Code when using Gemini models. The solution involves enhancing state management across streaming chunks and refining the logic for determining and emitting the final finish_reason with a defined priority, ensuring robust and accurate API translation.

Highlights

  • Corrected Finish Reason Logic: The pull request fixes an issue where the finish_reason was incorrectly overwritten during streaming responses, especially for tool calls, causing clients like Claude Code to misinterpret the stream's conclusion.
  • State Tracking Across Chunks: Introduced SawToolCall and UpstreamFinishReason fields on convertCliResponseToOpenAIChatParams to maintain state across streaming chunks, ensuring accurate finish_reason determination.
  • Delayed Finish Reason Emission: The finish_reason is now only emitted on the final streaming chunk (identified by the presence of both finishReason and usageMetadata), preventing premature or incorrect reporting.
  • Finish Reason Priority: Implemented a priority system for finish_reason: tool_calls takes precedence over max_tokens, which in turn takes precedence over stop.
  • Comprehensive Unit Tests: Added a new test file with 5 unit tests to cover various scenarios, including tool call preservation, normal text, max tokens, priority handling, and intermediate chunk behavior.
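As an illustration of the final-chunk condition mentioned above (the helper name and package are hypothetical; the gjson paths mirror the test fixtures in this PR):

package chatcompletions // illustrative package name

import "github.com/tidwall/gjson"

// isFinalChunk reports whether an upstream chunk carries both a finishReason
// and usageMetadata, the condition this PR waits for before emitting
// finish_reason. Hypothetical helper; paths follow the fixtures in the tests.
func isFinalChunk(rawJSON []byte) bool {
    finish := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason")
    usage := gjson.GetBytes(rawJSON, "response.usageMetadata")
    return finish.Exists() && usage.Exists()
}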



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes a bug where the finish_reason for tool calls was being overwritten in streaming responses. The approach of using state variables (SawToolCall, UpstreamFinishReason) to persist information across chunks is sound. The logic to determine the final finish_reason is only executed on the final chunk and correctly prioritizes tool_calls. The addition of comprehensive unit tests is excellent and thoroughly covers the fix and related edge cases. I've included a couple of suggestions to enhance code readability and maintainability.

Comment on lines 84 to 86
if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
    template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
    template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
    (*param).(*convertCliResponseToOpenAIChatParams).UpstreamFinishReason = strings.ToUpper(finishReasonResult.String())
}

medium

The repeated type assertion (*param).(*convertCliResponseToOpenAIChatParams) is used in several places (here, line 141, etc.), which can make the code harder to read. You've already extracted this into a params variable on line 194. It would be cleaner to perform the type assertion once at the beginning of the function, right after the *param nil check, and reuse the resulting variable throughout the function. This would improve readability and reduce redundancy.
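For illustration, the suggested shape looks like this (the nil-check branch and its return value are placeholders for whatever error handling the function already has; the point is reusing the asserted params value):

// Assert the concrete params type once, right after the existing *param nil
// check, and reuse the result instead of repeating the assertion per field.
params, ok := (*param).(*convertCliResponseToOpenAIChatParams)
if !ok {
    return nil // placeholder: keep the function's existing error handling here
}

if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
    template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
    template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
    params.UpstreamFinishReason = strings.ToUpper(finishReasonResult.String())
}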

Comment on lines +10 to +103
func TestFinishReasonToolCallsNotOverwritten(t *testing.T) {
    ctx := context.Background()
    var param any

    // Chunk 1: Contains functionCall - should set SawToolCall = true
    chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"functionCall":{"name":"list_files","args":{"path":"."}}}]}}]}}`)
    result1 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

    // Verify chunk1 has no finish_reason (null)
    if len(result1) != 1 {
        t.Fatalf("Expected 1 result from chunk1, got %d", len(result1))
    }
    fr1 := gjson.Get(result1[0], "choices.0.finish_reason")
    if fr1.Exists() && fr1.String() != "" && fr1.Type.String() != "Null" {
        t.Errorf("Expected finish_reason to be null in chunk1, got: %v", fr1.String())
    }

    // Chunk 2: Contains finishReason STOP + usage (final chunk, no functionCall)
    // This simulates what the upstream sends AFTER the tool call chunk
    chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":20,"totalTokenCount":30}}}`)
    result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

    // Verify chunk2 has finish_reason: "tool_calls" (not "stop")
    if len(result2) != 1 {
        t.Fatalf("Expected 1 result from chunk2, got %d", len(result2))
    }
    fr2 := gjson.Get(result2[0], "choices.0.finish_reason").String()
    if fr2 != "tool_calls" {
        t.Errorf("Expected finish_reason 'tool_calls', got: %s", fr2)
    }

    // Verify native_finish_reason is lowercase upstream value
    nfr2 := gjson.Get(result2[0], "choices.0.native_finish_reason").String()
    if nfr2 != "stop" {
        t.Errorf("Expected native_finish_reason 'stop', got: %s", nfr2)
    }
}

func TestFinishReasonStopForNormalText(t *testing.T) {
    ctx := context.Background()
    var param any

    // Chunk 1: Text content only
    chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"text":"Hello world"}]}}]}}`)
    ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

    // Chunk 2: Final chunk with STOP
    chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":5,"totalTokenCount":15}}}`)
    result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

    // Verify finish_reason is "stop" (no tool calls were made)
    fr := gjson.Get(result2[0], "choices.0.finish_reason").String()
    if fr != "stop" {
        t.Errorf("Expected finish_reason 'stop', got: %s", fr)
    }
}

func TestFinishReasonMaxTokens(t *testing.T) {
    ctx := context.Background()
    var param any

    // Chunk 1: Text content
    chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"text":"Hello"}]}}]}}`)
    ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

    // Chunk 2: Final chunk with MAX_TOKENS
    chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"MAX_TOKENS"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":100,"totalTokenCount":110}}}`)
    result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

    // Verify finish_reason is "max_tokens"
    fr := gjson.Get(result2[0], "choices.0.finish_reason").String()
    if fr != "max_tokens" {
        t.Errorf("Expected finish_reason 'max_tokens', got: %s", fr)
    }
}

func TestToolCallTakesPriorityOverMaxTokens(t *testing.T) {
    ctx := context.Background()
    var param any

    // Chunk 1: Contains functionCall
    chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"functionCall":{"name":"test","args":{}}}]}}]}}`)
    ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk1, &param)

    // Chunk 2: Final chunk with MAX_TOKENS (but we had a tool call, so tool_calls should win)
    chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"MAX_TOKENS"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":100,"totalTokenCount":110}}}`)
    result2 := ConvertAntigravityResponseToOpenAI(ctx, "model", nil, nil, chunk2, &param)

    // Verify finish_reason is "tool_calls" (takes priority over max_tokens)
    fr := gjson.Get(result2[0], "choices.0.finish_reason").String()
    if fr != "tool_calls" {
        t.Errorf("Expected finish_reason 'tool_calls', got: %s", fr)
    }
}

medium

These tests are great and cover all the necessary scenarios. To improve maintainability and reduce code duplication, consider refactoring them into a single table-driven test. Each entry in the table could represent a test case with a name, a sequence of input chunks, and the expected final finish_reason and native_finish_reason. This would make the test file more concise and adding new test cases in the future would be easier.
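For example, a rough table-driven shape could look like this (the chunk payload variables such as toolCallChunk are placeholders for the JSON fixtures already used above, and the []string return type is inferred from the existing tests):

func TestFinishReason(t *testing.T) {
    cases := []struct {
        name       string
        chunks     [][]byte // upstream chunks fed to the converter in order
        wantFinish string   // expected finish_reason on the final chunk
    }{
        {"tool call then STOP", [][]byte{toolCallChunk, stopFinalChunk}, "tool_calls"},
        {"normal text then STOP", [][]byte{textChunk, stopFinalChunk}, "stop"},
        {"MAX_TOKENS without tool call", [][]byte{textChunk, maxTokensFinalChunk}, "max_tokens"},
        {"tool call beats MAX_TOKENS", [][]byte{toolCallChunk, maxTokensFinalChunk}, "tool_calls"},
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            var param any
            var results []string
            for _, chunk := range tc.chunks {
                results = ConvertAntigravityResponseToOpenAI(context.Background(), "model", nil, nil, chunk, &param)
            }
            if got := gjson.Get(results[0], "choices.0.finish_reason").String(); got != tc.wantFinish {
                t.Errorf("finish_reason = %q, want %q", got, tc.wantFinish)
            }
        })
    }
}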

Ptah-CT pushed a commit to Ptah-CT/CLIProxyAPIPlusPlus that referenced this pull request Jan 8, 2026
…chunks

Cherry-picked from upstream PR router-for-me#874
Fixes incorrect finish_reason reporting during streaming for tool calls with Gemini
piexian added a commit to piexian/CLIProxyAPI that referenced this pull request Jan 20, 2026
Merged content:
- router-for-me#874: Fix finish_reason "tool_calls" being overwritten in antigravity streaming responses
- feat/usage-statistics-persistence: Add Kiro and GitHub Copilot support plus usage statistics persistence

Additional fixes:
- Add the missing 'encoding/hex' import to auth_files.go
- Pre-create usage_stats.json in the Dockerfile and set write permissions
- Add a usage_stats.json volume mount to docker-compose.yml

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>