
fix: tool call indexes in chat completions stream#1982

Merged
akshaydeo merged 1 commit into main from 03-07-fix_tool_call_indexes_in_chat_completions_stream
Mar 12, 2026

Conversation


@TejasGhatte TejasGhatte commented Mar 7, 2026

Summary

Fixed streaming tool call indices for multiple parallel tool calls in chat completions stream. Previously, tool call indices were not properly sequenced (0, 1, 2, ...) when multiple tools were called in parallel during streaming responses.

fixes #1961

Changes

  • Added stream state tracking for tool call indices across all providers (Anthropic, Bedrock, Gemini, Cohere)
  • Implemented AnthropicStreamState, BedrockStreamState, and GeminiStreamState to maintain sequential tool call indexing
  • Updated streaming response handlers to assign proper sequential indices (0, 1, 2, ...) for multiple parallel tool calls
  • Added comprehensive test coverage for streaming multiple tool calls in both Chat Completions and Responses APIs
  • Enhanced test utilities with streaming tool call accumulator and validation functions
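The core idea behind the stream-state changes can be sketched as follows. This is a minimal, illustrative Go sketch of the pattern described above; the names streamState and assignIndex are hypothetical, not the actual AnthropicStreamState/BedrockStreamState/GeminiStreamState types, which live in their provider packages and may differ in detail.

```go
package main

import "fmt"

// streamState tracks, for one streaming response, the mapping from a
// provider's content-block index to a sequential OpenAI-style tool_call index.
type streamState struct {
	nextToolCallIndex         int
	contentBlockToToolCallIdx map[int]int
}

func newStreamState() *streamState {
	return &streamState{contentBlockToToolCallIdx: make(map[int]int)}
}

// assignIndex returns the sequential tool-call index for a content-block
// index, allocating the next index (0, 1, 2, ...) the first time a block is seen.
func (s *streamState) assignIndex(contentBlockIdx int) int {
	if idx, ok := s.contentBlockToToolCallIdx[contentBlockIdx]; ok {
		return idx
	}
	idx := s.nextToolCallIndex
	s.contentBlockToToolCallIdx[contentBlockIdx] = idx
	s.nextToolCallIndex++
	return idx
}

func main() {
	s := newStreamState()
	// A provider may emit tool-use blocks at arbitrary content-block indices
	// (e.g. 1 and 3); the state maps them to sequential tool_call indices 0 and 1.
	fmt.Println(s.assignIndex(1), s.assignIndex(3), s.assignIndex(1))
}
```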

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Run the multiple tool calls streaming tests to validate proper index sequencing:

# Core/Transports
go version
go test ./...

# Specifically test multiple tool calls streaming
go test -v ./core/internal/llmtests -run TestMultipleToolCallsStreaming

The tests validate that:

  • Tool call indices are properly sequenced (0, 1, 2, ...)
  • Multiple tools (weather, calculate) are called in parallel
  • Streaming responses maintain correct tool call metadata
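The index assertion above boils down to checking that accumulated tool-call indices form exactly the sequence 0..n-1. The helper below is a hypothetical sketch of that check, not the repository's actual validateStreamingToolCalls function.

```go
package main

import "fmt"

// checkSequentialIndices verifies that tool-call indices are exactly
// 0, 1, 2, ... in order, with no gaps or duplicates.
func checkSequentialIndices(indices []int) error {
	for i, idx := range indices {
		if idx != i {
			return fmt.Errorf("tool call %d has index %d, want %d (observed %v)", i, idx, i, indices)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkSequentialIndices([]int{0, 1, 2}))
	fmt.Println(checkSequentialIndices([]int{0, 0, 1}))
}
```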

Screenshots/Recordings

N/A - Backend streaming API fix

Breaking changes

  • Yes
  • No

Related issues

Fixes streaming tool call index sequencing for multiple parallel tool calls

Security considerations

No security implications - internal streaming response formatting fix

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

TejasGhatte (Collaborator, Author)

This stack of pull requests is managed by Graphite. Learn more about stacking.


github-actions bot commented Mar 7, 2026

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1982


coderabbitai bot commented Mar 7, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 76f39600-19af-4c92-864b-4b4a2ca49268

📥 Commits

Reviewing files that changed from the base of the PR and between 42806b1 and 9d2b76c.

📒 Files selected for processing (14)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/tool_calls_streaming.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Fixed incorrect indexing of multiple tool calls when they execute in parallel during streaming chat completions, ensuring proper ordering and identification.
  • Tests

    • Expanded test coverage for parallel tool calling scenarios in streaming to validate correct indices, sequencing, and field completeness across supported providers.

Walkthrough

Adds per-stream state objects and threads them into provider streaming paths so tool-call indices are assigned and resolved consistently across streaming chunks; updates provider signatures, test suites, and validation utilities to verify deterministic, incrementing tool-call indices for parallel tool calls.

Changes

  • Changelog entries (core/changelog.md, transports/changelog.md): Add fix entry: streaming tool call indices for multiple parallel tool calls.
  • Streaming tests & utils (core/internal/llmtests/multiple_tool_calls.go, core/internal/llmtests/tool_calls_streaming.go, core/internal/llmtests/utils.go): Add streaming subtests for multiple tool calls, accumulate/validate streamed tool calls with retries/timeouts, make final ordering deterministic (sorted keys), change validateStreamingToolCalls to return an error, and add Index int to ToolCallInfo.
  • Anthropic provider (core/providers/anthropic/chat.go, core/providers/anthropic/anthropic.go): Add AnthropicStreamState and NewAnthropicStreamState(), update the ToBifrostChatCompletionStream signature to accept state and propagate it into chat/responses streaming; emit the resolved tool_call Index; remove internal splitJSONL.
  • Bedrock provider (core/providers/bedrock/chat.go, core/providers/bedrock/bedrock.go): Add BedrockStreamState and NewBedrockStreamState(), change ToBifrostChatCompletionStream to accept state, wire state creation into the stream entry point, and use state to assign/resolve tool-call indices.
  • Gemini provider & tests (core/providers/gemini/chat.go, core/providers/gemini/gemini.go, core/providers/gemini/gemini_test.go): Add GeminiStreamState and NewGeminiStreamState(), change ToBifrostChatCompletionStream to accept state, switch to state-based per-chunk indexing, and update tests to pass the new state.
  • Cohere provider (core/providers/cohere/chat.go): Propagate chunk.Index into the emitted streaming tool_call.Index when present.
  • Provider wiring, misc (core/providers/...; Anthropic, Bedrock, Gemini, etc.): Create per-stream state via New*StreamState() in streaming handlers and pass it into ToBifrostChatCompletionStream conversions for consistent indexing.
  • Test tweak (core/providers/xai/xai_test.go): Updated the VisionModel test string.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Provider as ProviderStream
  participant State as StreamState
  participant Converter as Converter
  participant Bifrost as Bifrost
  Client->>Provider: open streaming request
  Provider->>State: New*StreamState()
  Provider->>Converter: emit chunk (Start/Delta) with ContentBlockIndex
  Converter->>State: lookup or assign toolCallIdx for ContentBlockIndex
  State-->>Converter: toolCallIdx
  Converter->>Bifrost: emit tool_call delta with Index = toolCallIdx
  Bifrost-->>Client: stream chunk with indexed tool_call
  Provider->>Provider: continue emitting chunks...
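The Converter-to-State lookup step in the sequence diagram has a second half: on delta chunks, the converter resolves a previously assigned index and fails closed when no start event was seen. The sketch below is illustrative (resolveDeltaIndex is a hypothetical name, not the provider API), showing the comma-ok lookup the reviews recommend instead of silently defaulting to index 0.

```go
package main

import "fmt"

// resolveDeltaIndex resolves a delta chunk's content-block index to the
// tool-call index assigned at content_block_start. If no start event was
// recorded, it returns ok=false so the caller can skip the chunk rather
// than silently reusing index 0.
func resolveDeltaIndex(state map[int]int, contentBlockIdx int) (int, bool) {
	idx, ok := state[contentBlockIdx]
	if !ok {
		return 0, false
	}
	return idx, true
}

func main() {
	// Mapping built up by earlier start events: blocks 1 and 3 are tool calls.
	state := map[int]int{1: 0, 3: 1}
	if idx, ok := resolveDeltaIndex(state, 3); ok {
		fmt.Println("emit tool_call delta with index", idx)
	}
	if _, ok := resolveDeltaIndex(state, 7); !ok {
		fmt.Println("unknown content block: chunk skipped")
	}
}
```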

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through streams with nimble feet,

numbering calls so each one meets.
Parallel paths no longer clash,
indices tidy—no more mash.
I count, I mark, and send with cheer.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title 'fix: tool call indexes in chat completions stream' clearly and specifically describes the main change: fixing tool call indices in streaming chat completions.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 92.31%, which is sufficient; the required threshold is 80.00%.
  • Description check: ✅ Passed. The PR description follows the template structure with all required sections completed: Summary, Changes, Type of change, Affected areas, How to test, Breaking changes, Related issues, Security considerations, and Checklist. All critical sections are present and adequately filled out.


@TejasGhatte TejasGhatte marked this pull request as ready for review March 7, 2026 12:36

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/internal/llmtests/utils.go (1)

395-400: ⚠️ Potential issue | 🟡 Minor

Populate ToolCallInfo.Index and default missing indices to -1.

ToolCallInfo now documents -1 as “not available”, but ExtractChatToolCalls still builds zero-valued entries and never copies toolCall.Index. That makes “missing index” indistinguishable from the first tool call and weakens the new index-sequencing assertions. Apply the same Index: -1 default in ExtractResponsesToolCalls too.

🛠️ Proposed fix
 type ToolCallInfo struct {
 	Name      string
 	Arguments string
 	ID        string
 	Index     int // OpenAI tool_calls index (0, 1, 2, ...); -1 when not available
 }
@@
-				info := ToolCallInfo{}
+				info := ToolCallInfo{Index: -1}
 				if toolCall.ID != nil {
 					info.ID = *toolCall.ID
 				}
 				if toolCall.Function.Name != nil {
 					info.Name = *toolCall.Function.Name
 				}
 				info.Arguments = toolCall.Function.Arguments
+				info.Index = int(toolCall.Index)
 				toolCalls = append(toolCalls, info)

Also applies to: 507-515

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/utils.go` around lines 395 - 400, ExtractChatToolCalls
and ExtractResponsesToolCalls are creating ToolCallInfo with the zero-valued
Index (0) so missing indices can't be distinguished from the first tool call;
update both functions to initialize Index: -1 by default and, when a parsed
toolCall has an index, copy that value into ToolCallInfo.Index (use the
toolCall.Index or equivalent field from the parsed structure) so absent indices
remain -1 and present indices reflect the original sequence (refer to the
ToolCallInfo type and the ExtractChatToolCalls / ExtractResponsesToolCalls
functions when making the change).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 349-419: The validator currently skips chunks where
response.BifrostResponsesStreamResponse == nil; change this to treat any
non-response chunk as a failure by returning a ResponsesStreamValidationResult
with Passed:false and an error describing the unexpected non-response chunk
(include any available error/metadata from the response if present). Locate the
block handling the receive from responseChannel (the case response, ok :=
<-responseChannel) and replace the silent skip of nil
BifrostResponsesStreamResponse with code that constructs and returns a failing
ResponsesStreamValidationResult (refer to ResponsesStreamValidationResult and
include identifying info from response where possible), leaving the existing
accumulator.AccumulateResponsesToolCall logic unchanged for non-nil
BifrostResponsesStreamResponse.
- Around line 231-266: The stream is created with schemas.NoDeadline so a
stalled provider can hang; wrap the request in a bounded, cancellable context
and ensure you cancel it when done: create a ctx, cancel :=
context.WithTimeout(parentCtx, <sensible timeout>) (or WithCancel if upper-level
deadline exists), pass schemas.NewBifrostContext(ctx, ...) into
client.ChatCompletionStreamRequest inside WithStreamRetry (instead of
schemas.NoDeadline), and call cancel() when you break out of the response loop
(e.g., when responseCount > 500) or after the loop to terminate the producer;
make these changes around the WithStreamRetry call and where responseChannel is
consumed (referencing WithStreamRetry, NewBifrostContext, schemas.NoDeadline,
client.ChatCompletionStreamRequest, responseChannel, responseCount).
- Around line 336-345: The current code creates a streamCtx timeout but still
starts the producer with schemas.NewBifrostContext(ctx, schemas.NoDeadline) so
the underlying ResponsesStreamRequest keeps running after timeout; to fix,
create the streamCtx (context.WithTimeout) before building bfCtx and pass that
streamCtx into schemas.NewBifrostContext when calling
client.ResponsesStreamRequest so the request is cancelled when streamCtx times
out, and ensure you still call cancel() (defer) as shown; apply the same change
to the other occurrence around lines 428-434 (the
WithResponsesStreamValidationRetry wrapper that calls ResponsesStreamRequest).

In `@core/providers/anthropic/chat.go`:
- Around line 685-695: The AnthropicStreamState mapping can be nil or missing
entries which causes panics and silent reuse of index 0; ensure
NewAnthropicStreamState always initialises contentBlockToToolCallIdx (already
done) and add defensive guards in methods that access it (eg. wherever
content_block_start and tool start are handled) to: check if
s.contentBlockToToolCallIdx == nil and allocate a new map before reads/writes,
validate existence of a mapping for a contentBlock key before using it (do not
default to 0), and when a missing content_block_start is detected create a fresh
mapping entry using s.nextToolCallIndex and then increment nextToolCallIndex to
avoid index reuse; update any code paths that read/write
nextToolCallIndex/contentBlockToToolCallIdx to follow this pattern.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5ee33b61-1f23-450d-bb0f-6dc964aa4c3d

📥 Commits

Reviewing files that changed from the base of the PR and between d3cee58 and 530d749.

📒 Files selected for processing (13)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md

@CLAassistant
CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


tejas ghatte does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Have you already signed the CLA but the status is still pending? Let us recheck it.

@TejasGhatte TejasGhatte force-pushed the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch from 530d749 to fc2da80 on March 8, 2026 at 19:39

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/internal/llmtests/utils.go (1)

395-399: ⚠️ Potential issue | 🟠 Major

Initialize ToolCallInfo.Index to -1 when it is unavailable.

The new field is documented as "-1 when not available", but the extractor still builds the zero value. That makes "index missing" indistinguishable from a real first tool call and can mask the sequencing regression this PR is trying to catch. Please default Index to -1 here, and mirror the same default in ExtractResponsesToolCalls.

Suggested fix
-				info := ToolCallInfo{}
+				info := ToolCallInfo{Index: -1}
 				if toolCall.ID != nil {
 					info.ID = *toolCall.ID
 				}
 				if toolCall.Function.Name != nil {
 					info.Name = *toolCall.Function.Name
 				}
 				info.Arguments = toolCall.Function.Arguments

Also applies to: 507-515

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/utils.go` around lines 395 - 399, The
ToolCallInfo.Index must default to -1 so "missing" is distinguishable from the
first tool call; update all places that construct ToolCallInfo (including where
you declare the zero value and in ExtractResponsesToolCalls) to explicitly set
Index: -1 when no index is available, and ensure ExtractResponsesToolCalls
mirrors this behavior so any created ToolCallInfo uses Index -1 unless a real
OpenAI tool_calls index is assigned. Target the ToolCallInfo creation sites and
the ExtractResponsesToolCalls function to set Index to -1 by default.
♻️ Duplicate comments (1)
core/providers/anthropic/chat.go (1)

691-695: ⚠️ Potential issue | 🟠 Major

Defensively initialize AnthropicStreamState at the converter boundary.

ToBifrostChatCompletionStream still assumes state and state.contentBlockToToolCallIdx are usable. A nil state will panic on Line 717, and a zero-value AnthropicStreamState{} will panic on the first map write. The input-delta path also falls back to index 0 when state == nil, which corrupts sequencing instead of failing closed.

Suggested hardening
 func (chunk *AnthropicStreamEvent) ToBifrostChatCompletionStream(ctx *schemas.BifrostContext, structuredOutputToolName string, state *AnthropicStreamState) (*schemas.BifrostChatResponse, *schemas.BifrostError, bool) {
+	if state == nil {
+		state = NewAnthropicStreamState()
+	} else if state.contentBlockToToolCallIdx == nil {
+		state.contentBlockToToolCallIdx = make(map[int]int)
+	}
+
 	switch chunk.Type {
@@
-					var toolCallIdx int
-					var ok bool
-					if state != nil {
-						toolCallIdx, ok = state.contentBlockToToolCallIdx[*chunk.Index]
-						if !ok {
-							return nil, nil, false
-						}
-					}
+					toolCallIdx, ok := state.contentBlockToToolCallIdx[*chunk.Index]
+					if !ok {
+						return nil, nil, false
+					}

Also applies to: 699-719, 796-804

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/chat.go` around lines 691 - 695, To prevent
nil-pointer and map write panics in ToBifrostChatCompletionStream, defensively
handle the AnthropicStreamState at the converter boundary: if the incoming state
is nil, allocate one via NewAnthropicStreamState() (or instantiate
AnthropicStreamState with contentBlockToToolCallIdx initialized) before any
reads/writes; ensure any writes to state.contentBlockToToolCallIdx always happen
after confirming the map is non-nil (initialize it if needed). Also change the
input-delta path that currently falls back to index 0 when state == nil to fail
closed (return an error or stop processing that delta) instead of silently using
index 0 so sequencing cannot be corrupted.
🧹 Nitpick comments (1)
core/providers/bedrock/chat.go (1)

365-372: Consider adding a map existence check for defensive handling.

The map lookup doesn't verify the key exists. If a delta event arrives for a content block that wasn't seen in a start event (edge case), this silently uses index 0. The Anthropic implementation checks ok and returns early (see context snippet 4, lines 800-803).

🛡️ Optional defensive improvement
 			toolCallIdx := 0
 			if chunk.ContentBlockIndex != nil {
-				toolCallIdx = state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]
+				var ok bool
+				toolCallIdx, ok = state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]
+				if !ok {
+					return nil, nil, false
+				}
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/chat.go` around lines 365 - 372, The map lookup for
state.contentBlockToToolCallIdx is not checked for existence, so when
chunk.ContentBlockIndex is set but the key is missing you silently default to 0;
update the logic around chunk.ContentBlockIndex and toolCallIdx in the function
handling deltas (the code that constructs var toolCall
schemas.ChatAssistantMessageToolCall and sets toolCall.Index) to perform a
comma-ok map lookup (e.g., idx, ok :=
state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]) and if the key is
missing handle it defensively — either return/skip processing this delta or
surface an error — rather than falling back to index 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 282-297: The test currently hard-fails when a streaming tool call
has an empty tc.ID; change the strict ID presence check in the loop that builds
toolsFound (and the analogous responses block at the other location) to allow
empty IDs—only require Name and Arguments to be non-empty—and avoid failing on
tc.ID == "". Also update the shared validator validateStreamingToolCalls in
core/internal/llmtests/tool_calls_streaming.go to tolerate nil/empty IDs (use
Name/Arguments/Index for validation instead) so streamed tool calls that omit
IDs but have correct names/arguments/indexes no longer fail; apply the same
relaxation to the other occurrence referenced in the comment.
- Around line 272-298: The test currently never asserts tool-call indices; after
calling accumulator.GetFinalChatToolCalls() iterate finalToolCalls and validate
each ToolCall's Index (tc.Index) is present and that indices form a monotonic
sequential range starting at 0 (e.g., 0..len(finalToolCalls)-1) or at least
strictly increasing without duplicates; if any index is missing, out of order,
duplicated, or negative, call t.Fatalf with a descriptive message. Use the
existing variables (finalToolCalls, tc) and the test harness (t.Fatalf) so this
check runs alongside the existing field and tool-name assertions in the
MultipleToolCallsStreamingChatCompletions test before calling
validateStreamingToolCalls. Ensure the failure message prints the offending
index and the full list of observed indices for debugging.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 291967df-e47a-4db2-9eca-ca4f9300029a

📥 Commits

Reviewing files that changed from the base of the PR and between 530d749 and fc2da80.

📒 Files selected for processing (13)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/gemini/gemini_test.go
  • core/changelog.md
  • transports/changelog.md
  • core/providers/xai/xai_test.go
  • core/providers/cohere/chat.go

@TejasGhatte TejasGhatte force-pushed the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch from fc2da80 to aeaa863 on March 9, 2026 at 03:50

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/anthropic/chat.go (1)

707-714: ⚠️ Potential issue | 🟠 Major

Track structured-output by block index, not just by tool name presence.

Line 778 switches every input_json delta to delta.content whenever structuredOutputToolName is non-empty. If the stream contains the structured-output tool plus a normal tool, the normal tool’s JSON never reaches delta.tool_calls, so its indexed tool-call stream is lost.

Suggested direction
 type AnthropicStreamState struct {
 	nextToolCallIndex         int
 	contentBlockToToolCallIdx map[int]int
+	structuredOutputBlocks    map[int]struct{}
 }
 
 // NewAnthropicStreamState returns an initialised stream state for one streaming response.
 func NewAnthropicStreamState() *AnthropicStreamState {
 	return &AnthropicStreamState{
 		contentBlockToToolCallIdx: make(map[int]int),
+		structuredOutputBlocks:    make(map[int]struct{}),
 	}
 }
@@
 	if chunk.Index != nil && chunk.ContentBlock != nil && chunk.ContentBlock.Type == AnthropicContentBlockTypeToolUse {
 		// Check if this is the structured output tool - if so, skip emitting tool call metadata
 		if structuredOutputToolName != "" && chunk.ContentBlock.Name != nil && *chunk.ContentBlock.Name == structuredOutputToolName {
+			state.structuredOutputBlocks[*chunk.Index] = struct{}{}
 			// Skip emitting tool call for structured output - it will be emitted as content later
 			return nil, nil, false
 		}
@@
 		case AnthropicStreamDeltaTypeInputJSON:
 			// Handle tool use streaming - accumulate partial JSON
 			if chunk.Delta.PartialJSON != nil {
-				if structuredOutputToolName != "" {
+				if _, isStructuredOutput := state.structuredOutputBlocks[*chunk.Index]; isStructuredOutput {
 					// Structured output: stream JSON as content
 					streamResponse := &schemas.BifrostChatResponse{

Also applies to: 775-804

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/chat.go` around lines 707 - 714, The current logic
uses structuredOutputToolName to gate special handling and ends up remapping
every input_json to content when any structured tool name is present; instead
record the structured-output block's index when you detect it in
AnthropicStreamEventTypeContentBlockStart (use chunk.Index and
chunk.ContentBlock.Name) — e.g. set a structuredOutputBlockIndex variable when
*this* block matches structuredOutputToolName — then change subsequent checks
(including where delta.input_json gets mapped to delta.content and where
tool-call emission is skipped) to compare chunk.Index against
structuredOutputBlockIndex rather than just checking structuredOutputToolName
being non-empty; also ensure you clear or reset structuredOutputBlockIndex
appropriately when a block ends so normal tool_call streams for other blocks
still populate delta.tool_calls.
🧹 Nitpick comments (1)
core/providers/bedrock/bedrock.go (1)

1076-1077: Consider pooling BedrockStreamState for consistency with ResponsesStream.

ResponsesStream (line 1404) uses pooled state management with acquireBedrockResponsesStreamState() / releaseBedrockResponsesStreamState(), while ChatCompletionStream creates state directly. For consistency and potential memory efficiency under high load, consider adding a pool for BedrockStreamState.

That said, BedrockStreamState is a small struct (int + map), so the current approach is acceptable and the pool optimization could be deferred.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/bedrock.go` around lines 1076 - 1077, Replace the
direct allocation streamState := NewBedrockStreamState() with the pooled
workflow used by ResponsesStream: call acquireBedrockResponsesStreamState() to
obtain the state at the start of ChatCompletionStream (or the function using
NewBedrockStreamState) and ensure you call
releaseBedrockResponsesStreamState(state) when the stream completes (use defer
immediately after acquire to guarantee release on all code paths); keep all
existing uses of the state unchanged so only the creation/release calls are
swapped in.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 272-296: GetFinalChatToolCalls currently iterates the
acc.ChatToolCalls map in arbitrary order and never sets ToolCallInfo.Index,
causing tc.Index to be 0 for all entries; fix it by extracting the keys from
acc.ChatToolCalls (map[int]*schemas.ChatAssistantMessageToolCall), sort them
(using sort.Ints), then iterate in key order, create each ToolCallInfo with
Index set to the key and populate ID, Name (from toolCall.Function.Name) and
Arguments from toolCall.Function.Arguments before appending to the result; also
add the "sort" import to tool_calls_streaming.go.

In `@core/internal/llmtests/utils.go`:
- Around line 399-400: The ToolCallInfo.Index field is ambiguous because its
comment says "-1 when not available" but Go ints default to 0 and
ExtractChatToolCalls leaves Index at 0 for non-streaming paths; update
ExtractChatToolCalls (and any other constructors of ToolCallInfo) to explicitly
initialize Index to -1 when there is no known index so consumers can distinguish
"not available" from the first tool call, or alternatively change the
ToolCallInfo comment to state that Index is only populated by streaming
extraction; locate usages around the ToolCallInfo type and the
ExtractChatToolCalls function and set Index = -1 on creation for the
not-available case.

In `@core/providers/bedrock/chat.go`:
- Around line 270-280: The BedrockStreamState can be a zero-value and cause
panics or silent mis-indexing; ensure any use-sites that read or write
nextToolCallIndex and contentBlockToToolCallIdx first validate/initialise the
state (e.g., if s == nil or s.contentBlockToToolCallIdx == nil then set s =
NewBedrockStreamState() or initialise the map) and, when looking up a
start-event mapping in contentBlockToToolCallIdx (used when mapping tool deltas
to indices), do not silently fallback to 0—detect missing entries and either
create a proper mapping from s.nextToolCallIndex or return an error/explicit
sentinel so callers can handle sequencing correctly; update functions that
reference BedrockStreamState.nextToolCallIndex and
BedrockStreamState.contentBlockToToolCallIdx to perform these nil/absence checks
before indexing.

---

Outside diff comments:
In `@core/providers/anthropic/chat.go`:
- Around line 707-714: The current logic uses structuredOutputToolName to gate
special handling and ends up remapping every input_json to content when any
structured tool name is present; instead record the structured-output block's
index when you detect it in AnthropicStreamEventTypeContentBlockStart (use
chunk.Index and chunk.ContentBlock.Name) — e.g. set a structuredOutputBlockIndex
variable when *this* block matches structuredOutputToolName — then change
subsequent checks (including where delta.input_json gets mapped to delta.content
and where tool-call emission is skipped) to compare chunk.Index against
structuredOutputBlockIndex rather than just checking structuredOutputToolName
being non-empty; also ensure you clear or reset structuredOutputBlockIndex
appropriately when a block ends so normal tool_call streams for other blocks
still populate delta.tool_calls.

---

Nitpick comments:
In `@core/providers/bedrock/bedrock.go`:
- Around line 1076-1077: Replace the direct allocation streamState :=
NewBedrockStreamState() with the pooled workflow used by ResponsesStream: call
acquireBedrockResponsesStreamState() to obtain the state at the start of
ChatCompletionStream (or the function using NewBedrockStreamState) and ensure
you call releaseBedrockResponsesStreamState(state) when the stream completes
(use defer immediately after acquire to guarantee release on all code paths);
keep all existing uses of the state unchanged so only the creation/release calls
are swapped in.
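The GetFinalChatToolCalls ordering fix described in the inline comments above can be sketched as follows, with simplified stand-ins for the llmtests types:

```go
package main

import (
	"fmt"
	"sort"
)

// Simplified stand-ins for schemas.ChatAssistantMessageToolCall and the
// accumulator types in llmtests; the real fields differ.
type chatToolCall struct {
	ID        string
	Name      string
	Arguments string
}

type ToolCallInfo struct {
	Index     int
	ID        string
	Name      string
	Arguments string
}

// getFinalChatToolCalls iterates the accumulator map in sorted key order and
// records each key as the tool call's Index, as the comment suggests, instead
// of relying on Go's arbitrary map iteration order.
func getFinalChatToolCalls(acc map[int]*chatToolCall) []ToolCallInfo {
	keys := make([]int, 0, len(acc))
	for k := range acc {
		keys = append(keys, k)
	}
	sort.Ints(keys)

	result := make([]ToolCallInfo, 0, len(keys))
	for _, k := range keys {
		tc := acc[k]
		result = append(result, ToolCallInfo{
			Index:     k,
			ID:        tc.ID,
			Name:      tc.Name,
			Arguments: tc.Arguments,
		})
	}
	return result
}

func main() {
	acc := map[int]*chatToolCall{
		1: {ID: "b", Name: "calculate", Arguments: `{"x":2}`},
		0: {ID: "a", Name: "get_weather", Arguments: `{"city":"Paris"}`},
	}
	for _, tc := range getFinalChatToolCalls(acc) {
		fmt.Println(tc.Index, tc.Name)
	}
	// Deterministic order: 0 get_weather, then 1 calculate.
}
```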

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a07a3b15-27df-4e3c-95db-78fb1b2d9b3f

📥 Commits

Reviewing files that changed from the base of the PR and between fc2da80 and aeaa863.

📒 Files selected for processing (13)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/cohere/chat.go
  • core/providers/gemini/gemini_test.go
  • core/changelog.md
  • transports/changelog.md
  • core/providers/xai/xai_test.go

@TejasGhatte TejasGhatte force-pushed the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch from aeaa863 to e304940 Compare March 9, 2026 04:16
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/anthropic/chat.go (1)

781-800: ⚠️ Potential issue | 🟠 Major

Only treat input_json as content for the structured-output block.

This branch keys solely off structuredOutputToolName != "", so once structured output is enabled every input_json delta is converted into content — including regular tool calls. The caller only suppresses the active structured-output block; this helper still needs per-content-block discrimination here.
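A minimal sketch of the per-block gating described here; the names are illustrative, not the actual anthropic package fields:

```go
package main

import "fmt"

// streamState models the suggested bookkeeping: record which content block is
// the structured-output one, rather than gating on the tool name alone.
type streamState struct {
	structuredOutputToolName   string
	structuredOutputBlockIndex int // -1 means no structured-output block active
}

func (s *streamState) onContentBlockStart(index int, blockName string) {
	if s.structuredOutputToolName != "" && blockName == s.structuredOutputToolName {
		s.structuredOutputBlockIndex = index
	}
}

// isStructuredOutputBlock compares the chunk's index against the recorded
// block, so regular tool calls on other blocks still stream tool_calls deltas.
func (s *streamState) isStructuredOutputBlock(index int) bool {
	return s.structuredOutputBlockIndex >= 0 && index == s.structuredOutputBlockIndex
}

func (s *streamState) onContentBlockStop(index int) {
	if index == s.structuredOutputBlockIndex {
		s.structuredOutputBlockIndex = -1
	}
}

func main() {
	s := &streamState{structuredOutputToolName: "emit_json", structuredOutputBlockIndex: -1}
	s.onContentBlockStart(0, "emit_json")
	s.onContentBlockStart(1, "get_weather")
	fmt.Println(s.isStructuredOutputBlock(0)) // remap input_json to content
	fmt.Println(s.isStructuredOutputBlock(1)) // normal tool_calls delta
	s.onContentBlockStop(0)
	fmt.Println(s.isStructuredOutputBlock(0)) // reset after the block ends
}
```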

♻️ Duplicate comments (2)
core/internal/llmtests/multiple_tool_calls.go (2)

335-344: ⚠️ Potential issue | 🟠 Major

Wire the timeout into the actual stream request.

streamCtx only bounds the local select; the request itself still starts with schemas.NewBifrostContext(ctx, schemas.NoDeadline). On timeout or retry, the producer keeps running in the background and retries can stack live streams.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/multiple_tool_calls.go` around lines 335 - 344, The
stream timeout isn't applied to the actual ResponsesStreamRequest: in the
closure passed to WithResponsesStreamValidationRetry you create streamCtx but
still call schemas.NewBifrostContext(ctx, schemas.NoDeadline), so the producer
keeps running after timeout/retry. Change the call inside that closure to use
the bounded context (e.g., schemas.NewBifrostContext(streamCtx,
schemas.NoDeadline) or otherwise derive the Bifrost context from streamCtx) so
ResponsesStreamRequest is started with streamCtx; keep the existing cancel() and
retry logic intact.

296-296: ⚠️ Potential issue | 🟠 Major

These new streaming scenarios still hard-require tool-call IDs.

Both calls go through validateStreamingToolCalls, which still fails on empty ID. That reintroduces provider failures for valid streamed tool calls that only populate name/arguments/index.

Also applies to: 495-495

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/multiple_tool_calls.go` at line 296,
validateStreamingToolCalls currently enforces a non-empty tool-call ID for
streamed tool calls, which breaks providers that emit only name/arguments/index;
update the validation in validateStreamingToolCalls so it does not hard-require
ID (allow empty ID) and instead verifies presence of name, arguments, and index
on entries in finalToolCalls (and any similar checks at the other call site) —
adjust any error messages and conditions to accept empty IDs but still validate
the required fields (Name/Arguments/Index) to prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 495-499: The call to validateStreamingToolCalls currently aborts
the test via t.Fatalf/require.NotEmpty, preventing retry; change
validateStreamingToolCalls (and any helper it uses) to be non-fatal by returning
an error or a validation result instead of calling t.Fatalf/require.*; update
its internal assertions (e.g., require.NotEmpty) to return descriptive errors
(or false) so callers (like the code in multiple_tool_calls.go that invokes
validateStreamingToolCalls and is used by WithResponsesStreamValidationRetry)
can inspect the result and return ResponsesStreamValidationResult{Passed:false,
ReceivedData:...} for retries rather than letting the test abort. Ensure the
caller checks the returned error/validation result and only marks Passed true
when validation succeeds.

---

Duplicate comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 335-344: The stream timeout isn't applied to the actual
ResponsesStreamRequest: in the closure passed to
WithResponsesStreamValidationRetry you create streamCtx but still call
schemas.NewBifrostContext(ctx, schemas.NoDeadline), so the producer keeps
running after timeout/retry. Change the call inside that closure to use the
bounded context (e.g., schemas.NewBifrostContext(streamCtx, schemas.NoDeadline)
or otherwise derive the Bifrost context from streamCtx) so
ResponsesStreamRequest is started with streamCtx; keep the existing cancel() and
retry logic intact.
- Line 296: validateStreamingToolCalls currently enforces a non-empty tool-call
ID for streamed tool calls, which breaks providers that emit only
name/arguments/index; update the validation in validateStreamingToolCalls so it
does not hard-require ID (allow empty ID) and instead verifies presence of name,
arguments, and index on entries in finalToolCalls (and any similar checks at the
other call site) — adjust any error messages and conditions to accept empty IDs
but still validate the required fields (Name/Arguments/Index) to prevent
regressions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1b325b70-75bb-4a37-ba99-27b723a1dd75

📥 Commits

Reviewing files that changed from the base of the PR and between aeaa863 and e304940.

📒 Files selected for processing (14)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/tool_calls_streaming.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (1)
  • transports/changelog.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/internal/llmtests/utils.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/bedrock.go
  • core/changelog.md

@TejasGhatte TejasGhatte force-pushed the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch from e304940 to 3d25688 Compare March 9, 2026 06:12
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
core/internal/llmtests/tool_calls_streaming.go (1)

755-768: ⚠️ Potential issue | 🟠 Major

Don't make streamed tool-call IDs mandatory here.

Some providers emit valid streamed tool calls without an id. This shared validator now turns those cases into hard failures, which makes the new cross-provider streaming tests fail for the wrong reason and undoes the ID-tolerance added elsewhere in this stack.

Suggested relaxation
 func validateStreamingToolCalls(toolCalls []ToolCallInfo, apiName string) error {
 	if len(toolCalls) == 0 {
 		return fmt.Errorf("%s: no tool calls found in streaming response", apiName)
 	}

 	for i, toolCall := range toolCalls {
-		if toolCall.ID == "" {
-			return fmt.Errorf("%s: tool call %d missing ID", apiName, i)
-		}
 		if toolCall.Name == "" {
 			return fmt.Errorf("%s: tool call %d missing name", apiName, i)
 		}
 		if toolCall.Arguments == "" {
 			return fmt.Errorf("%s: tool call %d missing arguments", apiName, i)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/tool_calls_streaming.go` around lines 755 - 768, The
validator validateStreamingToolCalls currently treats an empty toolCall.ID as an
error; remove or relax that check so streamed tool calls without an ID are
accepted (i.e., do not return an error when toolCall.ID == ""); keep the checks
for Name and Arguments intact and still return errors for those missing values;
locate this behavior in validateStreamingToolCalls (iterating over toolCalls
[]ToolCallInfo with apiName) and simply skip or ignore the ID emptiness instead
of returning fmt.Errorf.
core/internal/llmtests/multiple_tool_calls.go (1)

337-346: ⚠️ Potential issue | 🟠 Major

Cancel the actual Responses stream, not just the local select.

streamCtx only bounds the validator. The request itself still runs under ctx with schemas.NoDeadline, so a timeout or any other early return leaves the producer running in the background, and retries can stack multiple live streams.

One way to wire the timeout through the whole attempt
-		validationResult := WithResponsesStreamValidationRetry(t, retryConfig, retryContext,
+		var attemptCtx context.Context
+		var attemptCancel context.CancelFunc
+		validationResult := WithResponsesStreamValidationRetry(t, retryConfig, retryContext,
 			func() (chan *schemas.BifrostStreamChunk, *schemas.BifrostError) {
-				bfCtx := schemas.NewBifrostContext(ctx, schemas.NoDeadline)
+				if attemptCancel != nil {
+					attemptCancel()
+				}
+				attemptCtx, attemptCancel = context.WithTimeout(ctx, 200*time.Second)
+				bfCtx := schemas.NewBifrostContext(attemptCtx, schemas.NoDeadline)
 				return client.ResponsesStreamRequest(bfCtx, request)
 			},
 			func(responseChannel chan *schemas.BifrostStreamChunk) ResponsesStreamValidationResult {
+				defer func() {
+					if attemptCancel != nil {
+						attemptCancel()
+					}
+				}()
 				accumulator := NewStreamingToolCallAccumulator()
 				var responseCount int
-				streamCtx, cancel := context.WithTimeout(ctx, 200*time.Second)
-				defer cancel()

 				for {
 					select {
@@
-					case <-streamCtx.Done():
+					case <-attemptCtx.Done():

Also applies to: 438-443

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/multiple_tool_calls.go` around lines 337 - 346, The
validator currently creates streamCtx and cancels only the local validation but
the actual ResponsesStreamRequest is still created with schemas.NoDeadline and
the outer ctx, leaving the producer running; change the call inside
WithResponsesStreamValidationRetry to create the Bifrost context from streamCtx
(e.g., use schemas.NewBifrostContext(streamCtx, schemas.NoDeadline) or otherwise
pass streamCtx into ResponsesStreamRequest) so the request is bound to the same
timeout/cancelation and ensure cancel() is deferred as already present; apply
the same fix to the other occurrence referenced (the block around lines 438-443)
so retries don't spawn background producers.
core/providers/bedrock/chat.go (1)

313-317: ⚠️ Potential issue | 🟠 Major

Don't silently collapse unmapped tool chunks onto index 0.

If Line 314 doesn't record a mapping, Line 373 reads the map's zero value and emits index 0 again. That silently mis-sequences parallel tool calls and undermines the main Bedrock fix in this PR.

Suggested hardening
-		toolCallIdx := 0
-		if chunk.ContentBlockIndex != nil {
-			toolCallIdx = state.nextToolCallIndex
-			state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex] = toolCallIdx
-			state.nextToolCallIndex++
-		}
+		if chunk.ContentBlockIndex == nil {
+			return nil, nil, false
+		}
+		toolCallIdx := state.nextToolCallIndex
+		state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex] = toolCallIdx
+		state.nextToolCallIndex++
@@
-			toolCallIdx := 0
-			if chunk.ContentBlockIndex != nil {
-				toolCallIdx = state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]
-			}
+			if chunk.ContentBlockIndex == nil {
+				return nil, nil, false
+			}
+			toolCallIdx, ok := state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]
+			if !ok {
+				return nil, nil, false
+			}

Also applies to: 371-374

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/chat.go` around lines 313 - 317, The code silently
assigns toolCallIdx = 0 when chunk.ContentBlockIndex is nil, causing unmapped
chunks to appear as index 0 and mis-sequence parallel tool calls; update the
logic in the chat processing (around the variables toolCallIdx,
chunk.ContentBlockIndex, state.nextToolCallIndex, and
state.contentBlockToToolCallIdx) so you only set and read mappings when
chunk.ContentBlockIndex != nil (i.e., create an entry and increment
state.nextToolCallIndex only for non-nil ContentBlockIndex) and when later
retrieving an index from state.contentBlockToToolCallIdx check the map presence
(ok pattern) instead of using the zero value—use an explicit sentinel or handle
the absence path instead of defaulting to 0.
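A runnable sketch of the comma-ok hardening this comment asks for, using simplified stand-ins for the Bedrock stream state:

```go
package main

import "fmt"

type bedrockStreamState struct {
	nextToolCallIndex         int
	contentBlockToToolCallIdx map[int]int
}

// toolCallIndexFor resolves a delta's tool-call index from its content-block
// index, refusing to fall back to 0 when no start event was recorded.
// The ok=false return lets the caller skip the chunk explicitly.
func (s *bedrockStreamState) toolCallIndexFor(contentBlockIndex *int) (int, bool) {
	if s == nil || s.contentBlockToToolCallIdx == nil || contentBlockIndex == nil {
		return -1, false
	}
	idx, ok := s.contentBlockToToolCallIdx[*contentBlockIndex]
	if !ok {
		return -1, false
	}
	return idx, true
}

func main() {
	s := &bedrockStreamState{contentBlockToToolCallIdx: map[int]int{2: 0}}
	blk := 2
	if idx, ok := s.toolCallIndexFor(&blk); ok {
		fmt.Println("mapped index:", idx)
	}
	missing := 5
	if _, ok := s.toolCallIndexFor(&missing); !ok {
		fmt.Println("no mapping recorded; skipping delta")
	}
}
```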
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/providers/gemini/chat.go`:
- Line 289: The ToBifrostChatCompletionStream function currently dereferences
state without checking for nil, causing a panic when Gemini streams a
function-call chunk; update
GenerateContentResponse.ToBifrostChatCompletionStream to guard against a nil
*GeminiStreamState at the start (either instantiate a default GeminiStreamState
struct when state == nil or return a clear BifrostError indicating a missing
stream state) and then proceed, ensuring any later uses (e.g., where state
fields are read around the function-call handling) are safe; reference the
ToBifrostChatCompletionStream function and the GeminiStreamState type when
making the change.
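A minimal nil-guard along the lines this prompt suggests; the GeminiStreamState fields shown here are hypothetical:

```go
package main

import "fmt"

// Hypothetical, simplified stand-in for the provider's stream state; the real
// GeminiStreamState in core/providers/gemini has different fields.
type GeminiStreamState struct {
	nextToolCallIndex int
}

// ensureStreamState guards against a nil state so a function-call chunk cannot
// dereference nil; a caller could also return a BifrostError here instead.
func ensureStreamState(state *GeminiStreamState) *GeminiStreamState {
	if state == nil {
		return &GeminiStreamState{}
	}
	return state
}

func main() {
	var state *GeminiStreamState // e.g. a caller forgot to initialize it
	state = ensureStreamState(state)
	state.nextToolCallIndex++ // safe: no nil dereference
	fmt.Println(state.nextToolCallIndex)
}
```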

---

Duplicate comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 337-346: The validator currently creates streamCtx and cancels
only the local validation but the actual ResponsesStreamRequest is still created
with schemas.NoDeadline and the outer ctx, leaving the producer running; change
the call inside WithResponsesStreamValidationRetry to create the Bifrost context
from streamCtx (e.g., use schemas.NewBifrostContext(streamCtx,
schemas.NoDeadline) or otherwise pass streamCtx into ResponsesStreamRequest) so
the request is bound to the same timeout/cancelation and ensure cancel() is
deferred as already present; apply the same fix to the other occurrence
referenced (the block around lines 438-443) so retries don't spawn background
producers.

In `@core/internal/llmtests/tool_calls_streaming.go`:
- Around line 755-768: The validator validateStreamingToolCalls currently treats
an empty toolCall.ID as an error; remove or relax that check so streamed tool
calls without an ID are accepted (i.e., do not return an error when toolCall.ID
== ""); keep the checks for Name and Arguments intact and still return errors
for those missing values; locate this behavior in validateStreamingToolCalls
(iterating over toolCalls []ToolCallInfo with apiName) and simply skip or ignore
the ID emptiness instead of returning fmt.Errorf.

In `@core/providers/bedrock/chat.go`:
- Around line 313-317: The code silently assigns toolCallIdx = 0 when
chunk.ContentBlockIndex is nil, causing unmapped chunks to appear as index 0 and
mis-sequence parallel tool calls; update the logic in the chat processing
(around the variables toolCallIdx, chunk.ContentBlockIndex,
state.nextToolCallIndex, and state.contentBlockToToolCallIdx) so you only set
and read mappings when chunk.ContentBlockIndex != nil (i.e., create an entry and
increment state.nextToolCallIndex only for non-nil ContentBlockIndex) and when
later retrieving an index from state.contentBlockToToolCallIdx check the map
presence (ok pattern) instead of using the zero value—use an explicit sentinel
or handle the absence path instead of defaulting to 0.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3f34e5c5-7526-4c40-b209-f04436702503

📥 Commits

Reviewing files that changed from the base of the PR and between e304940 and 3d25688.

📒 Files selected for processing (14)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/tool_calls_streaming.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (1)
  • core/changelog.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/cohere/chat.go
  • transports/changelog.md
  • core/providers/gemini/gemini.go
  • core/internal/llmtests/utils.go

@TejasGhatte TejasGhatte force-pushed the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch from 3d25688 to 42806b1 Compare March 9, 2026 06:46
@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (3)
core/internal/llmtests/utils.go (1)

395-399: ⚠️ Potential issue | 🟠 Major

Populate ToolCallInfo.Index during extraction.

Line 399 says -1 means “not available”, but these constructors still leave the zero value, and ExtractChatToolCalls also drops the ChatAssistantMessageToolCall.Index that providers are now populating. That makes the helper unreliable for the new sequencing assertions once a stream is accumulated into a normal response.

Suggested fix
-			for _, toolCall := range choice.Message.ChatAssistantMessage.ToolCalls {
-				info := ToolCallInfo{}
+			for _, toolCall := range choice.Message.ChatAssistantMessage.ToolCalls {
+				info := ToolCallInfo{Index: int(toolCall.Index)}
 				if toolCall.ID != nil {
 					info.ID = *toolCall.ID
 				}
@@
-			info := ToolCallInfo{}
+			info := ToolCallInfo{Index: -1}
 			if output.ResponsesToolMessage.Name != nil {
 				info.Name = *output.ResponsesToolMessage.Name
 			}

Also applies to: 507-515, 533-543

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/utils.go` around lines 395 - 399, ToolCallInfo.Index
is left at the zero value instead of the intended -1 and ExtractChatToolCalls
drops the provider-populated ChatAssistantMessageToolCall.Index, breaking
sequencing assertions; update the constructors that create ToolCallInfo
instances to initialize Index = -1 when no index is available, and modify
ExtractChatToolCalls to copy the source ChatAssistantMessageToolCall.Index into
ToolCallInfo.Index (and similarly fix the other extraction helper blocks around
the alternative constructors/collectors referenced) so the index is preserved
when present and -1 otherwise.
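The -1 sentinel convention can be pinned with a tiny constructor; the types below are simplified stand-ins for the llmtests helpers:

```go
package main

import "fmt"

// ToolCallInfo with Index explicitly initialized to -1 when no index is known,
// so 0 unambiguously means "first tool call".
type ToolCallInfo struct {
	Index     int // sequential streaming index; -1 when not available
	ID        string
	Name      string
	Arguments string
}

// newToolCallInfo is a hypothetical constructor enforcing the sentinel,
// since Go's zero value for int (0) collides with the first valid index.
func newToolCallInfo() ToolCallInfo {
	return ToolCallInfo{Index: -1}
}

func main() {
	nonStreaming := newToolCallInfo() // non-streaming path: index unknown
	streaming := newToolCallInfo()
	streaming.Index = 0 // streaming path: first parallel tool call

	fmt.Println(nonStreaming.Index) // -1, distinguishable from the first call
	fmt.Println(streaming.Index)    // 0
}
```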
core/providers/bedrock/chat.go (1)

313-317: ⚠️ Potential issue | 🟠 Major

Don't silently map missing Bedrock tool events to index 0.

Line 313 and Line 371 still fall back to 0 when ContentBlockIndex is missing or the start-event mapping was never recorded. That can attach a later delta to the first tool call and regress the sequencing fix whenever Bedrock emits an unexpected block sequence.

Suggested hardening
-		toolCallIdx := 0
-		if chunk.ContentBlockIndex != nil {
-			toolCallIdx = state.nextToolCallIndex
-			state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex] = toolCallIdx
-			state.nextToolCallIndex++
-		}
+		if chunk.ContentBlockIndex == nil {
+			return nil, nil, false
+		}
+		toolCallIdx := state.nextToolCallIndex
+		state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex] = toolCallIdx
+		state.nextToolCallIndex++
@@
-			toolCallIdx := 0
-			if chunk.ContentBlockIndex != nil {
-				toolCallIdx = state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]
-			}
+			if chunk.ContentBlockIndex == nil {
+				return nil, nil, false
+			}
+			toolCallIdx, ok := state.contentBlockToToolCallIdx[*chunk.ContentBlockIndex]
+			if !ok {
+				return nil, nil, false
+			}

Also applies to: 371-374

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/chat.go` around lines 313 - 317, The code currently
defaults toolCallIdx to 0 when chunk.ContentBlockIndex is nil or when a lookup
in state.contentBlockToToolCallIdx is missing, which can mis-associate deltas
with the first tool call; update the logic in core/providers/bedrock/chat.go
(the block managing toolCallIdx, state.contentBlockToToolCallIdx, and
state.nextToolCallIndex) to avoid silently using 0 — instead detect the missing
mapping and handle it explicitly (e.g., skip attaching the delta, log/warn and
continue, or return an error) so that you only set toolCallIdx when
chunk.ContentBlockIndex is present and found in state.contentBlockToToolCallIdx,
and only increment state.nextToolCallIndex when you created a new mapping.
core/internal/llmtests/multiple_tool_calls.go (1)

337-346: ⚠️ Potential issue | 🟠 Major

Still need to wire the timeout context into ResponsesStreamRequest.

streamCtx only bounds the local select. The request itself still starts with schemas.NewBifrostContext(ctx, schemas.NoDeadline), so a timeout or retry can leave the upstream stream running in the background and stack producers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/internal/llmtests/multiple_tool_calls.go` around lines 337 - 346, The
upstream ResponsesStreamRequest is being created with
schemas.NewBifrostContext(ctx, schemas.NoDeadline) so the stream is not bounded
by the local timeout (streamCtx) and can leak; change the code that calls
ResponsesStreamRequest inside WithResponsesStreamValidationRetry to create the
Bifrost context from the timeout-bound context (e.g., use streamCtx or
schemas.NewBifrostContext(streamCtx, ...) instead of ctx) so the request is
canceled when streamCtx times out, and keep the existing cancel() defer to
ensure cleanup in NewStreamingToolCallAccumulator/ResponsesStreamRequest flows.
🧹 Nitpick comments (3)
core/providers/bedrock/chat.go (1)

270-288: Keep the streaming converter side-effect free.

Introducing BedrockStreamState into ToBifrostChatCompletionStream makes this converter in core/providers/bedrock/chat.go depend on mutable stream state. Please keep the index bookkeeping in the stream loop and let this method stay a pure event-to-chunk mapper.

As per coding guidelines, "core/providers/*/{chat,embedding,images,speech,responses}.go: Converter functions (ToRequest and ToBifrostResponse) must be pure functions with no HTTP calls, no logging, and no side effects."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/chat.go` around lines 270 - 288, The converter
ToBifrostChatCompletionStream currently mutates and depends on
BedrockStreamState (fields nextToolCallIndex and contentBlockToToolCallIdx),
introducing side-effects; refactor so ToBifrostChatCompletionStream is pure:
remove any reads/writes to BedrockStreamState (including initialization of
contentBlockToToolCallIdx and updates to
nextToolCallIndex/contentBlockToToolCallIdx) and return only the mapped
*schemas.BifrostChatResponse and error indicator; move all index bookkeeping and
map management into the streaming loop that calls ToBifrostChatCompletionStream
(i.e., perform content-block-to-tool-call index calculations and state updates
outside the function, and pass any needed index values into the converter as
immutable parameters if necessary).
core/providers/gemini/gemini_test.go (1)

168-177: Add a provider-level multi-chunk index regression test.

These updates only thread the new GeminiStreamState parameter through existing tests. They still do not assert the actual bug fix here: reusing one shared state across multiple streamed tool-call chunks should yield indices 0, 1, .... A small unit test in this file would pin that behavior without relying on live llmtests.

Also applies to: 272-280

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/gemini/gemini_test.go` around lines 168 - 177, Add a focused
unit test that verifies multi-chunk indexing when reusing a single
GeminiStreamState across streamed tool-call chunks: create a GeminiStreamState
instance, call ToBifrostChatCompletionStream repeatedly with responses that
represent sequential tool-call chunks, collect the produced bifrost chunk
indices and assert they equal 0,1,...; update gemini_test.go to include this new
test (use GeminiStreamState, ToBifrostChatCompletionStream, and the response
fixtures) so the regression is pinned without relying on live llmtests.
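The behavior such a regression test would pin can be shown as a self-contained simulation rather than the real converter; the state and function below are simplified stand-ins for the gemini package's types:

```go
package main

import "fmt"

// Simplified stand-in: the real GeminiStreamState and chunk conversion live in
// core/providers/gemini and operate on full response chunks.
type GeminiStreamState struct {
	nextToolCallIndex int
}

// nextIndex models what the converter should do per function-call chunk when
// one state is shared across the whole stream: hand out 0, 1, 2, ...
func nextIndex(state *GeminiStreamState) int {
	idx := state.nextToolCallIndex
	state.nextToolCallIndex++
	return idx
}

func main() {
	state := &GeminiStreamState{} // one state reused across all chunks
	var got []int
	for range [3]struct{}{} {
		got = append(got, nextIndex(state))
	}
	fmt.Println(got) // [0 1 2]
}
```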
core/providers/anthropic/chat.go (1)

801-816: Consider adding a defensive check for missing content-block mapping.

Line 803 performs a map lookup without checking if the key exists. If a content_block_delta arrives without a prior content_block_start for that content block index (malformed stream), the lookup silently returns 0, potentially associating the delta with the wrong tool call.

While Anthropic's streaming protocol guarantees content_block_start precedes content_block_delta, a defensive check would improve robustness against malformed streams.

🛡️ Optional defensive check
 				// Resolve which tool-call this delta belongs to via the content-block index.
-				toolCallIdx := state.contentBlockToToolCallIdx[*chunk.Index]
+				toolCallIdx, ok := state.contentBlockToToolCallIdx[*chunk.Index]
+				if !ok {
+					// No mapping found - skip this delta to avoid corrupting another tool call
+					return nil, nil, false
+				}
 
 				// Create streaming response for tool input delta
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/anthropic/chat.go` around lines 801 - 816, The map lookup
state.contentBlockToToolCallIdx[*chunk.Index] may return 0 for missing keys and
misassociate deltas; update the handler that builds streamResponse to first
check for the key (e.g., via a value,ok pattern) using chunk.Index, and if
missing handle defensively (log/warn and skip emitting this content_block_delta
or emit a safe fallback choice) instead of proceeding with a toolCallIdx of 0 —
change the logic around toolCallIdx usage in this delta-creation block so it
only uses a validated toolCallIdx when the map contains the index.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@core/internal/llmtests/multiple_tool_calls.go`:
- Around line 337-346: The upstream ResponsesStreamRequest is being created with
schemas.NewBifrostContext(ctx, schemas.NoDeadline) so the stream is not bounded
by the local timeout (streamCtx) and can leak; change the code that calls
ResponsesStreamRequest inside WithResponsesStreamValidationRetry to create the
Bifrost context from the timeout-bound context (e.g., use streamCtx or
schemas.NewBifrostContext(streamCtx, ...) instead of ctx) so the request is
canceled when streamCtx times out, and keep the existing cancel() defer to
ensure cleanup in NewStreamingToolCallAccumulator/ResponsesStreamRequest flows.

In `@core/internal/llmtests/utils.go`:
- Around lines 395-399: ToolCallInfo.Index is left at the zero value
instead of the intended -1, and ExtractChatToolCalls drops the
provider-populated ChatAssistantMessageToolCall.Index, breaking sequencing
assertions. Update the constructors that create ToolCallInfo instances to
initialize Index to -1 when no index is available, and modify
ExtractChatToolCalls to copy the source
ChatAssistantMessageToolCall.Index into ToolCallInfo.Index (apply the same
fix to the other extraction helpers around the alternative
constructors/collectors) so the index is preserved when present and -1
otherwise.
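A sketch of the intended extraction behavior. The structs here are simplified stand-ins: the field names follow the review comment, but the real types live in core/internal/llmtests and the provider schemas:

```go
package main

import "fmt"

// ChatAssistantMessageToolCall mimics the provider-side struct, where a
// nil Index means the provider reported no index for this tool call.
type ChatAssistantMessageToolCall struct {
	Index *int
	Name  string
}

// ToolCallInfo is the test-utility view; -1 means "no index reported",
// so sequencing assertions can tell "missing" apart from a genuine 0.
type ToolCallInfo struct {
	Index int
	Name  string
}

// extractToolCalls preserves a provider-populated index and falls back
// to the -1 sentinel otherwise.
func extractToolCalls(calls []ChatAssistantMessageToolCall) []ToolCallInfo {
	out := make([]ToolCallInfo, 0, len(calls))
	for _, c := range calls {
		info := ToolCallInfo{Index: -1, Name: c.Name}
		if c.Index != nil {
			info.Index = *c.Index
		}
		out = append(out, info)
	}
	return out
}

func main() {
	one := 1
	infos := extractToolCalls([]ChatAssistantMessageToolCall{
		{Index: nil, Name: "get_weather"},
		{Index: &one, Name: "calculate"},
	})
	for _, i := range infos {
		fmt.Printf("%s -> %d\n", i.Name, i.Index)
	}
}
```

Without the -1 sentinel, a dropped index and a legitimate first tool call (index 0) are indistinguishable, which is why the assertions were passing vacuously.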

In `@core/providers/bedrock/chat.go`:
- Around lines 313-317: the code defaults toolCallIdx to 0 when
chunk.ContentBlockIndex is nil or when the lookup in
state.contentBlockToToolCallIdx misses, which can mis-associate deltas with
the first tool call. Update the block in core/providers/bedrock/chat.go
that manages toolCallIdx, state.contentBlockToToolCallIdx, and
state.nextToolCallIndex so it never silently uses 0: detect the missing
mapping and handle it explicitly (skip attaching the delta, log/warn and
continue, or return an error), set toolCallIdx only when
chunk.ContentBlockIndex is present and found in
state.contentBlockToToolCallIdx, and increment state.nextToolCallIndex
only when a new mapping is created.
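A sketch of the suggested bookkeeping split between block-start and delta events. The state fields match the names in the review, but the event shape is a hypothetical simplification of the Bedrock stream:

```go
package main

import "fmt"

// BedrockStreamState carries the per-stream index bookkeeping; field
// names follow the review comment, the rest is simplified.
type BedrockStreamState struct {
	nextToolCallIndex         int
	contentBlockToToolCallIdx map[int]int
}

// onToolUseStart registers a new content block and returns its sequential
// tool-call index; nextToolCallIndex is incremented only here, when a new
// mapping is actually created.
func (s *BedrockStreamState) onToolUseStart(blockIdx int) int {
	if s.contentBlockToToolCallIdx == nil {
		s.contentBlockToToolCallIdx = make(map[int]int)
	}
	idx := s.nextToolCallIndex
	s.contentBlockToToolCallIdx[blockIdx] = idx
	s.nextToolCallIndex++
	return idx
}

// onDelta only looks up an existing mapping; ok=false tells the caller to
// skip or log rather than mis-attach the delta to tool call 0.
func (s *BedrockStreamState) onDelta(blockIdx *int) (int, bool) {
	if blockIdx == nil {
		return 0, false
	}
	idx, ok := s.contentBlockToToolCallIdx[*blockIdx]
	return idx, ok
}

func main() {
	var state BedrockStreamState
	fmt.Println(state.onToolUseStart(1)) // first tool call
	fmt.Println(state.onToolUseStart(2)) // second tool call
	two := 2
	if idx, ok := state.onDelta(&two); ok {
		fmt.Println("delta for tool call", idx)
	}
}
```

Separating creation (block start) from lookup (delta) is what guarantees the 0, 1, 2, ... sequence the tests assert, and removes the silent 0 fallback.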

---

Nitpick comments:
In `@core/providers/anthropic/chat.go`:
- Around lines 801-816: the map lookup
`state.contentBlockToToolCallIdx[*chunk.Index]` returns 0 for missing keys,
which can misassociate deltas with the first tool call. Update the handler
that builds streamResponse to check for the key first (e.g., with a
`value, ok` pattern on chunk.Index); if the key is missing, handle it
defensively (log/warn and skip emitting this content_block_delta, or emit a
safe fallback choice) instead of proceeding with a toolCallIdx of 0. The
delta-creation block should only use a toolCallIdx that was validated
against the map.

In `@core/providers/bedrock/chat.go`:
- Around lines 270-288: the converter ToBifrostChatCompletionStream
currently mutates and depends on BedrockStreamState (the nextToolCallIndex
and contentBlockToToolCallIdx fields), introducing side effects. Refactor
so ToBifrostChatCompletionStream is pure: remove all reads and writes of
BedrockStreamState (including initialization of contentBlockToToolCallIdx
and updates to nextToolCallIndex) and return only the mapped
`*schemas.BifrostChatResponse` and error indicator. Move all index
bookkeeping and map management into the streaming loop that calls
ToBifrostChatCompletionStream: perform the content-block-to-tool-call
index calculations and state updates outside the function, passing any
needed index values into the converter as immutable parameters if
necessary.

In `@core/providers/gemini/gemini_test.go`:
- Around lines 168-177: add a focused unit test that verifies multi-chunk
indexing when a single GeminiStreamState is reused across streamed
tool-call chunks: create a GeminiStreamState instance, call
ToBifrostChatCompletionStream repeatedly with responses representing
sequential tool-call chunks, collect the produced Bifrost chunk indices,
and assert they equal 0, 1, .... Add this test to gemini_test.go (using
GeminiStreamState, ToBifrostChatCompletionStream, and the response
fixtures) so the regression is pinned without relying on live llmtests.
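The shape of such a regression test, sketched with a stand-in state and converter rather than the real GeminiStreamState and ToBifrostChatCompletionStream: reuse one state instance across chunks and assert the emitted tool-call indices come out 0, 1, 2, ...

```go
package main

import "fmt"

// geminiState is a hypothetical stand-in for GeminiStreamState: the only
// behavior pinned here is that reusing one instance yields a sequential
// index per tool-call chunk.
type geminiState struct{ next int }

// convertToolChunk stands in for the converter's index assignment.
func convertToolChunk(s *geminiState) int {
	idx := s.next
	s.next++
	return idx
}

func main() {
	state := &geminiState{} // one state reused across the whole stream
	var got []int
	for i := 0; i < 3; i++ {
		got = append(got, convertToolChunk(state))
	}
	for want, idx := range got {
		if idx != want {
			panic(fmt.Sprintf("got index %d, want %d", idx, want))
		}
	}
	fmt.Println("indices:", got)
}
```

The real test would drive ToBifrostChatCompletionStream with the response fixtures instead of this stand-in, but the assertion on the collected index sequence is the same.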

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 47d9c4ef-daeb-42a1-bf31-b70fd1198c81

📥 Commits

Reviewing files that changed from the base of the PR and between 3d25688 and 42806b1.

📒 Files selected for processing (14)
  • core/changelog.md
  • core/internal/llmtests/multiple_tool_calls.go
  • core/internal/llmtests/tool_calls_streaming.go
  • core/internal/llmtests/utils.go
  • core/providers/anthropic/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/bedrock/bedrock.go
  • core/providers/bedrock/chat.go
  • core/providers/cohere/chat.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/gemini_test.go
  • core/providers/xai/xai_test.go
  • transports/changelog.md
✅ Files skipped from review due to trivial changes (1)
  • core/changelog.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • transports/changelog.md
  • core/providers/bedrock/bedrock.go

Collaborator Author

TejasGhatte commented Mar 9, 2026

Merge activity

  • Mar 9, 7:42 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 12, 5:12 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 12, 5:12 AM UTC: @akshaydeo merged this pull request with Graphite.

@TejasGhatte force-pushed the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch from 42806b1 to 9d2b76c on March 12, 2026 05:10
@akshaydeo akshaydeo merged commit a0cd755 into main Mar 12, 2026
7 of 9 checks passed
@akshaydeo akshaydeo deleted the 03-07-fix_tool_call_indexes_in_chat_completions_stream branch March 12, 2026 05:12


Development

Successfully merging this pull request may close these issues.

[Bug]: Anthropic invalid tools calling index

4 participants