Skip to content

Fix/singleflight body duplication#704

Open
cdaguerre wants to merge 2 commits intodarkweak:masterfrom
cdaguerre:fix/singleflight-body-duplication
Open

Fix/singleflight body duplication#704
cdaguerre wants to merge 2 commits intodarkweak:masterfrom
cdaguerre:fix/singleflight-body-duplication

Conversation

@cdaguerre
Copy link
Contributor

Problem

When a client cancels a request that is being coalesced via singleflight, subsequent responses can contain duplicated or corrupted bodies — two HTTP response bodies concatenated together.

Root Cause

Three related bugs in pkg/middleware/middleware.go:

1. Missing Buf.Reset() before writing singleflight result

After singleflight.Do() returns, the post-Do code writes sfWriter.body back into customWriter.Buf without resetting it first. Since next() already wrote the body during the Do callback, the body gets appended a second time.

This was partially fixed in the Traefik override in f1195d5 but never propagated to core.

2. bytes.Buffer.Bytes() slice alias corruption

The singleflight value captures customWriter.Buf.Bytes(), which returns a slice pointing to the buffer's internal array — not a copy. When the buffer is reset (via defer) or returned to sync.Pool and reused, the shared slice contents are corrupted for coalesced requests still reading it.

3. errorCacheCh data race on cancellation

ServeHTTP creates an unbuffered channel and defers close(). When the client cancels, the deferred close fires while the orphaned goroutine running Upstream later tries to send on the closed channel. Additionally, defer s.bufPool.Put(bufPool) returns the buffer to the pool while the goroutine may still be writing to it.

Fix

Bug Fix
Double-write Add customWriter.Buf.Reset() before writing sfWriter.body in both Upstream and Revalidate
Slice alias Copy buffer contents into an independent []byte before returning from the singleflight callback
Channel race Use a buffered channel (capacity 1), remove defer close() and the recover wrapper
Buffer pool race Track buffer ownership — only return to pool when the goroutine has finished (via the errorCacheCh receive), intentionally leak on cancellation so the goroutine can finish safely

Tests

5 tests added in the prior commit, covering all three bugs:

Test Verifies
TestSingleflightWinnerWritesBodyTwice Body not duplicated for the winner request
TestBufferPoolReuseAfterCancellation No race on errorCacheCh after cancellation
TestCoalescedRequestGetsCleanBody No concurrent buffer write race between coalesced requests
TestCancelledRequestDoesNotCorruptCoalescedResponse No race when one of two concurrent requests is cancelled
TestBufferBytesSliceCorruption Demonstrates Buffer.Bytes() slice alias corruption deterministically

All pass with go test -race -count=3.

Related

  • f1195d5 — partial fix applied only to `plugins/traefik/override/middleware/middleware.go

…pool races

Add tests to pkg/middleware that reproduce three related bugs in the
request coalescing (singleflight) and buffer pool lifecycle:

1. TestSingleflightWinnerWritesBodyTwice: The singleflight winner gets
   its body written into the CustomWriter buffer by next() inside the
   callback, then the post-Do code unconditionally writes sfWriter.body
   back into the same buffer. The buffer reset via defer and slice alias
   behavior currently masks this for solo requests, but the pattern is
   unsafe.

2. TestBufferPoolReuseAfterCancellation: When a client cancels a request,
   ServeHTTP returns and defer s.bufPool.Put(bufPool) returns the buffer
   to the pool while the orphaned upstream goroutine is still writing to
   it. A subsequent request can pick up the same buffer. Run with -race
   to detect the data race on errorCacheCh (defer close races with the
   goroutine sending on it).

3. TestCoalescedRequestGetsCleanBody: Two concurrent requests coalesced
   via singleflight both call Upstream which writes sfWriter.body into
   customWriter.Buf. Run with -race to detect the concurrent
   bytes.Buffer.Write race.

4. TestCancelledRequestDoesNotCorruptCoalescedResponse: Request A and B
   are coalesced, A is cancelled mid-flight. Run with -race to detect the
   errorCacheCh race between the cancelled request's deferred close and
   the orphaned goroutine.

5. TestBufferBytesSliceCorruption: Unit test demonstrating that
   bytes.Buffer.Bytes() returns a slice alias (not a copy) that gets
   corrupted when the buffer is returned to sync.Pool and reused. This is
   the mechanism underlying the singleflight value body corruption.

These tests pass without -race (the races are timing-dependent) but
reliably fail with -race, confirming the bugs exist in production code.
Three related bugs caused responses to contain duplicated or corrupted
bodies when requests were coalesced via singleflight:

1. Missing Buf.Reset() before writing singleflight result: after
   singleflight.Do() returns, the post-Do code wrote sfWriter.body into
   customWriter.Buf without resetting it first, causing the body to be
   appended a second time. Fixed in both Upstream and Revalidate methods.

2. bytes.Buffer.Bytes() slice alias in singleflight value: the returned
   singleflightValue captured a slice pointing to the buffer's internal
   array (not a copy). When the buffer was reset or returned to the pool,
   shared callers received corrupted data. Fixed by copying the buffer
   contents into an independent slice before returning.

3. errorCacheCh data race on cancellation: the unbuffered channel with
   defer close() raced with the goroutine's send when the request context
   was cancelled. Fixed by using a buffered channel (capacity 1) and
   removing the defer close. Also fixed the buffer pool lifecycle: the
   buffer is no longer returned to the pool when the goroutine is still
   running (on context cancellation).
@netlify
Copy link

netlify bot commented Feb 6, 2026

Deploy Preview for teal-sprinkles-4c7f14 canceled.

Name Link
🔨 Latest commit cae09ad
🔍 Latest deploy log https://app.netlify.com/projects/teal-sprinkles-4c7f14/deploys/6985f65c9da51600073e5d38

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant