Skip to content

Enable TCP_DEFER_ACCEPT for HTTP listeners on Linux#28617

Open
Jarred-Sumner wants to merge 2 commits intomainfrom
claude/tcp-defer-accept
Open

Enable TCP_DEFER_ACCEPT for HTTP listeners on Linux#28617
Jarred-Sumner wants to merge 2 commits intomainfrom
claude/tcp-defer-accept

Conversation

@Jarred-Sumner
Copy link
Copy Markdown
Collaborator

What

Sets TCP_DEFER_ACCEPT on the listen socket for Bun.serve() (Linux) and SO_ACCEPTFILTER "dataready" (FreeBSD). When the kernel defers accept, the accept loop dispatches the socket as readable immediately instead of returning to epoll first.

Why

For short-lived connections (HTTP/1.1 with Connection: close), the server currently does:

  1. epoll wake → listen socket readable
  2. accept() → register new socket with epoll
  3. return to epoll ← wasted round-trip
  4. epoll wake → new socket readable
  5. recv() → process → respond → close

With TCP_DEFER_ACCEPT, the kernel holds the accept until data arrives, so step 3–4 collapses into step 2. Same pattern nginx uses (rev->ready = 1 after deferred accept).

Works for TLS too — the ClientHello is the first data packet.

Scope

  • Opt-in via LIBUS_LISTEN_DEFER_ACCEPT, only set by uWS HttpContext::listen
  • Bun.listen() / net.createServer() unchanged (may serve server-sends-first protocols)
  • No-op on macOS (no kernel equivalent) and Windows

Test plan

  • test/js/bun/http/serve.test.ts passes (189/189)
  • zig:check-all passes on all platforms
  • Benchmark oha --disable-keepalive on Linux to quantify improvement

For short-lived HTTP/1.1 connections (Connection: close), the server
currently has to return to epoll after accept() just to learn the
request is readable. With TCP_DEFER_ACCEPT, the kernel holds the accept
until the client has actually sent data, so the request (or ClientHello
for TLS) is guaranteed to be in the buffer when accept() returns.

The accept loop now dispatches readable immediately for sockets from a
deferred listener, skipping one epoll round-trip per connection. This is
the same pattern nginx uses (rev->ready = 1 after deferred accept).

The option is opt-in via LIBUS_LISTEN_DEFER_ACCEPT and only set by the
uWS HTTP listener — raw TCP via Bun.listen/net.createServer is unchanged
since those may serve protocols where the server sends first.

SO_ACCEPTFILTER "dataready" is set on FreeBSD for the same effect. No-op
on macOS and Windows.
@robobun
Copy link
Copy Markdown
Collaborator

robobun commented Mar 28, 2026

Updated 2:13 AM PT - Mar 28th, 2026

@robobun, your commit aab3cc9 has 3 failures in Build #42529 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 28617

That installs a local version of the PR into your bun-28617 executable, so you can run:

bun-28617 --bun

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e6d20394-dce1-4cbc-adae-ddac26455f07

📥 Commits

Reviewing files that changed from the base of the PR and between 242f480 and aab3cc9.

📒 Files selected for processing (1)
  • src/bun.js/bindings/FormatStackTraceForJS.cpp

Walkthrough

Added a TCP defer-accept implementation (platform-specific setsockopt + listen/accept handling) and wiring to enable it for HTTP listeners; also changed JS stack-trace setup to eagerly set .stack when error info is already materialized, otherwise preserve lazy accessor behavior.

Changes

Cohort / File(s) Summary
Defer-accept implementation
packages/bun-usockets/src/bsd.c, packages/bun-usockets/src/internal/networking/bsd.h
Added bsd_set_defer_accept(listenFd) and its declaration to apply TCP_DEFER_ACCEPT or SO_ACCEPTFILTER depending on platform.
Listen socket state
packages/bun-usockets/src/internal/internal.h
Added unsigned char deferred_accept to struct us_listen_socket_t to record successful defer-accept configuration.
Listen options / API
packages/bun-usockets/src/libusockets.h
Added LIBUS_LISTEN_DEFER_ACCEPT = 64 listen option to public enum.
Listen initialization
packages/bun-usockets/src/context.c
Initialized ls->deferred_accept = 0 and call bsd_set_defer_accept(listen_socket_fd) when LIBUS_LISTEN_DEFER_ACCEPT is set.
Event-loop accept path
packages/bun-usockets/src/loop.c
When listen_socket->deferred_accept is set, immediately dispatch a LIBUS_SOCKET_READABLE poll for the newly accepted socket to process readable data without waiting for a later event-loop round.
HTTP integration
packages/bun-uws/src/HttpContext.h
Or-in LIBUS_LISTEN_DEFER_ACCEPT into listener options used by HttpContext<SSL>::listen.
JS stack-trace behavior
src/bun.js/bindings/FormatStackTraceForJS.cpp
When an ErrorInstance already has materialized error info, eagerly compute and set the .stack property; otherwise keep previous lazy accessor installation path.
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: enabling TCP_DEFER_ACCEPT for HTTP listeners on Linux, which is the primary objective of this PR.
Description check ✅ Passed The description covers both required sections: 'What' explains the feature with platform details, and 'Why' provides clear rationale with nginx comparison. Implementation scope and test plan are also included.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/bun-usockets/src/bsd.c`:
- Around line 990-1006: The strcpy call in bsd_set_defer_accept uses a fixed
literal for afa.af_name; replace it with a bounded copy to avoid any future
overflow risk (use a safe alternative like strncpy/strlcpy or memcpy) when
populating struct accept_filter_arg afa.af_name, keeping the zeroed memset and
preserving the "dataready" value and sizeof(afa) passed to setsockopt; ensure
the chosen function always NUL-terminates afa.af_name after copy.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cfb2b43b-24d5-417c-a6f2-c221b4602182

📥 Commits

Reviewing files that changed from the base of the PR and between 2e610b1 and 242f480.

📒 Files selected for processing (7)
  • packages/bun-usockets/src/bsd.c
  • packages/bun-usockets/src/context.c
  • packages/bun-usockets/src/internal/internal.h
  • packages/bun-usockets/src/internal/networking/bsd.h
  • packages/bun-usockets/src/libusockets.h
  • packages/bun-usockets/src/loop.c
  • packages/bun-uws/src/HttpContext.h

Comment on lines +990 to +1006
int bsd_set_defer_accept(LIBUS_SOCKET_DESCRIPTOR listenFd) {
#if defined(TCP_DEFER_ACCEPT)
/* nginx uses 1 second here: there's no way to know how long a connection sat in the
* defer queue (or whether it bypassed it via syncookies), so a short timeout lets the
* application's own idle timeout take over for clients that connect but never send. */
int timeout = 1;
return setsockopt(listenFd, IPPROTO_TCP, TCP_DEFER_ACCEPT, &timeout, sizeof(timeout)) == 0;
#elif defined(SO_ACCEPTFILTER)
struct accept_filter_arg afa;
memset(&afa, 0, sizeof(afa));
strcpy(afa.af_name, "dataready");
return setsockopt(listenFd, SOL_SOCKET, SO_ACCEPTFILTER, &afa, sizeof(afa)) == 0;
#else
(void) listenFd;
return 0;
#endif
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Confirm bsd_set_defer_accept is used in context.c

rg -n 'bsd_set_defer_accept' --type c

Repository: oven-sh/bun

Length of output: 377


🏁 Script executed:

rg -n 'struct accept_filter_arg' --type c -A 5

Repository: oven-sh/bun

Length of output: 493


🏁 Script executed:

head -50 packages/bun-usockets/src/bsd.c | grep -n "include"

Repository: oven-sh/bun

Length of output: 226


🏁 Script executed:

sed -n '990,1010p' packages/bun-usockets/src/bsd.c

Repository: oven-sh/bun

Length of output: 1013


🏁 Script executed:

# Check if there are any system header references or struct definitions visible
rg -n "accept_filter_arg" packages/bun-usockets/src/ -B 5 -A 5

Repository: oven-sh/bun

Length of output: 1004


🏁 Script executed:

# Look for any BSD-specific includes or defines around SO_ACCEPTFILTER
sed -n '950,1010p' packages/bun-usockets/src/bsd.c | head -70

Repository: oven-sh/bun

Length of output: 2202


🌐 Web query:

accept_filter_arg struct af_name buffer size BSD FreeBSD

💡 Result:

The struct accept_filter_arg in FreeBSD (and BSD systems like NetBSD) is defined in sys/sys/socket.h as: struct accept_filter_arg { char af_name[16]; char af_arg[256-16]; }; The af_name field is a fixed-size char array buffer of 16 bytes for the accept filter name. The af_arg field is a char array of 240 bytes (256 - 16 = 240) for filter-specific arguments. The total struct size is 256 bytes. This structure is used with setsockopt(2) and SO_ACCEPTFILTER to attach an accept filter to a listening socket. Examples from accf_http(9) and accf_dns(9) man pages show usage like: struct accept_filter_arg afa; bzero(&afa, sizeof(afa)); strcpy(afa.af_name, "httpready"); // or "dnsready" setsockopt(sok, SOL_SOCKET, SO_ACCEPTFILTER, &afa, sizeof(afa)); The accept filter mechanism originated in FreeBSD 4.0 and is used to preprocess incoming connections in the kernel before passing to accept(2). It remains in current FreeBSD versions (e.g., FreeBSD 15/16 as of 2024 man pages).

Citations:


Implementation is correct; consider using safer string copy for defensive coding.

The function correctly handles:

  • Linux with TCP_DEFER_ACCEPT (1-second timeout matching nginx)
  • FreeBSD with SO_ACCEPTFILTER "dataready"
  • Other platforms return 0 (feature disabled)

The strcpy on line 1000 is safe because afa is zeroed and "dataready" (9 characters + null) fits in af_name (16 bytes). However, for defense-in-depth:

💡 Optional: Use bounded string copy
 `#elif` defined(SO_ACCEPTFILTER)
     struct accept_filter_arg afa;
     memset(&afa, 0, sizeof(afa));
-    strcpy(afa.af_name, "dataready");
+    strncpy(afa.af_name, "dataready", sizeof(afa.af_name) - 1);
     return setsockopt(listenFd, SOL_SOCKET, SO_ACCEPTFILTER, &afa, sizeof(afa)) == 0;

The function is properly called from context.c:406 when LIBUS_LISTEN_DEFER_ACCEPT is set.

🧰 Tools
🪛 Cppcheck (2.20.0)

[style] 990-990: The function 'bsd_set_defer_accept' is never used.

(unusedFunction)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bun-usockets/src/bsd.c` around lines 990 - 1006, The strcpy call in
bsd_set_defer_accept uses a fixed literal for afa.af_name; replace it with a
bounded copy to avoid any future overflow risk (use a safe alternative like
strncpy/strlcpy or memcpy) when populating struct accept_filter_arg afa.af_name,
keeping the zeroed memset and preserving the "dataready" value and sizeof(afa)
passed to setsockopt; ensure the chosen function always NUL-terminates
afa.af_name after copy.

@github-actions
Copy link
Copy Markdown
Contributor

Found 7 issues this PR may fix:

  1. uWebSockets.js is faster by about ~3000 req/s than Bun.serve #8643 - uWebSockets.js is faster by ~3000 req/s than Bun.serve; TCP_DEFER_ACCEPT reduces accept() syscalls and event loop iterations for better throughput
  2. Wildly bad networking performance with fetch in bun vs nodejs when switching between similar endpoints #9034 - Wildly bad networking performance with fetch vs nodejs; eliminates wasted cycles on empty accepts and reduces HTTP request latency
  3. Serious latency issues in Bun #26139 - Serious latency issues in Bun; reduces time between accept() and first data availability, lowering request processing latency
  4. Bun feature request: per-worker pause/resume of accepts on shared sockets #22016 - Bun feature request: per-worker pause/resume of accepts on shared sockets; reduces frequency of accept events for more efficient load balancing across workers
  5. [1.2.11] - performance drop on this release #19363 - Performance drop in version 1.2.11; could mitigate performance issues if they're related to increased socket handling overhead
  6. BUN 182x SLOWER than NodeJS with MSSQL consecutive Pool Requests #13093 - BUN 182x SLOWER than NodeJS with MSSQL consecutive Pool Requests; reducing accept overhead could help if slowdown involves HTTP API calls
  7. TCP Socket context data is unexpectedly shared among all sockets #25357 - TCP Socket context data is unexpectedly shared among all sockets; more efficient socket handling could reduce context sharing problems

If this is helpful, consider adding Fixes #<number> to the PR description to auto-close the issue on merge.

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR makes a meaningful behavioral change to how all HTTP connections are accepted in Bun, touching low-level networking code across 7 files — it warrants a human review given the complexity of the recursive dispatch path and the unverified SSL code flow.

Extended reasoning...

Overview

The PR adds TCP_DEFER_ACCEPT (Linux) / SO_ACCEPTFILTER (FreeBSD) to all HTTP listen sockets via a new LIBUS_LISTEN_DEFER_ACCEPT flag. Changes span: a new bsd_set_defer_accept() helper in bsd.c, a deferred_accept field added to us_listen_socket_t, flag propagation in context.c, and an immediate recursive call to us_internal_dispatch_ready_poll() inside the accept loop in loop.c. HttpContext::listen unconditionally OR-s the new flag, affecting every Bun.serve() HTTP/HTTPS server.

Security risks

No direct security risks. TCP_DEFER_ACCEPT with a 1-second timeout (matching nginx) is a well-understood kernel feature. The timeout prevents SYN-flood half-open connections from occupying queue slots indefinitely. No auth, crypto, or permission surface is touched.

Level of scrutiny

High scrutiny is warranted. This is not a config or comment change — it changes the accept-loop control flow for every HTTP server in Bun. Key concerns: (1) The recursive us_internal_dispatch_ready_poll() call from within the POLL_TYPE_SEMI_SOCKET branch adds a new re-entrancy pattern that needs careful review. (2) The SSL code path (us_internal_ssl_socket_context_listen) is not shown in the diff; it is unclear whether the deferred_accept flag is correctly propagated to the underlying listen socket for TLS servers. (3) The interaction with the low-priority SSL handshake queue and socket adoption (flags.adopted / s->prev) after the deferred dispatch adds complexity. (4) The benchmark (oha --disable-keepalive) is explicitly listed as not yet done in the test plan.

Other factors

No bugs were found by the automated system, and the 189/189 serve.test.ts pass rate is reassuring. The implementation mirrors nginx's approach (rev->ready = 1) and the comments are thorough. However, given the scope (all HTTP servers on Linux/FreeBSD), the unfinished benchmark, and the un-audited SSL path, a human reviewer should sign off before this ships.

Error.captureStackTrace called setStackFrames() after
materializeErrorInfoIfNeeded(), leaving m_errorInfoMaterialized=true
with a non-null m_stackTrace. When GC's finalizeUnconditionally found
unmarked frames it called computeErrorInfo() which asserts
!m_errorInfoMaterialized, causing SIGILL under ASAN.

When error info is already materialized, eagerly compute and set the
.stack property instead of storing new frames via setStackFrames().
Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR bundles two unrelated changes — a TCP_DEFER_ACCEPT networking optimization and a JSC GC-assertion fix — both of which touch critical, production-path code. A human reviewer should validate the recursive us_internal_dispatch_ready_poll call in the accept loop and the hasMaterializedErrorInfo branch in errorConstructorFuncCaptureStackTrace before merging.

Extended reasoning...

Overview

The PR modifies eight files across two independent concerns: (1) enabling TCP_DEFER_ACCEPT / SO_ACCEPTFILTER on HTTP listen sockets to skip an unnecessary epoll round-trip on accept, and (2) fixing a JSC GC assertion that fires when Error.captureStackTrace is called on an already-materialized ErrorInstance.

Security risks

No apparent security risks. TCP_DEFER_ACCEPT is opt-in and scoped exclusively to HttpContext::listen; Bun.listen() and net.createServer() are not affected. The flag is gated by successful setsockopt() return, so a kernel that does not support it degrades gracefully. The JSC change is internal error bookkeeping with no trust-boundary implications.

Level of scrutiny

Both changes warrant careful human review despite no bugs being found:

Networking (loop.c): After on_open fires, the code now makes a recursive call to us_internal_dispatch_ready_poll with LIBUS_SOCKET_READABLE. This re-enters the same dispatch function under a different poll type (POLL_TYPE_SOCKET vs POLL_TYPE_SEMI_SOCKET), which is architecturally unusual. The comment acknowledges EWOULDBLOCK as a rare-but-possible outcome (defer timeout). Under high connection rates, the stack depth of this recursion needs human validation.

JSC (FormatStackTraceForJS.cpp): The fix introduces a new branch for hasMaterializedErrorInfo() that eagerly computes and sets the stack property. This path bypasses the lazy-getter mechanism that the rest of the error-stack machinery relies on. While the logic appears correct for preventing the GC assertion, JSC internals around ErrorInstance lifecycle are subtle enough to warrant human sign-off.

Other factors

The benchmark item in the test plan is unchecked ('[ ] Benchmark oha --disable-keepalive on Linux'). The two changes are conceptually unrelated and would be easier to review independently. CodeRabbit's only comment was a trivial strcpy nitpick (safe as written). No prior human or bot reviews exist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants