Skip to content

http: support custom socket-producing agents (tunnel) in node:http client#31798

Closed
robobun wants to merge 2 commits into
mainfrom
farm/c7cb06d7/http-tunnel-agent
Closed

http: support custom socket-producing agents (tunnel) in node:http client#31798
robobun wants to merge 2 commits into
mainfrom
farm/c7cb06d7/http-tunnel-agent

Conversation

@robobun

@robobun robobun commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

What

Makes node:http / node:https honor custom socket-producing agents — in particular the tunnel package, which establishes an HTTP CONNECT tunnel through a proxy.

Fixes #31795.

Repro

proxy.mjs + client.mjs from the issue (a tunnel.httpsOverHttp() agent pointed at a local proxy):

  • Before: the proxy never received a CONNECT; Bun connected directly to example.com:443 and returned 200/403, ignoring the agent.
  • After (matches Node): Bun sends CONNECT example.com:443 HTTP/1.1 to the proxy, and when the proxy replies 502 the request errors with tunneling socket could not be established, statusCode=502.

Verified byte-for-byte against Node v24 (the CONNECT line, headers, and the error all match).

Cause

ClientRequest dispatched every request through the native fetch (nodeHttpClient) and only ever read agent.proxy as an href (for https-proxy-agent / proxy-agents). A user agent's addRequest / createSocket / onSocket hooks were never invoked, so tunnel — which performs a CONNECT by overriding addRequest and calling req.onSocket(socket) with the tunneled socket — was bypassed entirely.

Fix

In src/js/node/_http_client.ts, detect when an agent overrides addRequest and exposes no .proxy href (so https-proxy-agent / http-proxy-agent keep using the native fetch proxy path), and drive the request over the socket the agent produces instead of fetch. This mirrors node:http's socket path:

  • call agent.addRequest(this, options) at the end of the constructor;
  • implement ClientRequest.onSocket(socket, err)tickOnSocket / socketOnData / parserOnIncomingClient, wiring an HTTPParser (RESPONSE mode) and the socket listeners, then writing the request line + headers + body over the socket;
  • add IncomingMessage._addHeaderLines / _addHeaderLine so the parser can build the response headers for the socket-driven path.

tunnel's inner CONNECT request (to the proxy) relies on the HTTP CONNECT method support added in #31574, whose commits this branch includes. If #31574 lands first I'll rebase this down to just the custom-agent layer.

Tests

test/js/node/http/node-http.test.ts — added to the existing node:http suite (runs in both Bun and Node):

  • tunnel.httpsOverHttp() sends a CONNECT to the proxy and surfaces the proxy's 502 (the exact issue scenario);
  • tunnel.httpOverHttp() tunnels a request end-to-end through a real CONNECT proxy and receives the target's response, asserting the proxy's connect handler actually ran (so a direct connection can't pass the test).

Both fail on the released binary and pass with this change; behavior was compared directly against Node v24.

…ient

node:http / node:https dispatched every request through native fetch and
never invoked a user agent's addRequest/createSocket/onSocket hooks. Agents
like the `tunnel` package — which establish an HTTP CONNECT tunnel by
overriding addRequest and handing back the tunneled socket via
req.onSocket() — were silently ignored, so Bun connected directly to the
target and bypassed the proxy entirely.

Detect when an agent overrides addRequest (and exposes no `.proxy` href, so
https-proxy-agent / http-proxy-agent keep using the native fetch proxy path)
and drive the request over the socket the agent produces, speaking HTTP/1.x
with the existing HTTPParser the way node:http does (tickOnSocket /
socketOnData / parserOnIncomingClient). Adds ClientRequest.onSocket() and
IncomingMessage._addHeaderLines/_addHeaderLine for the socket-driven path.

Builds on the CONNECT-method support from #31574, which the tunnel package's
inner CONNECT request relies on.
@robobun

robobun commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 2:20 AM PT - Jun 4th, 2026

@autofix-ci[bot], your commit 2ffde14 has 2 failures in Build #60289 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 31798

That installs a local version of the PR into your bun-31798 executable, so you can run:

bun-31798 --bun

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 503b4667-363d-4a36-b37b-c2b3fc89cdfe

📥 Commits

Reviewing files that changed from the base of the PR and between a3464c6 and 2ffde14.

⛔ Files ignored due to path filters (1)
  • test/bun.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • src/js/internal/http.ts
  • src/js/node/_http_client.ts
  • src/js/node/_http_incoming.ts
  • test/js/node/http/node-http-connect.test.ts
  • test/js/node/http/node-http.test.ts
  • test/package.json

Walkthrough

This PR adds HTTP CONNECT proxy support to Bun's Node.js HTTP client. The implementation enables custom agents (like tunnel) to supply their own sockets for proxied connections, includes full CONNECT request/response negotiation, and is validated with comprehensive tests using the tunnel library.

Changes

HTTP CONNECT Proxy Support

Layer / File(s) Summary
Custom Socket Path Symbol Definition
src/js/internal/http.ts
Introduces kCustomSocketPath Symbol to distinguish custom agent-produced sockets from the native fetch path.
IncomingMessage Header Construction
src/js/node/_http_incoming.ts
Extends IncomingMessage to initialize headers/rawHeaders and adds _addHeaderLines/_addHeaderLine methods to parse flat HTTPParser header arrays with set-cookie aggregation and duplicate folding.
HTTP Client Infrastructure Imports
src/js/node/_http_client.ts
Imports Agent, parser utilities, and internal symbols to enable custom socket dispatch and CONNECT handling.
Socket-Driven Request/Response Path
src/js/node/_http_client.ts
Implements socket event handlers, parser wiring, request head serialization, and _flushToSocket() to write request data directly to agent-produced sockets, including onSocketNT attachment logic.
HTTP CONNECT Proxy Implementation
src/js/node/_http_client.ts
Adds startConnect() to perform proxy CONNECT negotiation over raw TCP/TLS: parses status and headers, builds IncomingMessage, emits connect event, and enforces header limits. Updates startFetch() to dispatch CONNECT requests to this new path.
ClientRequest Constructor & Public API
src/js/node/_http_client.ts
Detects custom agent sockets by checking agent.addRequest override, calls agent.addRequest() immediately, and adds onSocket(socket, err) public method for agents to deliver sockets. Integrates with send() to use _flushToSocket() for custom socket paths.
HTTP CONNECT Test Suite
test/js/node/http/node-http-connect.test.ts
Comprehensive test block covering CONNECT semantics: tunnel byte echo, header folding, head delivery, buffering, error/close ordering, parse/size validation, destroy handling, and custom DNS lookup for proxies.
Tunnel Library Integration Tests
test/js/node/http/node-http.test.ts, test/package.json
Adds integration tests for tunnel.httpsOverHttp() and tunnel.httpOverHttp() via HTTP CONNECT proxies, plus tunnel 0.0.6 dependency.

Possibly related issues

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding support for custom socket-producing agents (like tunnel) in the node:http client.
Description check ✅ Passed The description comprehensively covers both required sections: What (the feature, fixes, repro, cause, and fix) and How (detailed test descriptions with verification against Node v24).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. http: support CONNECT method in node:http client #31574 - Implements the same HTTP CONNECT method support and socket-driven client path; http: support custom socket-producing agents (tunnel) in node:http client #31798 explicitly includes http: support CONNECT method in node:http client #31574's commits and builds the custom-agent layer on top of it
  2. node:http: rewrite the client on net/tls + llhttp (Node http suite: ~55% → 82.3%, +182 vendored tests, client proxy support) #31587 - Broader node:http client rewrite on net/tls + llhttp that subsumes this PR's functionality, including CONNECT tunneling, custom agent support, and client proxy support

🤖 Generated with Claude Code

@robobun

robobun commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator Author

Closing in favor of #31587, which rewrites the whole node:http client as a faithful port of Node's lib/_http_client.js on net/tls + llhttp. That PR drives requests through the real upstream Agent.addRequestcreateSocketreq.onSocket() path (_http_client.ts L335 + onSocket/tickOnSocket/_flush), which is exactly the mechanism the tunnel package needs — so it fixes #31795 as part of a much broader, maintainer-authored port (also subsuming the CONNECT work in #31574).

This PR was a narrower, targeted layer on top of #31574's CONNECT support; it's redundant next to the full port. I've offered the tunnel.httpsOverHttp() / httpOverHttp() regression tests from here to #31587 so the exact issue scenario stays covered.

Comment on lines +387 to +391
if (!hasContentLength && !hasTransferEncoding) {
const chunks = req[kBodyChunks];
let bodyLength = 0;
if (chunks) for (let i = 0; i < chunks.length; i++) bodyLength += chunks[i].length;
if (bodyLength > 0) head += `Content-Length: ${bodyLength}${CRLF}`;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The auto-computed Content-Length sums chunks[i].length, but body chunks buffered here can be JS strings (write_() keeps UTF-8 strings as-is), and String.length counts UTF-16 code units rather than the UTF-8 bytes that socket.write(chunk) will emit. A POST through a tunnel agent with a non-ASCII body (e.g. req.write('héllo')) therefore sends Content-Length: 5 followed by 6 bytes, producing a malformed request — use Buffer.byteLength(chunks[i]) instead.

Extended reasoning...

What the bug is

In the new socket-driven custom-agent path, buildRequestHead() auto-fills Content-Length when the user hasn't set one and the body has been buffered before the socket arrived. It computes the length as:

if (chunks) for (let i = 0; i < chunks.length; i++) bodyLength += chunks[i].length;

But entries in req[kBodyChunks] can be JavaScript strings, not just Buffers. For a string, .length is the number of UTF-16 code units, which for any non-ASCII content is smaller than the number of bytes socket.write(chunk) will actually emit (since socket.write defaults to UTF-8 encoding when no encoding is passed).

How a string ends up in kBodyChunks

In write_():

const canSkipReEncodingData =
  (typeof chunk === "string" && (encoding === "utf-8" || encoding === "utf8" || !encoding)) ||
  ($isTypedArrayView(chunk) && (...));
if (!canSkipReEncodingData) {
  chunk = Buffer.from(chunk, encoding);
}

So a UTF-8 string chunk is left as a string. Then pushChunk() for the custom-socket path does:

if (this[kCustomSocketPath]) {
  if (this[kHandle]) { ... } else {
    this[kBodyChunks].push(chunk);   // raw string buffered
  }
  return;
}

Since agent.addRequest() initiates an async CONNECT and onSocket() fires on a later tick, this[kHandle] is not yet set when the user calls req.write(...) / req.end(...) synchronously, so the string is buffered. When the socket arrives, _flushToSocket() calls buildRequestHead(this) (which sums .length) and then socket.write(chunks[i]) with no encoding argument — i.e. UTF-8.

Step-by-step proof

  1. const agent = tunnel.httpOverHttp({ proxy: ... })
  2. const req = http.request({ method: 'POST', host, port, path: '/', agent })kCustomSocketPath = true, agent.addRequest() kicks off an async CONNECT.
  3. req.write('héllo')canSkipReEncodingData is true (string, default encoding), chunk stays 'héllo'; kHandle is unset, so kBodyChunks = ['héllo'].
  4. req.end()finished = true; kHandle still unset, so nothing is flushed yet.
  5. Tunnel completes, agent calls req.onSocket(socket)onSocketNTtickOnSocketreq._flushToSocket().
  6. buildRequestHead() sees no user Content-Length/Transfer-Encoding, computes bodyLength = 'héllo'.length === 5, and writes Content-Length: 5\r\n.
  7. _flushToSocket() then calls socket.write('héllo') which emits the UTF-8 bytes 68 c3 a9 6c 6c 6f6 bytes.
  8. The server receives a request whose declared body length is 5 but whose body is 6 bytes.

Why nothing prevents it

The only re-encoding happens in write_(), and it deliberately skips UTF-8 strings. buildRequestHead() has no special-casing for string chunks, and _flushToSocket() writes them with the default (UTF-8) encoding. Node.js's OutgoingMessage uses Buffer.byteLength() for exactly this reason.

Impact

The HTTP request is malformed: the server reads 5 bytes of body and then sees a stray byte. Depending on the server it will truncate the body (data corruption), reject the request with a 400, or — on a keep-alive connection — try to parse the trailing byte as the start of the next request line (request smuggling / desync risk). This affects any POST/PUT through a custom socket-producing agent (e.g. tunnel) whose body contains non-ASCII characters and doesn't set an explicit Content-Length or Transfer-Encoding.

Fix

Use byte length instead of .length:

if (chunks)
  for (let i = 0; i < chunks.length; i++)
    bodyLength += typeof chunks[i] === "string" ? Buffer.byteLength(chunks[i]) : chunks[i].length;

(or unconditionally Buffer.byteLength(chunks[i]), which handles both strings and Buffers).

Comment on lines +322 to +336
_addHeaderLine(field, value, dest) {
const lower = field.toLowerCase();
const existing = dest[lower];

if (lower === "set-cookie") {
if (existing !== undefined) {
existing.push(value);
} else {
dest[lower] = [value];
}
return;
}

if (existing === undefined) {
dest[lower] = value;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 When the parser delivers a header whose lowercased name collides with an Object.prototype key (e.g. Constructor, toString, valueOf), dest[lower] reads the inherited function instead of undefined, so the first occurrence falls into the comma-join branch and res.headers.constructor becomes 'function Object() { [native code] }, value'. Node's _addHeaderLine avoids this by checking typeof dest[field] === 'string' before joining — mirroring that check (or using Object.hasOwn) fixes it without changing the headers object's prototype.

Extended reasoning...

What the bug is

IncomingMessage.prototype._addHeaderLine (added in this PR for the socket-driven client path) folds parser-delivered headers into this.headers, which is a plain {} (matching Node — Bun's own test at node-http.test.ts asserts req.headers.__proto__ === {}.__proto__). The function reads:

const existing = dest[lower];
...
if (existing === undefined) {
  dest[lower] = value;
  return;
}
...
default:
  dest[lower] = existing + ", " + value;

Because dest inherits from Object.prototype, dest['constructor'] is the Object function, dest['tostring'] is undefined but dest['toString'] would be a function — and crucially, after lowercasing, 'constructor', 'valueof', 'hasownproperty', 'isprototypeof', 'propertyisenumerable', 'tolocalestring' all resolve to inherited members rather than undefined. (Of those, only 'constructor' survives lowercasing exactly — but 'constructor' is the realistic one and is even covered by this PR's own CONNECT-path test.)

Code path

  1. A request goes through a custom socket-producing agent (e.g. tunnel), so this[kCustomSocketPath] = true and tickOnSocket wires up an HTTPParser in RESPONSE mode.
  2. parserOnHeadersComplete (in _http_common.ts) constructs new IncomingMessage(socket), which takes the new branch in _http_incoming.ts that sets this.headers = {}.
  3. The parser calls incoming._addHeaderLines(headers, n)_addHeaderLine('Constructor', 'own', this.headers).
  4. lower = 'constructor'; existing = dest['constructor'] is Object (the inherited constructor), not undefined.
  5. 'constructor' is not 'set-cookie' and is not in the singleton switch list, so the default branch runs: dest['constructor'] = Object + ', ' + 'own''function Object() { [native code] }, own'.

Why existing code doesn't prevent it

The function checks existing === undefined to decide "first occurrence". On a regular-prototype object that test is false for any key that shadows an Object.prototype member. The CONNECT fast path in _http_client.ts (also added in this PR) explicitly avoids this by using { __proto__: null } for parsedHeaders and even has a test asserting headers['constructor'] === 'own' — but _addHeaderLine operates on this.headers, which intentionally keeps Object.prototype for Node compatibility, so the same === undefined guard is insufficient there.

Step-by-step proof

With a tunneled response of:

HTTP/1.1 200 OK\r\nConstructor: own\r\n\r\n
  • field = 'Constructor', value = 'own', dest = {}.
  • lower = 'constructor'.
  • existing = dest['constructor'][Function: Object] (inherited).
  • lower !== 'set-cookie'; existing !== undefined; switch has no 'constructor' case → default: dest['constructor'] = String(Object) + ', own'.
  • Bun result: res.headers.constructor === 'function Object() { [native code] }, own'.
  • Node result: Node's _addHeaderLine checks if (typeof dest[field] === 'string') before joining; typeof Object === 'function', so it takes the "first occurrence" branch and assigns dest['constructor'] = 'own'.

Note that two parts of the original report were inaccurate and are not divergences: (a) Node's IncomingMessage.headers is a regular object, not null-prototype; (b) a __proto__: x header is silently dropped in both Node and Bun (assigning a string via the __proto__ setter no-ops). Only the constructor-style collision actually differs.

Impact

A response header literally named Constructor (or another Object.prototype key that survives lowercasing) is corrupted on the custom-agent / tunnel path, diverging from Node. This is an extreme edge case — such header names essentially never appear in real HTTP — hence nit.

Fix

Match Node's guard — replace the existing === undefined test with a string-type check (or own-property check), e.g.:

if (typeof existing !== 'string') {
  dest[lower] = lower === 'set-cookie' ? [value] : value;
  return;
}

(or keep the current structure and just swap existing === undefinedtypeof existing !== 'string' in the non-set-cookie branch). Do not make this.headers null-prototype — that would break the existing req.headers.__proto__ === {}.__proto__ test and diverge from Node.

Comment on lines +379 to +382
if (!hasHost) {
const host = req[kHost];
head += `Host: ${req[kUseDefaultPort] ? host : `${host}:${req[kPort]}`}${CRLF}`;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 When no Host header is set, buildRequestHead() emits Host: ${host}:${port} without bracketing IPv6 literals, so a request like http.request({ host: '::1', port: 8080, agent: tunnelAgent }) sends Host: ::1:8080 instead of Host: [::1]:8080. Node brackets IPv6 in the default Host header (see the commented-out reference block in this file), and this PR's own startConnect() path does it via isIPv6(hostHeader) — the same check should be applied here.

Extended reasoning...

What the bug is

buildRequestHead() serializes the request line and headers that are written over a custom agent's socket (the new socket-driven path). When the caller has not set a Host header, it synthesizes one:

if (!hasHost) {
  const host = req[kHost];
  head += `Host: ${req[kUseDefaultPort] ? host : `${host}:${req[kPort]}`}${CRLF}`;
}

req[kHost] is the raw options.host / options.hostname value. When that is an IPv6 literal such as '::1' or 'fe80::1', this produces Host: ::1:8080. RFC 7230 §5.4 / RFC 3986 §3.2.2 require IPv6 literals in an authority to be enclosed in square brackets so the port colon is unambiguous: the correct header is Host: [::1]:8080 (or Host: [::1] for the default port).

Code path that triggers it

This is reachable on the new custom-agent path introduced by this PR. The constructor sets this[kCustomSocketPath] = true when the agent overrides addRequest and exposes no .proxy href, and stores the unmodified host in this[kHost]. When the agent later calls req.onSocket(socket), onSocketNT → tickOnSocket → req._flushToSocket() runs socket.write(buildRequestHead(this), 'latin1'), hitting the !hasHost branch above whenever the user did not set an explicit Host header.

Why nothing else prevents it

Bun's normal client path goes through native fetch, which constructs the URL itself (and brackets IPv6 in getURL). The startConnect() path added in this same PR explicitly handles it:

let hostHeader = this[kHost];
if (isIPv6(hostHeader)) {
  hostHeader = `[${hostHeader}]`;
}

But buildRequestHead() — the third place that emits a default Host header — has no equivalent check. The commented-out Node-derived block in this file ("For the Host header, ensure that IPv6 addresses are enclosed in square brackets…") confirms Node's _storeHeader does bracket IPv6, so this is also a Node-compat divergence.

Step-by-step proof

  1. User calls http.request({ host: '::1', port: 8080, agent: tunnel.httpOverHttp({ proxy: {...} }) }) with no headers.host.
  2. The constructor: this[kHost] = '::1', this[kPort] = 8080, this[kUseDefaultPort] = false (8080 ≠ 80). tunnel's agent overrides addRequest and has no .proxy href, so this[kCustomSocketPath] = true and agent.addRequest(this, opts) is invoked.
  3. tunnel establishes a CONNECT tunnel to the proxy and calls req.onSocket(tunneledSocket).
  4. onSocketNT → _flushToSocket → buildRequestHead(this) runs. req.getHeaders() has no host, so hasHost stays false.
  5. The default-Host branch executes: host = '::1', req[kUseDefaultPort] is false, so it emits Host: ::1:8080\r\n.
  6. Wire bytes: GET / HTTP/1.1\r\nHost: ::1:8080\r\nConnection: close\r\n\r\n. A strict origin server (or anything that parses the authority per RFC 3986) may reject this with 400, and even with the default port the unbracketed Host: ::1 diverges from Node's output.

Impact

Limited to the new custom-agent socket path with an IPv6-literal target host and no explicit Host header — uncommon, but a real spec violation and an internal inconsistency with both Node and startConnect() in this same PR.

Fix

Mirror startConnect():

if (!hasHost) {
  let host = req[kHost];
  if (isIPv6(host)) host = `[${host}]`;
  head += `Host: ${req[kUseDefaultPort] ? host : `${host}:${req[kPort]}`}${CRLF}`;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bun ignores tunnel.httpsOverHttp() and does not emit HTTP CONNECT proxy requests (Node mismatch)

1 participant