Skip to content

Conversation

FelixVaughan
Copy link
Contributor

This relates to...

#4444

Rationale

Preserve header shape and correct Host after DNS resolution to avoid routing issues/header corruption

Changes

Added addHostHeader to dns interceptor and parameterized tests for different header types.

Bug Fixes

Fixed DNS interceptor turning [string, string][]/iterables into numeric-key headers

Status

const { InvalidArgumentError, InformationalError } = require('../core/errors')
const maxInt = Math.pow(2, 31) - 1

function addHostHeader (headers, host) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, you don't need to parse the headers by yourself, this is done by undici automatically, we just need to standardise the way that these are forward to undici.

If Array, push host to the headers array, if an object, attach host as a new property

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified the code!

A header case I'm slightly worried about though is non-array iterables.

request({ origin, path: '/', method: 'GET', headers: new Set([['f0', 'b0'], ['f1', 'b1']]) }).

These headers are valid under UndiciHeaders and a request like this (though rare) is possible; but the proposed array/object check would break this.

Another issue I've noticed is [string,string][] headers always raise an error due to the way header processing is handled in Request.js. Say headers = [['f0', 'b0']] or [['f0', 'b0'], ['f1', 'b1']]

if (Array.isArray(headers)) {
      if (headers.length % 2 !== 0) {
        throw new InvalidArgumentError('headers array must be even')
      }
      for (let i = 0; i < headers.length; i += 2) {
        processHeader(this, headers[i], headers[i + 1])
      }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These headers are valid under UndiciHeaders and a request like this (though rare) is possible; but the proposed array/object check would break this.

Usually these are iterables, and we iterate over them, but not use utility functions; if that's the case, we can try to also check for Symbol.iterator to decide what to do with them.

Say headers = [['f0', 'b0']] or [['f0', 'b0'], ['f1', 'b1']]

That's somehow expected, we only expect them in the format [string, string], anything outside should throw

Copy link
Contributor Author

@FelixVaughan FelixVaughan Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handling iterators might mean we can't avoid some normalization. However this can be streamlined with the normalizeHeaders function also used in cache.js

function normalizeHeaders (opts) {

diff --git a/lib/interceptor/dns.js b/lib/interceptor/dns.js
index 38287607..a878f39f 100644
--- a/lib/interceptor/dns.js
+++ b/lib/interceptor/dns.js
@@ -4,6 +4,15 @@ const { lookup } = require('node:dns')
 const DecoratorHandler = require('../handler/decorator-handler')
 const { InvalidArgumentError, InformationalError } = require('../core/errors')
 const maxInt = Math.pow(2, 31) - 1
+const { normalizeHeaders } = require('../util/cache.js')
+
+function addHostHeader (opts, host) {
+  if (Array.isArray(opts.headers)) {
+    return ['host', host, ...opts.headers]
+  }
+  const headers = normalizeHeaders(opts)
+  return { host, ...headers }
+}
 
 class DNSInstance {
   #maxTTL = 0
@@ -411,10 +420,7 @@ module.exports = interceptorOpts => {
           ...origDispatchOpts,
           servername: origin.hostname, // For SNI on TLS
           origin: newOrigin.origin,
-          headers: {
-            host: origin.host,
-            ...origDispatchOpts.headers
-          }
+          headers: addHostHeader(origDispatchOpts, origin.host)
         }
 
         dispatch(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... that's a good catch; I'd suggest to move the normalize headers to core/utils and implement it within the DNS interceptor first, and we can check if we wanna do the same for the core request in a further issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... that's a good catch; I'd suggest to move the normalize headers to core/utils and implement it within the DNS interceptor first, and we can check if we wanna do the same for the core request in a further issue

This makes sense to me

function addHostHeader (headers = {}, host) {
if (Array.isArray(headers)) {
const header = ['host', host]
if (Array.isArray(headers[0])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really support them? They should throw if request receives a multi-dimensional array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They do throw for arrays, just not other iterables

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, let's do this: #4498 (comment)

@FelixVaughan FelixVaughan changed the title fix(dns): preserve header shape for [string,string][] and ensure Host fix(dns): preserve header shape and ensure Host Sep 4, 2025
}
return [...header, ...headers]
function addHostHeader (opts, host) {
if (Array.isArray(opts.headers)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normalizeHeaders will normalize it to a plain JS object, feel free to append it once normalized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for flat arrays for some reason. Refactored to accept types of [k1, v1, k2, v2, ...]

@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.56%. Comparing base (2a72563) to head (daf988c).
⚠️ Report is 671 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4498      +/-   ##
==========================================
- Coverage   94.17%   93.56%   -0.61%     
==========================================
  Files          90      103      +13     
  Lines       24559    32366    +7807     
==========================================
+ Hits        23128    30283    +7155     
- Misses       1431     2083     +652     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

lib/core/util.js Outdated
if (typeof src[Symbol.iterator] === 'function') {
const msg = 'opts.headers is not a valid header map'
for (const s of src) {
if (!Array.isArray(s)) throw new Error(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's an Undici error that is usable, let's make usage of it

@FelixVaughan FelixVaughan marked this pull request as ready for review September 7, 2025 15:04
lib/core/util.js Outdated
}

if (typeof src === 'object') {
for (const key of Object.keys(src)) headers[key.toLowerCase()] = src[key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont we have somehwere a lookup function to potentially avoid the lowercase?

host: origin.host,
...origDispatchOpts.headers
}
headers: { host: origin.host, ...normalizeHeaders(origDispatchOpts) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you pass origDispatchOpts if your function is named normalizeHeaders?!

Suggested change
headers: { host: origin.host, ...normalizeHeaders(origDispatchOpts) }
headers: { host: origin.host, ...normalizeHeaders(origDispatchOpts.headers) }

lib/core/util.js Outdated
* @param {Record<string, string | string[]> | Iterable<[string, string]> | string[]}
* @returns {Record<string, string[] | string>}
*/
function normalizeHeaders (opts) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function normalizeHeaders (opts) {
function normalizeHeaders (src) {

Copy link
Contributor

@Uzlopak Uzlopak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1

@metcoder95
Copy link
Member

-1 on the algorithm or the feature itself?

@FelixVaughan
Copy link
Contributor Author

@Uzlopak the normalization function was moved out of lib/util/cache.js to lib/core/util.js, with some minor refactors

function normalizeHeaders (opts) {

Many suggestions are referring to things that were present in the initial implementation which were left unchanged to avoid introducing reversions.

@metcoder95
Copy link
Member

cc @Uzlopak

@Uzlopak
Copy link
Contributor

Uzlopak commented Sep 11, 2025

I know it is maybe a little bit frustrating situation right now. I am still thinking if this is the right thing to do. Also I want to evaluate it further and see what the rest of the code does after passing further down the line. Didnt have the muse to do it till now.

It creates alot of new intermediary objects and makes alot of toLowerCase() calls. And that just for the sake of normalizing the header.

What if it is an array? why not push key and value? Dont we have a logic, which overwrites the value if a key is already set? So why do we need to transform it to an object in that moment?

What if it is an Object? Afaik an Object.keys does not return the keys alphabetically but in the order of being assigned to the object. Should we rely on this behavior?

Should we not use wellknown header logic (see /lib/core/constants.js) to avoid potential toLowerCase() calls. Maybe it was already a bad design, when cache did not use that weelknownheader logic. Doesnt mean that we should ignore this perf regression

Or should we not just actually create a Header Instance?

@FelixVaughan
Copy link
Contributor Author

All great questions, some of which I don’t have immediate answers for either, but in general I think we should be utilizing the lowercase wellKnownHeader functionality already in place wherever possible.

For the arrays and objects, we normalize to avoid the case where the passed in header is neither an object in the traditional key/value sense or an array, but an iterable like a set; where a traditional push would lead to undefined behavior. We might be able to shortcircuit this by a straight append in the case of an array as you pointed out but objects and iterators are slightly more complex.

Also, though I think it makes sense to potentially tackle it in this pr, due to the pre-existing nature of the problem, should this perf concern be handled as a separate issue?

@Uzlopak
Copy link
Contributor

Uzlopak commented Sep 12, 2025

We should not transplant one bad implementation to other places and widely use it.

Right now only one person or few people complained regarding passing arrays into the interceptor.

Who is gonna fix that bad implementation later?

"A permanent solution is a temporary solutation that works." or other variation: "Temporary solutions often become permanent problems."

Also: We have multiple implementations of normalizeHeaders. One in snapshot. Also it is dangerous, because normalizeHeader is actually just normalizeRequestHeader.

Copy link
Contributor

@Uzlopak Uzlopak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per explaination.

@FelixVaughan
Copy link
Contributor Author

I understand your concern. I'd work on it if we move it to a separate issue. My main worry is potential scope creep; case in point the duplicate normalization in snapshot.

I've made the changes to address the above, but if we decide to tackle the rest here, we should lay out what all we want to change and I can handle it.

Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants