Connection Allowlists

1. Introduction

Developers wish to have control over the resources loaded into their pages' contexts and the endpoints to which their pages can make requests. This control is necessary for several purposes, including limiting the ways in which users' data can flow through the user agent (mitigating exfiltration attacks) and ensuring control over a site’s architecture and dependencies.

Content Security Policy addresses some of this need, but does so in a way that is more granular than necessary for the most critical use cases, and with a syntax and grammar that’s complicated by the other protections CSP is used to deploy. [CSP]

`Connection-Allowlist` steps back from CSP, and focuses on the single use case of controlling the explicit requests a page may initiate through Fetch and other web platform APIs (WebRTC, Web Transport, FedCM, Web Payments, DNS Prefetch, etc.) in a way that aims to be straightforward and comprehensive.

NOTE: '\' line wrapping per RFC 8792

Connection-Allowlist: (response-origin "https://cdn.example" "https://*.example.:tld" \
                       "https://api.example:*"); report-to=ReportingAPIEndpoint

This header, delivered along with a document to which a user has navigated, would restrict that document to allow requests and connections only to those endpoints which match the URL patterns [URLPATTERN] specified in the list: the origin from which the document was delivered, https://cdn.example, any subdomain of any host whose penultimate DNS label is example, and https://api.example on any port.

Attempts to connect to endpoints that don’t match the allowlist will be blocked, and reported via a Reporting API [REPORTING] endpoint specified in the report-to parameter (and defined through a separate Reporting-Endpoints header).
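
As a rough illustration of how a server might assemble this pair of headers, the sketch below serializes an allowlist as a structured-field inner list alongside a matching Reporting-Endpoints value. The helper names (`serializeString`, `buildAllowlistHeaders`) and the reporting URL are hypothetical, and the serialization handles only the small subset of Structured Fields used in the example above.

```typescript
// Hypothetical helper, not part of this proposal: builds the two response
// headers discussed above. Only strings and the `response-origin` token are
// supported, and `endpointName` is assumed to already be a valid token.

type AllowlistEntry =
  | { kind: "response-origin" }
  | { kind: "pattern"; value: string };

// Structured Field strings are double-quoted, with `\` and `"` escaped by a backslash.
function serializeString(value: string): string {
  return `"${value.replace(/[\\"]/g, "\\$&")}"`;
}

function buildAllowlistHeaders(
  entries: AllowlistEntry[],
  endpointName: string,
  endpointUrl: string,
): Record<string, string> {
  const members = entries.map((entry) =>
    entry.kind === "response-origin" ? "response-origin" : serializeString(entry.value),
  );
  return {
    // One inner list, carrying the `report-to` parameter.
    "Connection-Allowlist": `(${members.join(" ")});report-to=${endpointName}`,
    // The endpoint name is defined separately, as described above.
    "Reporting-Endpoints": `${endpointName}=${serializeString(endpointUrl)}`,
  };
}

const headers = buildAllowlistHeaders(
  [
    { kind: "response-origin" },
    { kind: "pattern", value: "https://cdn.example" },
    { kind: "pattern", value: "https://api.example:*" },
  ],
  "ReportingAPIEndpoint",
  "https://reports.example/connection-allowlist", // assumed collector URL
);
```

The token used in report-to has to match the key in Reporting-Endpoints; a mismatch would leave the policy enforced but silent.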

1.1. Threat Model

This proposal is intentionally small, targeting a specific but useful niche of client-side attacks and/or misconfigurations:

Policies for documents and workers will be asserted by servers as HTTP response headers. This means that attackers who can manipulate response headers will remain out of scope.
A document’s (or worker’s) asserted policy governs only requests initiated by _that_ context. If a framed document asserts a distinct policy, so be it (with the caveat that this policy will apply to contexts created via local schemes (data:, about:, etc.), similar to other components of a context’s policy container, which are handled by HTML when creating new document/worker contexts).
The connection and/or request are the threat the proposal aims to defend against. To be effective as an exfiltration defense, we must block connections before they’re made.
There are a plethora of side-channels available as unintended effects of otherwise-excellent web platform APIs. This proposal does not attempt to address them, focusing instead solely on those requests or connections explicitly initiated by the user agent on a page’s behalf. This certainly includes the clear cases of fetch() and XMLHttpRequest, along with resource requests generally. It also includes network connections established through channels that are less explicitly "requests": manifests fetched through the Web Install API, connections to TURN/STUN servers via WebRTC, Web Transport channels, navigations, and so on. These are all in scope, while more esoteric channels like memory or CPU consumption, socket exhaustion, and XSLeaks in general are not.
This proposal addresses only communication channels. It does not aim to prevent (or even substantially mitigate) threats like content injection or cross-site scripting. It can only constrain the impact of such an attack _on those specific pages_ where the policy is in place, and should be considered only as one layer in a page’s defenses; it won’t be sufficient in itself.
It’s tempting to attempt to defend against a subset of server-side threats, like open redirects. It’s possible that we could do so in a proposal like this one, similar conceptually to what CSP does today. That said, CSP’s behavior around redirects is both a leak in itself, and has garnered inconsistently favorable feedback from developers and researchers alike. For simplicity’s sake, this proposal will generally treat redirect responses as match failures. That is, we’ll start with a draconian policy which will block any redirect response. It’s quite probable that we’ll shift this one way or another as use cases crystalize.

1.2. Overlap with Content Security Policy

This proposal has a lot in common with Content Security Policy’s approach to restrictions upon resource usage within a given context, fetch directives in particular. Still, it seems reasonable to explore for a few reasons:

CSP’s model is too granular: Developers who wish to mitigate the risk that data flows out of a sensitive context require a protection that exhaustively covers the possible ways in which requests can be made or connections established. CSP’s categorization of requests into types which can be controlled in isolation is the wrong way to approach this problem, as data leaking through a request for a web font is just as bad as data leaking through a request for an image or a script. Distinguishing these request types complicates the process of designing a reasonable defense with questions that are simply irrelevant.
CSP’s syntax is not granular enough: The host-source grammar CSP supports leads to truly verbose headers being delivered with responses. A distinct policy provides the opportunity to shift to the URLPattern syntax which will resolve some complaints folks have raised about CSP’s approach by providing a more modern, malleable, and standardized matching syntax.
CSP’s coverage is incomplete: While CSP does a good job covering HTTP requests which run through Fetch, it does not exhaustively cover the myriad ways in which web platform APIs allow connections to be established. DNS prefetch and WebRTC are good examples to start with, but there are many others which have struggled with exactly how they fit into CSP’s threat model. By creating a new policy with a narrow focus and explicit promise to developers, these discussions will have a defensible answer and a clear mandate to specification authors.

2. Connection Allowlists

A Connection Allowlist represents the set of URL patterns to which a given context is allowed to connect. It is a struct with the following items:

allowlist, which is a list of URL patterns. It is an empty list unless otherwise specified.
reporting endpoint, which is either null or a Reporting API endpoint. It is null unless otherwise specified.
disposition, which is either enforce or report.
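
For readers who prefer types to prose, the struct could be modelled roughly as follows. This is only a sketch: `UrlPatternLike` is a hypothetical stand-in for a URL pattern, the reporting endpoint is represented by its name, and `createConnectionAllowlist` is an illustrative helper that applies the defaults listed above.

```typescript
type Disposition = "enforce" | "report";

// Hypothetical stand-in for a URL pattern; only a boolean match is modelled.
interface UrlPatternLike {
  test(input: string): boolean;
}

interface ConnectionAllowlist {
  allowlist: UrlPatternLike[];      // empty unless otherwise specified
  reportingEndpoint: string | null; // null unless otherwise specified
  disposition: Disposition;
}

// Illustrative constructor that fills in the defaults described above.
function createConnectionAllowlist(disposition: Disposition): ConnectionAllowlist {
  return { allowlist: [], reportingEndpoint: null, disposition };
}
```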

3. Connection Allowlist Headers

The Connection-Allowlist response header contains a list of serialized URL pattern strings that define the set of endpoints to which a context is allowed to connect. This allowlist is enforced for a given context, blocking outgoing connections that don’t match the asserted patterns. The Connection-Allowlist-Report-Only response header is a report-only variant, parsed in the same way, but only sending violation reports without blocking outgoing connections.

These Connection Allowlist headers are structured headers whose values are a list of inner lists. Servers may deliver a list with an arbitrary number of items, but only the first will be used. Any additional items in the list will be ignored.

The inner list can contain either URL Patterns serialized as strings, or the token response-origin which represents a pattern matching the response’s URL’s origin. Unexpected values will be ignored.

The inner list may have arbitrary parameters. The report-to parameter’s value will be parsed as a token representing a Reporting API endpoint [REPORTING]. All other parameters will be ignored.

3.1. Parsing

To parse a response’s Connection Allowlists given a response (response):

Let allowlists be an empty list.
Let header be the result of getting a structured field value named `Connection-Allowlist` as a list from response’s header list.
Parse a Connection Allowlist header given header, response’s URL, and enforce. If the result is not null, insert it into allowlists.
Let header be the result of getting a structured field value named `Connection-Allowlist-Report-Only` as a list from response’s header list.
Parse a Connection Allowlist header given header, response’s URL, and report. If the result is not null, insert it into allowlists.
Return allowlists.

To parse a Connection Allowlist header given a structured header list (list), a URL (response-url), and a disposition (disposition):

If list’s size is 0, return null.
If list[0] is not an inner list, return null.
Let allowlist be a Connection Allowlist whose disposition is disposition.
For each item in list[0]:
1. Let serialized pattern be null.
2. If item is the token response-origin:
  1. Set serialized pattern to the ASCII serialization of response-url’s origin.
3. If item is a string, set serialized pattern to item.
4. If serialized pattern is null, continue.
5. Let URL pattern be the result of executing build a URL pattern from an HTTP structured field value given serialized pattern with null as the base URL.
  
  If this step throws an error, continue.
6. Append URL pattern to allowlist’s allowlist.
For each key → value in list[0]'s parameters:
1. If key is report-to and value is a token, set allowlist’s reporting endpoint to value.
Return allowlist.

Note: We’re skipping over any invalid input in the parsing algorithm. We could plausibly be more draconian in our parsing, but that would likely limit our future flexibility.
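
The two parsing algorithms above can be sketched as follows, operating on an already-parsed structured field list. The data shapes (`BareItem`, `InnerList`), the use of an empty list to represent a missing header, and the `buildPattern` callback (standing in for "build a URL pattern from an HTTP structured field value") are assumptions made for illustration. The response's origin is passed in pre-serialized rather than derived from the response's URL.

```typescript
// Assumed representation of a parsed Structured Field list.
type BareItem =
  | { type: "token"; value: string }
  | { type: "string"; value: string }
  | { type: "other"; value: unknown };
interface InnerList {
  items: BareItem[];
  params: Map<string, BareItem>;
}
type ListMember = InnerList | BareItem;

type Disposition = "enforce" | "report";
interface ConnectionAllowlist<P> {
  allowlist: P[];
  reportingEndpoint: string | null;
  disposition: Disposition;
}

function isInnerList(member: ListMember): member is InnerList {
  return "items" in member;
}

function parseConnectionAllowlistHeader<P>(
  list: ListMember[],
  responseOrigin: string,
  disposition: Disposition,
  buildPattern: (serialized: string) => P, // may throw on invalid input
): ConnectionAllowlist<P> | null {
  const first = list[0];
  if (first === undefined || !isInnerList(first)) return null;

  const result: ConnectionAllowlist<P> = { allowlist: [], reportingEndpoint: null, disposition };
  for (const item of first.items) {
    let serialized: string | null = null;
    if (item.type === "token" && item.value === "response-origin") serialized = responseOrigin;
    if (item.type === "string") serialized = item.value;
    if (serialized === null) continue; // unexpected values are ignored
    try {
      result.allowlist.push(buildPattern(serialized));
    } catch {
      continue; // invalid patterns are skipped, not fatal
    }
  }

  const reportTo = first.params.get("report-to");
  if (reportTo !== undefined && reportTo.type === "token") {
    result.reportingEndpoint = reportTo.value;
  }
  return result;
}

// Enforced header first, then the report-only variant; null results are dropped.
function parseResponseAllowlists<P>(
  enforced: ListMember[],
  reportOnly: ListMember[],
  responseOrigin: string,
  buildPattern: (serialized: string) => P,
): ConnectionAllowlist<P>[] {
  const candidates = [
    parseConnectionAllowlistHeader(enforced, responseOrigin, "enforce", buildPattern),
    parseConnectionAllowlistHeader(reportOnly, responseOrigin, "report", buildPattern),
  ];
  return candidates.filter((c): c is ConnectionAllowlist<P> => c !== null);
}
```

Note that only the first list member is consulted, and that a header whose first member is not an inner list yields no allowlist at all rather than an empty one.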

3.2. Matching

Depending on the type of connection being established, we may have a request to work with, we may only have a URL, or we may have even less. dns-prefetch, for example, can only match against a host. The algorithms below spell out how connection allowlist checks work in these scenarios:

To match a URL to a Connection Allowlist given a URL (url) and a connection allowlist (connection allowlist), execute the following steps, which return success or failure.

For each pattern in connection allowlist’s allowlist:
1. If URL pattern matching given pattern and url does not return null, return success.
Return failure.

To match a host to a Connection Allowlist given a host (host) and a connection allowlist (connection allowlist), execute the following steps, which return success or failure.

For each pattern in connection allowlist’s allowlist:
1. Let input be a new URLPatternInit dictionary whose hostname is set to pattern’s hostname component.
2. Let host-only pattern be the result of creating a URL pattern given input, null as the base URL, and an empty map as the options.
3. Let synthetic url be the result of parsing the concatenation of "https://" and host as a URL.
4. If URL pattern matching given host-only pattern and synthetic url does not return null, return success.
Return failure.

Note: By creating a new pattern with only the hostname component and synthesizing a URL for host, we’re able to return a match if _any_ pattern in the allowlist could allow a request to that host using any protocol, on any port, with any path, and so on.
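
To make the host-only check concrete, here is a deliberately simplified sketch. It does not use the real URL pattern machinery: `hostnamePatternToRegExp` is a hypothetical approximation that treats `*` as "any run of characters" and `:name` as "exactly one DNS label", and the allowlist is assumed to have already been reduced to the hostname components of its patterns. Hosts are assumed to be lowercased before the check.

```typescript
// Simplified approximation of hostname-component matching (an assumption,
// not the URL Pattern algorithm): `*` matches anything, `:name` matches one label.
function hostnamePatternToRegExp(pattern: string): RegExp {
  let source = "";
  let i = 0;
  while (i < pattern.length) {
    const ch = pattern.charAt(i);
    if (ch === "*") {
      source += ".*";
      i += 1;
    } else if (ch === ":") {
      // Skip over the group name; its value is not needed for a yes/no match.
      let j = i + 1;
      while (j < pattern.length && /[A-Za-z0-9_]/.test(pattern.charAt(j))) j += 1;
      source += "[^.]+";
      i = j;
    } else {
      // Escape regular-expression metacharacters in literal text.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// Mirrors the loop above: succeed if any pattern's hostname could match `host`,
// regardless of protocol, port, or path.
function matchHostToAllowlist(host: string, hostnamePatterns: string[]): "success" | "failure" {
  for (const pattern of hostnamePatterns) {
    if (hostnamePatternToRegExp(pattern).test(host)) return "success";
  }
  return "failure";
}
```

Because every non-hostname component is discarded, this check is intentionally permissive: a prefetch for a host can succeed even when a later request to a specific port or path on that host would be blocked.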

The should url be blocked by Connection Allowlist algorithm takes a URL (url), an environment (environment), and a list of connection allowlists (connection allowlists). It returns either allowed or blocked:

For each connection allowlist in connection allowlists:
1. If url matches connection allowlist’s allowlist, continue.
2. Report a violation given url, environment, and connection allowlist.
3. If connection allowlist’s disposition is enforce, return blocked.
Return allowed.

The should request be blocked by Connection Allowlists algorithm takes a request (request), and returns either allowed or blocked:

If request’s URL list’s size is greater than 1, return blocked.

See the open question below in § 5.3 Redirects.
Return the result of executing should url be blocked by Connection Allowlist given request’s url, request’s client, and request’s policy container’s connection allowlists.
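
A minimal sketch of the URL and request checks follows, assuming a `PatternLike` interface in place of real URL patterns and a caller-supplied `report` callback in place of the reporting algorithm. `RequestLike` is likewise hypothetical and models only a non-empty URL list, whose first entry is taken as the request's URL.

```typescript
type Disposition = "enforce" | "report";
type Verdict = "allowed" | "blocked";

// Hypothetical stand-ins for URL patterns and requests.
interface PatternLike {
  test(url: string): boolean;
}
interface ConnectionAllowlist {
  allowlist: PatternLike[];
  disposition: Disposition;
}
interface RequestLike {
  urlList: [string, ...string[]]; // never empty
}
type Reporter = (url: string, allowlist: ConnectionAllowlist) => void;

function matchUrlToAllowlist(url: string, allowlist: ConnectionAllowlist): boolean {
  return allowlist.allowlist.some((pattern) => pattern.test(url));
}

function shouldUrlBeBlocked(url: string, allowlists: ConnectionAllowlist[], report: Reporter): Verdict {
  for (const allowlist of allowlists) {
    if (matchUrlToAllowlist(url, allowlist)) continue;
    report(url, allowlist);
    // Report-only allowlists record the violation but never block.
    if (allowlist.disposition === "enforce") return "blocked";
  }
  return "allowed";
}

function shouldRequestBeBlocked(
  request: RequestLike,
  allowlists: ConnectionAllowlist[],
  report: Reporter,
): Verdict {
  // More than one URL means the request has been redirected: blocked outright.
  if (request.urlList.length > 1) return "blocked";
  return shouldUrlBeBlocked(request.urlList[0], allowlists, report);
}
```

One consequence of the loop as written is that an enforced allowlist returns as soon as it fails to match, so allowlists later in the list are neither checked nor reported for that connection.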

The should host be blocked by Connection Allowlists algorithm takes a host (host), an environment (environment), and a list of connection allowlists (connection allowlists). It returns either allowed or blocked:

For each connection allowlist in connection allowlists:
1. If host host-matches connection allowlist, continue.
2. Report a violation given host, environment, and connection allowlist.
3. If connection allowlist’s disposition is enforce, return blocked.
Return allowed.

3.3. Reporting

Like other policy mechanisms, Connection Allowlists will report each violation to a Reporting API endpoint specified in the allowlist headers. Violations are represented by the following dictionary type:

enum ConnectionAllowlistDisposition { "enforce", "report" };

dictionary ConnectionAllowlistViolationReport : ReportBody {
  USVString url;
  USVString connection;
  sequence<DOMString> allowlist;
  ConnectionAllowlistDisposition disposition;
};

ConnectionAllowlistViolationReport’s connection is the serialized URL of the connection which violated the allowlist.

ConnectionAllowlistViolationReport’s allowlist is the allowlist which was violated.

ConnectionAllowlistViolationReport’s disposition is the allowlist’s disposition.

To report a violation given a URL (resource URL), an environment (environment), and a connection allowlist (allowlist):

If allowlist’s reporting endpoint is null, return.
Let violation be a new ConnectionAllowlistViolationReport, initialized as follows:

connection

resource URL, stripped for use in reports.

Note: Because we block redirects, we don’t need to worry about url vs current url. When we go back on that decision (see § 5.3 Redirects), we’ll want to ensure we use url to avoid leaking more information than necessary about redirect targets.

allowlist

allowlist’s allowlist

disposition

allowlist’s disposition.

Generate and queue a report given environment as the context, "connection-allowlist" as the type, allowlist’s reporting endpoint as the destination, and violation as the data.
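
The reporting steps might look roughly like this in code. Two assumptions are worth flagging: "stripped for use in reports" is taken to mean the Content Security Policy algorithm of the same name (non-HTTP(S) URLs collapse to their scheme; fragment, username, and password are removed), and the allowlist is modelled as already-serialized pattern strings, since the IDL above types it as a sequence of strings. `QueuedReport` and `buildViolationReport` are hypothetical names.

```typescript
type Disposition = "enforce" | "report";

interface ConnectionAllowlist {
  allowlist: string[]; // assumed: serialized patterns
  reportingEndpoint: string | null;
  disposition: Disposition;
}

interface ViolationReportBody {
  connection: string;
  allowlist: string[];
  disposition: Disposition;
}

interface QueuedReport {
  type: "connection-allowlist";
  destination: string;
  body: ViolationReportBody;
}

// Assumed to follow CSP's "strip URL for use in reports".
function stripUrlForReport(input: string): string {
  const url = new URL(input);
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return url.protocol.slice(0, -1); // scheme only, without the trailing ":"
  }
  url.hash = "";
  url.username = "";
  url.password = "";
  return url.href;
}

// Returns the report that would be queued, or null when no endpoint is configured.
function buildViolationReport(resourceUrl: string, allowlist: ConnectionAllowlist): QueuedReport | null {
  if (allowlist.reportingEndpoint === null) return null;
  return {
    type: "connection-allowlist",
    destination: allowlist.reportingEndpoint,
    body: {
      connection: stripUrlForReport(resourceUrl),
      allowlist: [...allowlist.allowlist],
      disposition: allowlist.disposition,
    },
  };
}
```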

4. Monkey-Patches

4.1. Integration with Fetch

We’ll handle requests by adding a blocking check in Fetch § 4.1 Main fetch alongside other checks that serve the same purpose:

In Main Fetch, we’ll adjust step 7 as follows:

If should request be blocked due to a bad port, should fetching request be blocked as mixed content, should request be blocked by Content Security Policy, should request be blocked by Connection Allowlists, or should request be blocked by Integrity Policy Policy returns blocked, then set response to a network error.

Fetch also defines algorithms at a lower level which are used to establish connections for APIs which aren’t based on requests. We’ll hook into resolve an origin and obtain a connection to handle things like DNS prefetch, Web Transport, etc:

In resolve an origin, we’ll call out to the host-only matching algorithm above to determine whether any pattern could potentially allow a connection to a given host. If not, we’ll fail resolution.

If should host be blocked by Connection Allowlists returns blocked when executed upon origin’s host, environment, and allowlists, then return failure.

In obtain a connection, we’ll add a check before the current step 2:

If should url be blocked by Connection Allowlist returns blocked when executed upon url, environment, and allowlists, then return failure.

The changes to Fetch will require us to pass additional information into low-level algorithms' callsites to identify the allowlist which ought to be used and the context to be used for reporting. It might be better to instead ask those callsites to perform the checks themselves. My feeling is that we’ll be more successful by centralizing the logic, but it might be simpler to take a piecemeal approach.

4.2. Integration with HTML

To integrate the above into HTML, we’ll add a new connection allowlists item to the policy container struct, containing a list of connection allowlists. This will be populated by adding a step to the create a policy container from a fetch response algorithm:

Parse Integrity-Policy headers with response and result.
Set result’s connection allowlists to the result of parsing a response’s Connection Allowlists given response.
Return result.

4.3. Integration with WebRTC

I need to read more, as I have no idea how any of this works from a spec perspective. :)

5. Security Considerations

5.1. Same-Origin Contexts

The threat model described in § 1.1 Threat Model is intentionally narrow, and developers will need to carefully consider how to layer the allowlisting mechanism described here into their defenses. Most saliently, the mechanism is context-specific, not origin-wide. This leaves broad opportunity for an attacker with scripting access to bypass a context’s allowlist by finding a same-origin context with lower restrictions. Integration with HTML’s policy container addresses some of those possibilities, but it’s likely that others will exist: allowlisting the document’s origin (via response-origin or explicitly), reaching up through the frame tree, etc.

There are scenarios in which developers can avoid this risk by sandboxing the allowlisted context away from its normal origin via sandbox attributes or Content Security Policy’s sandbox directive. In those cases, no document will be same-origin, and the boundaries will be easier to hold.

It would also be ideal to give developers control over their dependencies' allowlists to some extent. An opt-in mechanism rooted in something like required document policy or [CSP-EMBEDDED-ENFORCEMENT] might be helpful to explore. [WICG/connection-allowlists Issue #1]

5.2. `postMessage(...)`

This proposal concerns itself entirely with network connections, which may surprise developers who would expect communication via explicit communication channels like postMessage(message, options), MessageChannel, BroadcastChannel, and so on to be covered. It could make sense to extend the model to include those as well, as they all fit into an origin-based model which could be meaningfully compared against the allowlist.

5.3. Redirects

Currently, we specify that any redirected URL fails. This simplifies the initial proposal for discussion and ensures we don’t leak data, but seems unlikely to satisfy developers with real-world deployment needs. I think we have a few realistic options:

1. Apply the allowlist to every hop of a redirect chain. This has the advantage of matching CSP’s behavior that developers are already familiar with. It _is_ a cross-origin data leak insofar as it provides insight about another origin’s decisions, which is unfortunate but perhaps unavoidable (and non-unique).
2. Allow _a specific rule_’s redirect chain to arbitrarily redirect. This narrows the concerns above by forcing developers to annotate the allowlist with their expectations. It might be perfectly acceptable for https://trusted.example/ to redirect users to arbitrary locations, while other endpoints are expected to remain put. Annotating list items should make this kind of distinction possible if necessary (e.g. ("https://trusted.example/";redirection-allowed "https://less-so.example/")).
3. Narrow the above by allowing _a specific rule_ to redirect so long as the targets match the allowlist. This creates less opportunity for unexpected connections than options 1 or 2 by requiring developers to annotate the specific rules which can redirect, but would do so in a way that’s less broad (e.g. ("https://semi-trusted.example/";redirection-allowed=within-allowlist ...)).

We could add more options as well. CSP’s earlier navigate-to proposal distinguished between intermediate redirects and the final, non-redirect response. You could imagine adding those kinds of options either to the entire allowlist or individual rules. Feedback here as well would be much appreciated.
