Connection Allowlists

Unofficial Proposal Draft

This version:
https://wicg.github.io/connection-allowlists/
Issue Tracking:
GitHub
Inline In Spec
Editor:
Mike West (Google)

Abstract

The Connection-Allowlist mechanism provides a concise policy language and delivery mechanism for a set of constraints on a context’s ability to communicate with other servers. The goal is to provide developers with the ability to holistically mitigate explicit exfiltration channels in a way that’s narrowly tailored to suit the problem.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

Developers wish to have control over the resources loaded into their pages' contexts and the endpoints to which their pages can make requests. This control is necessary for several purposes, including limiting the ways in which users' data can flow through the user agent (mitigating exfiltration attacks) and ensuring control over a site’s architecture and dependencies.

Content Security Policy addresses some of this need, but does so in a way that is more granular than necessary for the most critical use cases, and with a syntax and grammar that’s complicated by the other protections CSP is used to deploy. [CSP]

`Connection-Allowlist` steps back from CSP and focuses on the single use case of controlling the explicit requests a page may initiate through Fetch and other web platform APIs (WebRTC, Web Transport, FedCM, Web Payments, DNS Prefetch, etc.) in a way that aims to be straightforward and comprehensive.

NOTE: '\' line wrapping per RFC 8792

Connection-Allowlist: (response-origin "https://cdn.example" "https://*.example.:tld" \
                       "https://api.example:*"); report-to=ReportingAPIEndpoint

This header, delivered along with a document to which a user has navigated, would restrict that document to allow requests and connections only to those endpoints which match the URL patterns [URLPATTERN] specified in the list: the origin from which the document was delivered, https://cdn.example, any subdomain of any host whose penultimate DNS label is example, https://api.example on any port, and so on.

Attempts to connect to endpoints that don’t match the allowlist will be blocked, and reported via a Reporting API [REPORTING] endpoint specified in the report-to parameter (and defined through a separate Reporting-Endpoints header).

1.1. Threat Model

This proposal is intentionally small, targeting a specific but useful niche of client-side attacks and/or misconfigurations:

1.2. Overlap with Content Security Policy

This proposal has a lot in common with Content Security Policy’s approach to restrictions upon resource usage within a given context, fetch directives in particular. Still, it seems reasonable to explore for a few reasons:

  1. CSP’s model is too granular: Developers who wish to mitigate the risk that data flows out of a sensitive context require a protection that exhaustively covers the possible ways in which requests can be made or connections established. CSP’s categorization of requests into types which can be controlled in isolation is the wrong way to approach this problem, as data leaking through a request for a web font is just as bad as data leaking through a request for an image or a script. Distinguishing these request types complicates the process of designing a reasonable defense with questions that are simply irrelevant.

  2. CSP’s syntax is not granular enough: The host-source grammar CSP supports leads to truly verbose headers being delivered with responses. A distinct policy provides the opportunity to shift to the URLPattern syntax which will resolve some complaints folks have raised about CSP’s approach by providing a more modern, malleable, and standardized matching syntax.

  3. CSP’s coverage is incomplete: While CSP does a good job covering HTTP requests which run through Fetch, it does not exhaustively cover the myriad ways in which web platform APIs allow connections to be established. DNS prefetch and WebRTC are good examples to start with, but there are many others which have struggled with exactly how they fit into CSP’s threat model. By creating a new policy with a narrow focus and explicit promise to developers, these discussions will have a defensible answer and a clear mandate to specification authors.

2. Connection Allowlists

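A Connection Allowlist represents the set of URL patterns to which a given context is allowed to connect. It is a struct with the following items, as used by the algorithms in § 3.1 Parsing and § 3.2 Matching:

allowlist
  A list of URL patterns.

disposition
  Either enforce or report.

reporting endpoint
  A token naming a Reporting API endpoint, or null if none was specified.

redirects
  Either allow or block. Defaults to block.

webrtc
  Either allow or block. Defaults to block.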

3. Connection Allowlist Headers

The Connection-Allowlist response header contains a list of serialized URL pattern strings that define the set of endpoints to which a context is allowed to connect. This allowlist is enforced for a given context, blocking outgoing connections that don’t match the asserted patterns. The Connection-Allowlist-Report-Only response header is a report-only variant, parsed in the same way, but only sending violation reports without blocking outgoing connections.

These Connection Allowlist headers are structured headers whose values are a list of inner lists. Servers may deliver a list with an arbitrary number of items, but only the first will be used. Any additional items in the list will be ignored.

The inner list can contain either URL Patterns serialized as strings, or the token response-origin which represents a pattern matching the response’s URL’s origin. Unexpected values will be ignored.

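The inner list may have arbitrary parameters. The following are recognized by the parsing algorithm in § 3.1 Parsing:

report-to
  A token naming the Reporting API endpoint to which violation reports will be sent.

redirects
  A token. block (the default) blocks redirected requests; any other token is treated as allow.

webrtc
  A token. block (the default) blocks WebRTC connections; any other token is treated as allow.

All other parameters will be ignored.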

3.1. Parsing

To parse a response’s Connection Allowlists given a response (response):
  1. Let allowlists be an empty list.

  2. Let header be the result of getting a structured field value named `Connection-Allowlist` as a list from response’s header list.

  3. Parse a Connection Allowlist header given header, response’s URL, and enforce. If the result is not null, insert it into allowlists.

  4. Let header be the result of getting a structured field value named `Connection-Allowlist-Report-Only` as a list from response’s header list.

  5. Parse a Connection Allowlist header given header, response’s URL, and report. If the result is not null, insert it into allowlists.

  6. Return allowlists.

To parse a Connection Allowlist header given a structured header list (list), a URL (response-url), and a disposition (disposition):
  1. If list’s size is 0, return null.

  2. If list[0] is not an inner list, return null.

  3. Let allowlist be a Connection Allowlist whose disposition is disposition.

  4. For each item in list[0]:

    1. Let serialized pattern be null.

    2. If item is the token response-origin:

      1. Set serialized pattern to the ASCII serialization of response-url’s origin.

    3. If item is a string, set serialized pattern to item.

    4. If serialized pattern is null, continue.

    5. Let URL pattern be the result of executing build a URL pattern from an HTTP structured field value given serialized pattern with null as the base URL.

      If this step throws an error, continue.

    6. Append URL pattern to allowlist’s allowlist.

  5. For each key → value in list[0]'s parameters:

    1. If key is report-to and value is a token, set allowlist’s reporting endpoint to value.

    2. If key is redirects and value is a token:

      1. If value is "block", set allowlist’s redirects to block.

      2. Else, set allowlist’s redirects to allow.

    3. If key is webrtc and value is a token:

      1. If value is "block", set allowlist’s webrtc to block.

      2. Else, set allowlist’s webrtc to allow.

  6. Return allowlist.

Note: We’re skipping over any invalid input in the parsing algorithm. We could plausibly be more draconian in our parsing, but that would likely limit our future flexibility.
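
The parsing steps above can be sketched in Python as follows. This is a minimal illustration rather than the specification's algorithm verbatim: it assumes the structured field has already been parsed into hypothetical `InnerList` and `Token` objects, and it stores pattern strings as-is instead of building URL patterns, so the step that skips patterns which fail to build is not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


class Token(str):
    """Hypothetical stand-in for a structured-field token (distinct from a string)."""


@dataclass
class InnerList:
    """Hypothetical stand-in for a structured-field inner list with parameters."""
    items: list
    params: dict = field(default_factory=dict)


@dataclass
class ConnectionAllowlist:
    disposition: str                          # "enforce" or "report"
    allowlist: list = field(default_factory=list)
    reporting_endpoint: Optional[str] = None
    redirects: str = "block"                  # default described in section 5.5
    webrtc: str = "block"                     # default described in section 5.6


def parse_allowlist_header(sf_list, response_origin, disposition):
    # Only the first list member is used, and it must be an inner list.
    if not sf_list or not isinstance(sf_list[0], InnerList):
        return None
    result = ConnectionAllowlist(disposition=disposition)
    for item in sf_list[0].items:
        if isinstance(item, Token):
            if item != "response-origin":
                continue                      # unknown tokens are skipped
            serialized = response_origin
        elif isinstance(item, str):
            serialized = item
        else:
            continue                          # integers, booleans, etc. are skipped
        # Assumption: the pattern string is kept as-is; no URL pattern is built.
        result.allowlist.append(serialized)
    for key, value in sf_list[0].params.items():
        if not isinstance(value, Token):
            continue
        if key == "report-to":
            result.reporting_endpoint = str(value)
        elif key == "redirects":
            result.redirects = "block" if value == "block" else "allow"
        elif key == "webrtc":
            result.webrtc = "block" if value == "block" else "allow"
    return result


header = [InnerList(
    [Token("response-origin"), "https://cdn.example", Token("unknown"), 42],
    {"report-to": Token("endpoint"), "redirects": Token("some-future-policy")},
)]
parsed = parse_allowlist_header(header, "https://site.example", "enforce")
```

In this sketch the unknown token and the integer are dropped, the unrecognized redirects value falls back to allow, and webrtc keeps its block default because the parameter is absent.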

3.2. Matching

Depending on the type of connection being established, we may have a request to work with, we may only have a URL, or we may have even less. dns-prefetch, for example, can only match against a host. The algorithms below spell out how connection allowlist checks work in these scenarios:

To match a URL to a Connection Allowlist given a URL (url) and a connection allowlist (connection allowlist), execute the following steps, which return success or failure.
  1. For each pattern in connection allowlist’s allowlist:

    1. If URL pattern matching given pattern and url does not return null, return success.

  2. Return failure.

To match a host to a Connection Allowlist given a host (host) and a connection allowlist (connection allowlist), execute the following steps, which return success or failure.
  1. For each pattern in connection allowlist’s allowlist:

    1. Let input be a new URLPatternInit dictionary whose hostname is set to pattern’s hostname component.

    2. Let host-only pattern be the result of creating a URL pattern given input, null as the base URL, and an empty map as the options.

    3. Let synthetic url be the result of parsing the concatenation of "https://" and host as a URL.

    4. If URL pattern matching given host-only pattern and synthetic url does not return null, return success.

  2. Return failure.

Note: By creating a new pattern with only the hostname component and synthesizing a URL for host, we’re able to return a match if _any_ pattern in the allowlist could allow a request to that host using any protocol, on any port, with any path, and so on.
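
A rough Python sketch of the host-only check follows. It does not implement URL Pattern matching; as a simplified stand-in it assumes that patterns are written as scheme://hostname[:port][/path] strings and that `*` is the only wildcard used in the hostname, so named groups such as :tld are not supported. All function names are hypothetical.

```python
import re


def hostname_component(pattern):
    """Extract the hostname part of a 'scheme://hostname[:port][/path]' pattern string."""
    rest = pattern.split("://", 1)[-1]
    rest = rest.split("/", 1)[0]
    # Drop a trailing ':<digits>' or ':*' port so only the hostname remains.
    return re.sub(r":(\d+|\*)$", "", rest)


def host_matches(pattern_hostname, host):
    """Match a host against a hostname pattern in which '*' matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern_hostname.split("*"))
    return re.fullmatch(regex, host, flags=re.IGNORECASE) is not None


def match_host_to_allowlist(host, patterns):
    """Return True if any pattern's hostname could allow a connection to host."""
    return any(host_matches(hostname_component(p), host) for p in patterns)


patterns = ["https://cdn.example", "https://*.img.example", "https://api.example:*"]
```

Because only the hostname of each pattern is consulted, a host is accepted here regardless of the scheme, port, or path that the full pattern would otherwise require, which mirrors the intent described in the note above.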

The should url be blocked by Connection Allowlists algorithm takes a URL (url), an environment (environment), and a list of connection allowlists (connection allowlists). It returns either allowed or blocked:
  1. For each connection allowlist in connection allowlists:

    1. If url matches connection allowlist, continue.

    2. Report a violation given url, environment, and connection allowlist.

    3. If connection allowlist’s disposition is enforce, return blocked.

  2. Return allowed.

The should request be blocked by Connection Allowlists algorithm takes a request (request), and returns either allowed or blocked:
  1. Let allowlists be request’s policy container’s connection allowlists.

  2. For each allowlist in allowlists:

    1. If request’s URL list’s size is greater than 1:

      1. If allowlist’s redirects is allow, continue.

      2. Report a violation given request’s url, request’s client, and allowlist.

        Note: When redirects are blocked, we’re intentionally reporting request’s url and not its current url to avoid leaking more information than necessary about redirect targets.

      3. If allowlist’s disposition is enforce, return blocked.

      4. Continue.

    2. If request’s url matches allowlist, continue.

    3. Report a violation given request’s url, request’s client, and allowlist.

    4. If allowlist’s disposition is enforce, return blocked.

  3. Return allowed.
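
The request check above, including its redirect handling, might be sketched in Python like this. The `Allowlist` class, the `origin_match` helper, and the `report` callback are hypothetical; in particular, `origin_match` compares origins by simple string equality and is only a stand-in for URL Pattern matching.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Allowlist:
    patterns: list
    disposition: str = "enforce"   # or "report"
    redirects: str = "block"       # or "allow"


def origin_match(url, allowlist):
    """Naive stand-in for pattern matching: exact scheme://host[:port] comparison."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}" in allowlist.patterns


def should_request_be_blocked(url_list, allowlists, matches, report):
    """url_list models the request's URL list; url_list[0] is the request's url."""
    for allowlist in allowlists:
        if len(url_list) > 1:                    # the request has been redirected
            if allowlist.redirects == "allow":
                continue
            report(url_list[0], allowlist)       # report the original url, not the target
            if allowlist.disposition == "enforce":
                return "blocked"
            continue
        if matches(url_list[0], allowlist):
            continue
        report(url_list[0], allowlist)
        if allowlist.disposition == "enforce":
            return "blocked"
    return "allowed"


reports = []


def record(url, allowlist):
    reports.append((url, allowlist.disposition))


strict = Allowlist(["https://api.example"])
lenient = Allowlist(["https://api.example"], redirects="allow")
report_only = Allowlist(["https://api.example"], disposition="report")

first = should_request_be_blocked(["https://api.example/data"], [strict], origin_match, record)
redirected = should_request_be_blocked(
    ["https://api.example/data", "https://api.example/new-data"], [strict], origin_match, record)
followed = should_request_be_blocked(
    ["https://api.example/data", "https://attacker.com/"], [lenient], origin_match, record)
observed = should_request_be_blocked(["https://other.example/"], [report_only], origin_match, record)
```

Note that a report-only allowlist never returns blocked in this sketch; it records the violation and lets the loop continue to any remaining allowlists.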

The should host be blocked by Connection Allowlists algorithm takes a host (host), an environment (environment), and a list of connection allowlists (connection allowlists). It returns either allowed or blocked:
  1. For each connection allowlist in connection allowlists:

    1. If host host-matches connection allowlist, continue.

    2. Report a violation given host, environment, and connection allowlist.

    3. If connection allowlist’s disposition is enforce, return blocked.

  2. Return allowed.

The should WebRTC be blocked by Connection Allowlists algorithm takes an environment settings object (environment) and returns either allowed or blocked:
  1. Let allowlists be environment’s policy container’s connection allowlists.

  2. For each allowlist in allowlists:

    1. If allowlist’s webrtc is allow, continue.

    2. Report a violation given "webrtc", environment, and allowlist.

    3. If allowlist’s disposition is enforce, return blocked.

  3. Return allowed.

3.3. Reporting

Like other policy mechanisms, Connection Allowlists will report each violation to a Reporting API endpoint specified in the allowlist headers. Violations are represented by the following dictionary type:

enum ConnectionAllowlistDisposition { "enforce", "report" };

dictionary ConnectionAllowlistViolationReport : ReportBody {
  USVString url;
  USVString connection;
  sequence<DOMString> allowlist;
  ConnectionAllowlistDisposition disposition;
};

ConnectionAllowlistViolationReport’s connection is the serialized URL of the connection which violated the allowlist.

ConnectionAllowlistViolationReport’s allowlist is the allowlist which was violated.

ConnectionAllowlistViolationReport’s disposition is the allowlist’s disposition.

To report a violation given a URL or the string "webrtc" (resource URL), an environment (environment), and a connection allowlist (allowlist):
  1. If allowlist’s reporting endpoint is null, return.

  2. Let violation be a new ConnectionAllowlistViolationReport, initialized as follows:

    url
      environment’s creation URL, stripped for use in reports.

    connection
      If resource URL is a URL, then resource URL, stripped for use in reports. Otherwise, resource URL.

    allowlist
      A new list containing the result of serializing each pattern in allowlist’s allowlist.

    disposition
      allowlist’s disposition.

  3. Generate and queue a report given environment as the context, "connection-allowlist" as the type, allowlist’s reporting endpoint as the destination, and violation as the data.
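
As an illustration, the report body described above could be assembled along these lines. This is a sketch under stated assumptions: `strip_url_for_reports` approximates the "strip URL for use in reports" behavior as it is understood from Content Security Policy (scheme only for non-HTTP(S) URLs, otherwise the URL without credentials or fragment), patterns are represented as already-serialized strings, and queuing the report with the Reporting API is left out. The function and parameter names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_url_for_reports(url):
    """Approximation: drop credentials and fragment; non-HTTP(S) URLs reduce to their scheme."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return parts.scheme
    netloc = parts.netloc.rsplit("@", 1)[-1]     # remove any 'user:password@' prefix
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))


def build_violation_report(resource, creation_url, patterns, disposition, reporting_endpoint):
    """Return the report body as a dict, or None when no reporting endpoint is set."""
    if reporting_endpoint is None:
        return None
    # The literal string "webrtc" is passed through; URLs are stripped first.
    connection = resource if resource == "webrtc" else strip_url_for_reports(resource)
    return {
        "url": strip_url_for_reports(creation_url),
        "connection": connection,
        "allowlist": list(patterns),
        "disposition": disposition,
    }


body = build_violation_report(
    "https://user:pw@evil.example/x?q=1#f",
    "https://site.example/page#top",
    ["https://api.example"],
    "enforce",
    "endpoint",
)
```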

4. Monkey-Patches

4.1. Integration with Fetch

We’ll handle requests by adding a blocking check in Fetch § 4.1 Main fetch alongside other checks that serve the same purpose:

In Main Fetch, we’ll adjust step 7 as follows:
  1. If should request be blocked due to a bad port, should fetching request be blocked as mixed content, should request be blocked by Content Security Policy, should request be blocked by Connection Allowlists, or should request be blocked by Integrity Policy returns blocked, then set response to a network error.

Fetch also defines algorithms at a lower level which are used to establish connections for APIs which aren’t based on requests. We’ll hook into resolve an origin and obtain a connection to handle things like DNS prefetch, Web Transport, etc:

In resolve an origin, we’ll call out to the host-only matching algorithm above to determine whether any pattern could potentially allow a connection to a given host. If not, we’ll fail resolution.
  1. If should host be blocked by Connection Allowlists returns blocked when executed upon origin’s host, environment, and allowlists, then return failure.

In obtain a connection, we’ll add a check before the current step 2:
  1. If should url be blocked by Connection Allowlists returns blocked when executed upon url, environment, and allowlists, then return failure.

The changes to Fetch will require us to pass additional information into low-level algorithms' callsites to identify the allowlist which ought to be used and the context to be used for reporting. It might be better to instead ask those callsites to perform the checks themselves. My feeling is that we’ll be more successful by centralizing the logic, but it might be simpler to take a piecemeal approach.

4.2. Integration with HTML

To integrate the above into HTML, we’ll add a new connection allowlists item to the policy container struct, containing a list of connection allowlists. This will be populated by adding a step to the create a policy container from a fetch response algorithm:

  1. Parse Integrity-Policy headers with response and result.

  2. Set result’s connection allowlists to the result of parsing a response’s Connection Allowlists given response.

  3. Return result.

4.3. Integration with WebRTC

To constrain WebRTC connections, [WEBRTC] can call into the should WebRTC be blocked by Connection Allowlists algorithm while determining whether candidates are administratively prohibited.

5. Security and Privacy Considerations

5.1. Same-Origin Contexts

The threat model described in § 1.1 Threat Model is intentionally narrow, and developers will need to carefully consider how to layer the allowlisting mechanism described here into their defenses. Most saliently, the mechanism is context-specific, not origin-wide. This leaves broad opportunity for an attacker with scripting access to bypass a context’s allowlist by finding a same-origin context with lower restrictions. Integration with HTML’s policy container addresses some of those possibilities, but it’s likely that others will exist. An attacker might, for example, be able to reach up through the frame tree to a less-restricted parent, or pop open a new window via window.open() which retains an opener relationship. Allowlisting the document’s origin (via response-origin or explicitly) is therefore not a complete solution in and of itself.

There are scenarios in which developers can avoid this risk by sandboxing the allowlisted context away from its normal origin via sandbox attributes or Content Security Policy’s sandbox directive. In those cases, no document will be same-origin, and the boundaries will be easier to hold.

It would also be ideal to give developers control over their dependencies' allowlists to some extent. An opt-in mechanism rooted in something like required document policy or [CSP-EMBEDDED-ENFORCEMENT] might be helpful to explore. [WICG/connection-allowlists Issue #1]

5.2. Service Workers

Service Workers complicate the story around allowlists, just as other same-origin contexts do. Because they have a policy container distinct from each of the documents they manage, it’s quite possible for them to respond to messages or requests initiated in documents whose allowlist differs from the service worker’s allowlist. This proposal follows other policies' design, allowing for these differences in capability.

If developers wish to constrain service workers' ability to make requests, they can deliver an allowlist along with the worker script, but they’ll need to ensure that this allowlist is a superset of the allowlists of any document it might service.

5.3. DNS

Connection Allowlists rely on URL matching to determine whether a given endpoint is acceptable. This approach, like any other name-based mechanism, can be subverted if the name maps to an unexpected server. The allowlist depends on an accurate mapping of hosts to servers in order for the constraints it imposes to be meaningful.

Developers should mitigate the risk of DNS hijacking and/or rebinding by relying upon authenticated connections: allowlisting only secure protocols will make it much more difficult for an attacker in control of DNS to shift traffic to an arbitrary endpoint, as the TLS handshake will require possession of a certificate with the relevant name.

Should we restrict the allowlist’s patterns to those representing secure protocols? Or punt the header entirely for non-secure origins?

5.4. postMessage(...)

This proposal concerns itself entirely with network connections, which may surprise developers who expect explicit communication channels like postMessage(message, options), MessageChannel, BroadcastChannel, and so on to be covered. It could make sense to extend the model to include those as well, as they all fit into an origin-based model which could be meaningfully compared against the allowlist.

5.5. Redirects

By default, Connection Allowlists block all redirects. This is a conservative posture which aims to prevent data exfiltration via open redirects or other server-side redirection mechanisms. If an allowlist is enforced on a document, any request that results in a redirect will be blocked unless the allowlist explicitly opts into allowing them.

The redirects parameter allows developers to control this behavior.

If set to block (the default), any request with a redirect chain length greater than 1 will be blocked.

If set to allow, the allowlist will be enforced only on the initial request. If the initial request matches the allowlist, any subsequent redirect will be allowed regardless of its location. This mode shifts the responsibility for data security to the server: once a request has been allowed to leave the client, the server is responsible for ensuring that it does not redirect the user’s data to an untrusted location.

This approach acknowledges that different applications have different security requirements. Highly sensitive applications can choose to block all redirects, while others can rely on their trusted endpoints to handle redirects correctly.

Consider a document with the following header:

Connection-Allowlist: ("https://api.example")

A request to https://api.example/data that returns a 302 Found redirect to https://api.example/new-data will be blocked, as redirects are blocked by default.

If the header is instead:

Connection-Allowlist: ("https://api.example");redirects=allow

The same request to https://api.example/data will be allowed, and the subsequent redirect to https://api.example/new-data (or even https://attacker.com/) will also be allowed.

Finally, for forward-compatibility, an unknown token will be treated as allow:

Connection-Allowlist: ("https://api.example");redirects=some-future-policy

In this case, the redirects parameter is present but its value some-future-policy is unknown. The user agent will treat this as allow, and the redirect will be allowed. This allows us the possibility of introducing additional behaviors in the future without breaking existing sites.

5.6. WebRTC

By default, Connection Allowlists block all WebRTC connections. This is a conservative posture intended to mitigate the risk of data exfiltration through WebRTC’s unique networking characteristics, which can be difficult to constrain via URL patterns alone.

The webrtc parameter allows developers to control this behavior.

If set to block (the default), any attempt to establish a WebRTC connection will be blocked.

If set to allow, WebRTC connections will be allowed.

Consider a document with the following header:

Connection-Allowlist: ("https://api.example")

Any attempt to establish a WebRTC connection will be blocked, as WebRTC is blocked by default.

If the header is instead:

Connection-Allowlist: ("https://api.example"); webrtc=allow

WebRTC connections will be allowed.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CSP]
Mike West; Antonio Sartori. Content Security Policy Level 3. URL: https://w3c.github.io/webappsec-csp/
[DOCUMENT-POLICY]
Document Policy. Draft Community Group Report. URL: https://wicg.github.io/document-policy/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[REPORTING]
Douglas Creager; Ian Clelland; Mike West. Reporting API. URL: https://w3c.github.io/reporting/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[RFC9651]
M. Nottingham; P-H. Kamp. Structured Field Values for HTTP. September 2024. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc9651
[SERVICE-WORKERS]
Monica CHINTALA; Yoshisato Yanagisawa. Service Workers Nightly. URL: https://w3c.github.io/ServiceWorker/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[URLPATTERN]
Ben Kelly; Jeremy Roman; 宍戸俊哉 (Shunya Shishido). URL Pattern Standard. Living Standard. URL: https://urlpattern.spec.whatwg.org/
[WEB-BLUETOOTH]
Jeffrey Yasskin. Web Bluetooth. URL: https://webbluetoothcg.github.io/web-bluetooth/
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/
[WEBRTC]
Cullen Jennings; et al. WebRTC: Real-Time Communication in Browsers. URL: https://w3c.github.io/webrtc-pc/
[XHR]
Anne van Kesteren. XMLHttpRequest Standard. Living Standard. URL: https://xhr.spec.whatwg.org/

Informative References

[CSP-EMBEDDED-ENFORCEMENT]
Mike West. Content Security Policy: Embedded Enforcement. URL: https://w3c.github.io/webappsec-cspee/
