1. Introduction
Developers wish to have control over the resources loaded into their pages' contexts and the endpoints to which their pages can make requests. This control is necessary for several purposes, including limiting the ways in which users' data can flow through the user agent (mitigating exfiltration attacks) and ensuring control over a site’s architecture and dependencies.
Content Security Policy addresses some of this need, but does so in a way that is more granular than necessary for the most critical use cases, and with a syntax and grammar that’s complicated by the other protections CSP is used to deploy. [CSP]
`Connection-Allowlist` steps back from CSP, and focuses on the single use case of controlling
the explicit requests a page may initiate through Fetch and other web platform APIs (WebRTC, Web
Transport, FedCM, Web Payments, DNS Prefetch, etc.) in a way that aims to be straightforward and
comprehensive.
NOTE: '\' line wrapping per RFC 8792
Connection-Allowlist: (response-origin "https://cdn.example" "https://*.example.:tld" \
"https://api.example:*"); report-to=ReportingAPIEndpoint
This header, delivered along with a document to which a user has navigated, would restrict that
document to allow requests and connections only to those endpoints which matched the URL patterns
[URLPATTERN] specified in the list: the origin from which the document was delivered,
https://cdn.example, any subdomain of any host whose penultimate DNS label is example,
https://api.example on any port, and so on.
Attempts to connect to endpoints that don’t match the allowlist will be blocked, and reported via
a Reporting API [REPORTING] endpoint specified in the report-to parameter (and defined
through a separate Reporting-Endpoints header).
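As a rough illustration of the header's shape, a server could assemble the value along the following lines. This is only a sketch: `build_connection_allowlist` is a hypothetical helper, not part of any library, and it emits the compact serialization (no space after `;`), which parses the same way as the wrapped example above.

```python
def build_connection_allowlist(patterns, include_response_origin=True, report_to=None):
    """Serialize an allowlist as a structured-field inner list (hypothetical helper)."""
    members = []
    if include_response_origin:
        # Tokens are emitted without quotes.
        members.append("response-origin")
    for pattern in patterns:
        # Structured-field strings are limited to printable ASCII.
        if not all(0x20 <= ord(ch) <= 0x7E for ch in pattern):
            raise ValueError(f"pattern is not printable ASCII: {pattern!r}")
        # Backslashes and double quotes must be escaped inside a string.
        escaped = pattern.replace("\\", "\\\\").replace('"', '\\"')
        members.append(f'"{escaped}"')
    value = "(" + " ".join(members) + ")"
    if report_to is not None:
        # The endpoint name is a token, so it is not quoted either.
        value += f";report-to={report_to}"
    return value


header_value = build_connection_allowlist(
    ["https://cdn.example", "https://*.example.:tld", "https://api.example:*"],
    report_to="ReportingAPIEndpoint",
)
print("Connection-Allowlist:", header_value)
```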
1.1. Threat Model
This proposal is intentionally small, targeting a specific but useful niche of client-side attacks and/or misconfigurations:
-
Policies for documents and workers will be asserted by servers as HTTP response headers. This means that attackers who can manipulate response headers will remain out of scope.
-
A document’s (or worker’s) asserted policy governs only requests initiated by _that_ context. If a framed document asserts a distinct policy, so be it, with the caveat that this policy will apply to contexts created via local schemes (data:, about:, etc.), similar to other components of a context’s policy container, which are handled by HTML when creating new document/worker contexts.
-
The connection and/or request are the threat the proposal aims to defend against. To be effective as an exfiltration defense, we must block connections before they’re made.
-
There are a plethora of side-channels available as unintended effects of otherwise-excellent web platform APIs. This proposal does not attempt to address them, focusing instead solely on those requests or connections explicitly initiated by the user agent on a page’s behalf. This certainly includes the clear cases of fetch() and XMLHttpRequest, along with resource requests generally. It also includes network connections established through channels that are less explicitly "requests": manifests fetched through the Web Install API, connections to TURN/STUN servers via WebRTC, Web Transport channels, navigations, and so on. These are all in scope, while more esoteric channels like memory or CPU consumption, socket exhaustion, and XSLeaks in general are not.
-
This proposal addresses only communication channels. It does not aim to prevent (or even substantially mitigate) threats like content injection or cross-site scripting. It can only constrain the impact of such an attack _on those specific pages_ where the policy is in place, and should be considered only as one layer in a page’s defenses; it won’t be sufficient in itself.
-
It’s tempting to attempt to defend against a subset of server-side threats, like open redirects. It’s possible that we could do so in a proposal like this one, similar conceptually to what CSP does today. That said, CSP’s behavior around redirects is both a leak in itself, and has garnered inconsistently favorable feedback from developers and researchers alike. For simplicity’s sake, this proposal will generally treat redirect responses as match failures. That is, we’ll start with a draconian policy which will block any redirect response. It’s quite probable that we’ll shift this one way or another as use cases crystallize.
1.2. Overlap with Content Security Policy
This proposal has a lot in common with Content Security Policy’s approach to restrictions upon resource usage within a given context, fetch directives in particular. Still, it seems reasonable to explore for a few reasons:
-
CSP’s model is too granular: Developers who wish to mitigate the risk that data flows out of a sensitive context require a protection that exhaustively covers the possible ways in which requests can be made or connections established. CSP’s categorization of requests into types which can be controlled in isolation is the wrong way to approach this problem, as data leaking through a request for a web font is just as bad as data leaking through a request for an image or a script. Distinguishing these request types complicates the process of designing a reasonable defense with questions that are simply irrelevant.
-
CSP’s syntax is not granular enough: The host-source grammar CSP supports leads to truly verbose headers being delivered with responses. A distinct policy provides the opportunity to shift to the URLPattern syntax, which will resolve some complaints folks have raised about CSP’s approach by providing a more modern, malleable, and standardized matching syntax.
-
CSP’s coverage is incomplete: While CSP does a good job covering HTTP requests which run through Fetch, it does not exhaustively cover the myriad ways in which web platform APIs allow connections to be established. DNS prefetch and WebRTC are good examples to start with, but there are many others which have struggled with exactly how they fit into CSP’s threat model. By creating a new policy with a narrow focus and explicit promise to developers, these discussions will have a defensible answer and a clear mandate to specification authors.
2. Connection Allowlists
A Connection Allowlist represents the set of URL patterns to which a given context is allowed to connect. It is a struct with the following items:
-
allowlist, which is a list of URL patterns. It is an empty list unless otherwise specified.
-
reporting endpoint, which is either null or a Reporting API endpoint. It is null unless otherwise specified.
-
disposition, which is either enforce or report.
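For readers who think in code, the struct could be modeled roughly as follows. The class and field names are illustrative only, and patterns are kept as plain strings here rather than as real URL pattern objects.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Disposition(Enum):
    ENFORCE = "enforce"
    REPORT = "report"


@dataclass
class ConnectionAllowlist:
    # The disposition has no default in the struct, so it is required here.
    disposition: Disposition
    # "It is an empty list unless otherwise specified."
    allowlist: List[str] = field(default_factory=list)
    # "It is null unless otherwise specified."
    reporting_endpoint: Optional[str] = None


policy = ConnectionAllowlist(disposition=Disposition.ENFORCE)
```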
3. Connection Allowlist Headers
The Connection-Allowlist response header contains a list of serialized URL pattern strings that define the set of endpoints to which a context is allowed to connect. This allowlist is enforced for a given context, blocking outgoing connections that don’t match the asserted patterns. The Connection-Allowlist-Report-Only response header is a report-only variant, parsed in the same way, but only sending violation reports without blocking outgoing connections.
These Connection Allowlist headers are structured headers whose values are a list of inner lists. Servers may deliver a list with an arbitrary number of items, but only the first will be used. Any additional items in the list will be ignored.
The inner list can contain either URL Patterns serialized as
strings, or the token
response-origin which represents a pattern matching
the response’s URL’s origin. Unexpected values will be ignored.
The inner list may have arbitrary parameters. The
report-to parameter’s value will be parsed as a
token representing a Reporting API endpoint [REPORTING]. All other
parameters will be ignored.
3.1. Parsing
To parse a response’s Connection Allowlists given a response response:
- Let allowlists be an empty list.
- Let header be the result of getting a structured field value named `Connection-Allowlist` as a list from response’s header list.
- Parse a Connection Allowlist header given header, response’s URL, and enforce. If the result is not null, insert it into allowlists.
- Let header be the result of getting a structured field value named `Connection-Allowlist-Report-Only` as a list from response’s header list.
- Parse a Connection Allowlist header given header, response’s URL, and report. If the result is not null, insert it into allowlists.
- Return allowlists.
To parse a Connection Allowlist header given a list list, a URL response-url, and a disposition disposition:
- If list’s size is 0, return null.
- If list[0] is not an inner list, return null.
- Let allowlist be a Connection Allowlist whose disposition is disposition.
- For each item in list[0]:
  - Let serialized pattern be null.
  - If item is the token response-origin, set serialized pattern to the ASCII serialization of response-url’s origin.
  - If item is a string, set serialized pattern to item.
  - If serialized pattern is null, continue.
  - Let URL pattern be the result of executing build a URL pattern from an HTTP structured field value given serialized pattern with null as the base URL. If this step throws an error, continue.
  - Append URL pattern to allowlist’s allowlist.
- For each key → value in list[0]'s parameters:
  - If key is report-to and value is a token, set allowlist’s reporting endpoint to value.
- Return allowlist.
Note: We’re skipping over any invalid input in the parsing algorithm. We could plausibly be more draconian in our parsing, but that would likely limit our future flexibility.
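The parsing steps above can be sketched in Python. Several things here are assumptions made for illustration: the header is taken to be already parsed into a structured-field list (using the hypothetical `Token` and `InnerList` stand-ins below rather than a real structured-fields library), patterns are stored as strings instead of being built into URL patterns, each surviving pattern is assumed to be appended to the allowlist, and `origin_of` only approximates origin serialization for http(s) URLs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Token:            # stand-in for a structured-field token
    value: str


@dataclass
class InnerList:        # stand-in for a structured-field inner list
    items: List[Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ConnectionAllowlist:
    disposition: str    # "enforce" or "report"
    allowlist: List[str] = field(default_factory=list)
    reporting_endpoint: Optional[str] = None


def origin_of(url: str) -> str:
    """Approximate ASCII origin serialization (default ports omitted)."""
    parts = urlsplit(url)
    default = {"http": 80, "https": 443}.get(parts.scheme)
    port = "" if parts.port in (None, default) else f":{parts.port}"
    return f"{parts.scheme}://{parts.hostname or ''}{port}"


def parse_connection_allowlist_header(sf_list, response_url, disposition):
    if len(sf_list) == 0 or not isinstance(sf_list[0], InnerList):
        return None
    first = sf_list[0]  # only the first list member is used
    result = ConnectionAllowlist(disposition=disposition)
    for item in first.items:
        serialized = None
        if item == Token("response-origin"):
            serialized = origin_of(response_url)
        elif isinstance(item, str):
            serialized = item
        if serialized is None:
            continue    # unexpected values are skipped, not fatal
        # A real implementation would build a URL pattern here and skip
        # the item if that throws; this sketch keeps the string as-is.
        result.allowlist.append(serialized)
    for key, value in first.params.items():
        if key == "report-to" and isinstance(value, Token):
            result.reporting_endpoint = value.value
    return result


def parse_response_allowlists(parsed_headers, response_url):
    allowlists = []
    for name, disposition in (("connection-allowlist", "enforce"),
                              ("connection-allowlist-report-only", "report")):
        sf_list = parsed_headers.get(name)
        if sf_list is None:
            continue
        parsed = parse_connection_allowlist_header(sf_list, response_url, disposition)
        if parsed is not None:
            allowlists.append(parsed)
    return allowlists


example = [
    InnerList([Token("response-origin"), "https://cdn.example", 42, Token("unknown")],
              {"report-to": Token("ReportingAPIEndpoint"), "other": 1}),
    InnerList(["https://ignored.example"]),  # additional members are ignored
]
result = parse_response_allowlists({"connection-allowlist": example},
                                   "https://site.example/page")
```

In this example the integer and the unknown token are dropped, the second inner list is ignored, and the single resulting allowlist holds the response's origin plus `https://cdn.example`.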
3.2. Matching
Depending on the type of connection being established, we may have a request to work with, we
may only have a URL, or we may have even less. dns-prefetch,
for example, can only match against a host. The algorithms below spell out how connection allowlist
checks work in these scenarios:
To determine whether a URL url matches a Connection Allowlist connection allowlist:
- For each pattern in connection allowlist’s allowlist:
  - If URL pattern matching given pattern and url does not return null, return success.
- Return failure.
To determine whether a host host host-matches a Connection Allowlist connection allowlist:
- For each pattern in connection allowlist’s allowlist:
  - Let input be a new URLPatternInit dictionary whose hostname is set to pattern’s hostname component.
  - Let host-only pattern be the result of creating a URL pattern given input, null as the base URL, and an empty map as the options.
  - Let synthetic url be the result of parsing the concatenation of "https://" and host as a URL.
  - If URL pattern matching given host-only pattern and synthetic url does not return null, return success.
- Return failure.
Note: By creating a new pattern with only the hostname component and synthesizing a URL for host, we’re able to return a match if _any_ pattern in the allowlist could allow a request to that host using any protocol, on any port, with any path, and so on.
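To make the difference between URL matching and host-only matching concrete, here is a deliberately simplified Python sketch. It is not URL Pattern matching: it assumes every pattern is an origin-like string of the form `scheme://host[:port]`, supports only `*` wildcards in the hostname and `*` as the port, and does not handle named groups such as `:tld`, paths, or IPv6 literals. All function names are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": "80", "https": "443", "ws": "80", "wss": "443"}


def split_pattern(pattern):
    """Split 'scheme://host[:port]' into (scheme, host, port)."""
    scheme, _, rest = pattern.partition("://")
    rest = rest.split("/", 1)[0]          # ignore any path in this sketch
    host, _, port = rest.partition(":")
    return scheme, host.lower(), port


def url_matches(url, allowlist):
    parts = urlsplit(url)
    if parts.port is not None:
        url_port = str(parts.port)
    else:
        url_port = DEFAULT_PORTS.get(parts.scheme, "")
    for pattern in allowlist:
        scheme, host, port = split_pattern(pattern)
        # A pattern without a port is treated as the scheme's default port.
        pattern_port = port or DEFAULT_PORTS.get(scheme, "")
        if (parts.scheme == scheme
                and fnmatchcase(parts.hostname or "", host)
                and pattern_port in ("*", url_port)):
            return True
    return False


def host_matches(host, allowlist):
    # Only the hostname component of each pattern is consulted, so a match
    # means some pattern could allow this host on some scheme and port.
    return any(fnmatchcase(host.lower(), split_pattern(p)[1]) for p in allowlist)


allowlist = ["https://cdn.example", "https://*.images.example", "https://api.example:*"]
```

With this allowlist, `url_matches("https://cdn.example:8443/lib.js", allowlist)` fails because the first pattern implies the default port, while `host_matches("cdn.example", allowlist)` succeeds because host-only checks disregard scheme and port.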
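To determine whether a URL should be blocked by Connection Allowlist, given a URL url, an environment environment, and a list of Connection Allowlists connection allowlists:
- For each connection allowlist in connection allowlists:
  - If url matches connection allowlist, continue.
  - Report a violation given url, environment, and connection allowlist.
  - If connection allowlist’s disposition is enforce, return blocked.
- Return allowed.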
To determine whether a request should be blocked by Connection Allowlists, given a request request:
- If request’s URL list’s size is greater than 1, return blocked.
  See the open question below in § 5.3 Redirects.
- Return the result of executing should url be blocked by Connection Allowlist given request’s URL, request’s client, and request’s policy container’s connection allowlists.
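To determine whether a host should be blocked by Connection Allowlists, given a host host, an environment environment, and a list of Connection Allowlists connection allowlists:
- For each connection allowlist in connection allowlists:
  - If host host-matches connection allowlist, continue.
  - Report a violation given host, environment, and connection allowlist.
  - If connection allowlist’s disposition is enforce, return blocked.
- Return allowed.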
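The interplay between enforced and report-only allowlists is easier to see in code. The sketch below assumes the URL check mirrors the host check (allowlists that match are skipped, the rest are reported, and only enforcing ones block). The `matches` and `report` callables are placeholders supplied by the caller, and the prefix-based matcher in the example is a stand-in, not URL Pattern matching.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConnectionAllowlist:
    disposition: str    # "enforce" or "report"
    allowlist: List[str] = field(default_factory=list)
    reporting_endpoint: Optional[str] = None


def should_url_be_blocked(url, allowlists, matches, report):
    for policy in allowlists:
        if matches(url, policy):
            continue
        report(url, policy)               # both dispositions report
        if policy.disposition == "enforce":
            return "blocked"              # only enforcing policies block
    return "allowed"


def should_request_be_blocked(url_list, allowlists, matches, report):
    # Any redirect (more than one URL in the list) is blocked outright.
    if len(url_list) > 1:
        return "blocked"
    return should_url_be_blocked(url_list[0], allowlists, matches, report)


def prefix_matches(url, policy):
    """Stand-in matcher: the URL must sit under one of the listed origins."""
    return any(url == p or url.startswith(p + "/") for p in policy.allowlist)


reports = []


def record(url, policy):
    reports.append((url, policy.disposition))


report_only = ConnectionAllowlist("report", ["https://api.example"])
enforced = ConnectionAllowlist("enforce", ["https://cdn.example"])
policies = [report_only, enforced]

first = should_url_be_blocked("https://cdn.example/x.js", policies, prefix_matches, record)
second = should_url_be_blocked("https://api.example/v1", policies, prefix_matches, record)
third = should_request_be_blocked(
    ["https://cdn.example/a", "https://cdn.example/b"], policies, prefix_matches, record)
```

Here the first URL is allowed but still produces a report against the report-only allowlist, the second is blocked by the enforced allowlist, and the third is blocked as a redirect before any matching takes place.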
3.3. Reporting
Like other policy mechanisms, Connection Allowlists will report each violation to a Reporting API endpoint specified in the allowlist headers. Violations are represented by the following dictionary type:
enum ConnectionAllowlistDisposition { "enforce", "report" };

dictionary ConnectionAllowlistViolationReport : ReportBody {
  USVString url;
  USVString connection;
  sequence<DOMString> allowlist;
  ConnectionAllowlistDisposition disposition;
};
ConnectionAllowlistViolationReport’s connection is the
serialized URL of the connection which violated the allowlist.
ConnectionAllowlistViolationReport’s allowlist is the
allowlist which was violated.
ConnectionAllowlistViolationReport’s disposition
is the allowlist’s disposition.
To report a violation given resource, environment, and a Connection Allowlist allowlist:
- If allowlist’s reporting endpoint is null, return.
- Let violation be a new ConnectionAllowlistViolationReport, initialized as follows:
  connection: resource URL, stripped for use in reports.
  Note: Because we block redirects, we don’t need to worry about url vs current url. When we go back on that decision (see § 5.3 Redirects), we’ll want to ensure we use url to avoid leaking more information than necessary about redirect targets.
  allowlist: allowlist’s allowlist
  disposition: allowlist’s disposition
- Generate and queue a report given environment as the context, "connection-allowlist" as the type, allowlist’s reporting endpoint as the destination, and violation as the data.
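A rough Python sketch of assembling such a report follows. It assumes "stripped for use in reports" means what CSP's algorithm of that name does (non-HTTP(S) URLs reduced to their scheme; fragment and credentials removed). The envelope keys `type`, `destination`, and `body` are illustrative, and the dictionary's `url` member is left out because its initialization is not spelled out above.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_url_for_reports(url):
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return parts.scheme                 # only the scheme is reported
    host = parts.hostname or ""
    if ":" in host:                         # re-bracket IPv6 literals
        host = f"[{host}]"
    # Rebuild the authority without username or password.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # An empty final component drops the fragment.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))


def build_violation_report(resource_url, patterns, disposition, reporting_endpoint):
    if reporting_endpoint is None:
        return None                         # nothing to report to
    return {
        "type": "connection-allowlist",
        "destination": reporting_endpoint,
        "body": {
            "connection": strip_url_for_reports(resource_url),
            "allowlist": list(patterns),
            "disposition": disposition,
        },
    }


report = build_violation_report(
    "https://user:secret@evil.example:8443/collect?d=1#frag",
    ["https://cdn.example"],
    "enforce",
    "ReportingAPIEndpoint",
)
```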
4. Monkey-Patches
4.1. Integration with Fetch
We’ll handle requests by adding a blocking check in Fetch § 4.1 Main fetch alongside other checks that serve the same purpose:
-
If should request be blocked due to a bad port, should fetching request be blocked as mixed content, should request be blocked by Content Security Policy, should request be blocked by Connection Allowlists, or should request be blocked by Integrity Policy Policy returns blocked, then set response to a network error.
Fetch also defines algorithms at a lower level which are used to establish connections for APIs which aren’t based on requests. We’ll hook into resolve an origin and obtain a connection to handle things like DNS prefetch, Web Transport, etc:
- If should host be blocked by Connection Allowlists returns blocked when executed upon origin’s host, environment, and allowlists, then return failure.
- If should url be blocked by Connection Allowlist returns blocked when executed upon url, environment, and allowlists, then return failure.
The changes to Fetch will require us to pass additional information into low-level algorithms' callsites to identify the allowlist which ought to be used and the context to be used for reporting. It might be better to instead ask those callsites to perform the checks themselves. My feeling is that we’ll be more successful by centralizing the logic, but it might be simpler to take a piecemeal approach.
4.2. Integration with HTML
To integrate the above into HTML, we’ll add a new connection allowlists item to the policy container struct, containing a list of connection allowlists. This will be populated by adding a step to the create a policy container from a fetch response algorithm:
-
Parse Integrity-Policy headers with response and result.
- Set result’s connection allowlists to the result of parsing a response’s Connection Allowlists given response.
-
Return result.
4.3. Integration with WebRTC
I need to read more, as I have no idea how any of this works from a spec perspective. :)
5. Security Considerations
5.1. Same-Origin Contexts
The threat model described in § 1.1 Threat Model is intentionally narrow, and developers will need
to carefully consider how to layer the allowlisting mechanism described here into their defenses.
Most saliently, the mechanism is context-specific, not origin-wide. This leaves broad opportunity
for an attacker with scripting access to bypass a context’s allowlist by finding a same-origin
context with lower restrictions. Integration with HTML’s policy container addresses some of
those possibilities, but it’s likely that others will exist: allowlisting the document’s origin
(via response-origin or explicitly), reaching up
through the frame tree, etc.
There are scenarios in which developers can avoid this risk by sandboxing the allowlisted context
away from its normal origin via sandbox attributes or Content Security Policy’s
sandbox directive. In those cases, no document will be same-origin, and the
boundaries will be easier to hold.
It would also be ideal to give developers control over their dependencies' allowlists to some extent. An opt-in mechanism rooted in something like required document policy or [csp-embedded-enforcement] might be helpful to explore. [WICG/connection-allowlists Issue #1]
5.2. postMessage(...)
This proposal concerns itself entirely with network connections, which may surprise developers
who would expect communication via explicit communication channels like
postMessage(message, options), MessageChannel, BroadcastChannel, and so on to
be covered. It could make sense to extend the model to include those as well, as they all fit
into an origin-based model which could be meaningfully compared against the allowlist.
5.3. Redirects
Currently, we specify that any redirected URL fails. This simplifies the initial proposal for discussion and ensures we don’t leak data, but seems unlikely to satisfy developers with real-world deployment needs. I think we have a few realistic options:
-
Apply the allowlist to every hop of a redirect chain. This has the advantage of matching CSP’s behavior that developers are already familiar with. It _is_ a cross-origin data leak insofar as it provides insight about another origin’s decisions, which is unfortunate but perhaps unavoidable (and non-unique).
-
Allow _a specific rule_’s redirect chain to arbitrarily redirect. This narrows the concerns above by forcing developers to annotate the allowlist with their expectations. It might be perfectly acceptable for https://trusted.example/ to redirect users to arbitrary locations, while other endpoints are expected to stay put. Annotating list items should make this kind of distinction possible if necessary (e.g. ("https://trusted.example/";redirection-allowed "https://less-so.example/")).
-
Narrow the above by allowing _a specific rule_ to redirect so long as the targets match the allowlist. This creates less opportunity for unexpected connections than either of the first two options: developers must annotate the specific rules which can redirect, and those redirects remain confined to the allowlist (e.g. ("https://semi-trusted.example/";redirection-allowed=within-allowlist ...)).
We could add more options as well. CSP’s earlier navigate-to proposal distinguished between intermediate
redirects and the final, non-redirect response. You could imagine adding those kinds of options either to
the entire allowlist or individual rules. Feedback here as well would be much appreciated.
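To compare these options side by side, here is a speculative Python sketch that evaluates a redirect chain under each of them. Nothing in it is settled: the `redirect` field mirrors the hypothetical `redirection-allowed` annotations used in the examples above, `every_hop` models the first option as an allowlist-wide switch, and the prefix matcher is a stand-in for URL Pattern matching.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    pattern: str
    redirect: Optional[str] = None   # None, "any", or "within-allowlist"


def matches(url, pattern):
    """Stand-in matcher: the URL must sit under the pattern's origin."""
    base = pattern.rstrip("/")
    return url == base or url.startswith(base + "/")


def first_matching_rule(url, rules):
    return next((rule for rule in rules if matches(url, rule.pattern)), None)


def chain_allowed(chain: List[str], rules: List[Rule], every_hop: bool = False) -> bool:
    first = first_matching_rule(chain[0], rules)
    if first is None:
        return False                     # the initial URL is not allowlisted
    if len(chain) == 1:
        return True                      # no redirect took place
    rest_ok = all(first_matching_rule(hop, rules) is not None for hop in chain[1:])
    if every_hop:
        return rest_ok                   # option 1: check every hop
    if first.redirect == "any":
        return True                      # option 2: the rule may redirect anywhere
    if first.redirect == "within-allowlist":
        return rest_ok                   # option 3: redirect only within the list
    return False                         # current draft: any redirect is blocked


rules = [
    Rule("https://trusted.example/", "any"),
    Rule("https://semi-trusted.example/", "within-allowlist"),
    Rule("https://cdn.example/"),
]
```

Under the current draft, `chain_allowed(["https://cdn.example/a", "https://cdn.example/b"], rules)` is rejected even though both hops are allowlisted; passing `every_hop=True` accepts it.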