Speculation Rules

Draft Community Group Report

This version:
https://wicg.github.io/nav-speculation/speculation-rules.html
Issue Tracking:
GitHub
Inline In Spec
Editor:
(Google)

Abstract

A flexible syntax for defining what outgoing links can be prepared speculatively before navigation.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Speculation rules

1.1. Definitions

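A speculation rule is a struct with the following items: URLs (a list of URLs), requirements (an ordered set of strings), and target browsing context name hint (a string or null).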

The only valid string for requirements to contain is "anonymous-client-ip-when-cross-origin".

A speculation rule set is a struct with the following items: prefetch rules (a list of speculation rules) and prerender rules (a list of speculation rules).

1.2. The script element

Note: This section contains modifications to the corresponding section of [HTML].

To process speculation rules consistently with the existing script types, we make the following changes:

The following algorithms are updated accordingly:

We should consider whether we also want to make this execute even if scripting is disabled.

We should also incorporate the case where a src attribute is set.

We could fire error and load events if we wanted to.

The following steps are added as the script element’s HTML element removing steps, given removedNode and oldParent:
  1. If removedNode’s result is a speculation rule set, then:

    1. Let document be removedNode’s node document.

    2. Remove removedNode’s result from document’s list of speculation rule sets.

    3. Set removedNode’s already started flag to false.

    4. Set removedNode’s result to null.

    5. Consider speculation for document.

    This means that the rule set can be reparsed if the script is reinserted.
The following steps are added as the script element’s children changed steps for an element scriptElement.
  1. If scriptElement’s result is a speculation rule set, then:

    1. Let document be scriptElement’s node document.

    2. Let ruleSet be scriptElement’s result.

    3. Let newResult be the result of parsing speculation rules given scriptElement’s child text content and document’s document base URL.

    4. Set scriptElement’s result to newResult.

    5. Replace ruleSet with newResult in document’s list of speculation rule sets.

    6. Consider speculation for document.

    This means that the rule set is reparsed immediately if inline changes are made.
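The removing steps and children changed steps above can be sketched against a toy model. This is a non-normative illustration: ScriptElementModel, DocumentModel, and the injected parse function are hypothetical names, not part of any API, and a null result stands in for "result is not a speculation rule set". The steps above do not say what happens to the list when reparsing returns null; dropping the entry is an assumption of this sketch.

```typescript
// Hypothetical model of the state touched by the steps above.
interface ScriptElementModel<RuleSet> {
  result: RuleSet | null; // null models "not a speculation rule set"
  alreadyStarted: boolean;
  childTextContent: string;
}

interface DocumentModel<RuleSet> {
  ruleSets: RuleSet[]; // the document's list of speculation rule sets
  baseURL: string;
  considerSpeculation: () => void;
}

// HTML element removing steps for a script element.
function removingSteps<RuleSet>(
  removedNode: ScriptElementModel<RuleSet>,
  document: DocumentModel<RuleSet>
): void {
  const ruleSet = removedNode.result;
  if (ruleSet === null) return;
  const index = document.ruleSets.indexOf(ruleSet);
  if (index !== -1) document.ruleSets.splice(index, 1);
  removedNode.alreadyStarted = false; // allows reparsing on reinsertion
  removedNode.result = null;
  document.considerSpeculation();
}

// Children changed steps for a script element.
function childrenChangedSteps<RuleSet>(
  scriptElement: ScriptElementModel<RuleSet>,
  document: DocumentModel<RuleSet>,
  parse: (text: string, baseURL: string) => RuleSet | null
): void {
  const ruleSet = scriptElement.result;
  if (ruleSet === null) return;
  const newResult = parse(scriptElement.childTextContent, document.baseURL);
  scriptElement.result = newResult;
  const index = document.ruleSets.indexOf(ruleSet);
  if (index !== -1) {
    if (newResult === null) {
      // Assumption: a failed reparse removes the old rule set from the list.
      document.ruleSets.splice(index, 1);
    } else {
      document.ruleSets[index] = newResult;
    }
  }
  document.considerSpeculation();
}
```

In this model, editing the inline text swaps the rule set in place, while removing the element clears it and resets the already started flag so that a later reinsertion can parse again.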

1.3. Prepare the script element

Inside the prepare the script element algorithm we make the following changes:

1.4. Parsing

The general principle here is to allow the existence of directives which are not understood, but not to accept into the rule set a rule which the user agent does not fully understand. This reduces the risk of unintended activity by user agents which are unaware of the most recently added directives, which might limit the scope of a rule.

To parse speculation rules given a string input and a URL baseURL, perform the following steps. They return a speculation rule set or null.
  1. Let parsed be the result of parsing a JSON string to an Infra value given input.

  2. If parsed is not a map, then return null.

  3. Let result be an empty speculation rule set.

  4. If parsed["prefetch"] exists and is a list, then for each prefetchRule of parsed["prefetch"]:

    1. If prefetchRule is not a map, then continue.

    2. Let rule be the result of parsing a speculation rule given prefetchRule and baseURL.

    3. If rule is null, then continue.

    4. If rule’s target browsing context name hint is not null, then continue.

    5. Append rule to result’s prefetch rules.

  5. If parsed["prerender"] exists and is a list, then for each prerenderRule of parsed["prerender"]:

    1. If prerenderRule is not a map, then continue.

    2. Let rule be the result of parsing a speculation rule given prerenderRule and baseURL.

    3. If rule is null, then continue.

    4. Append rule to result’s prerender rules.

  6. Return result.
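The steps above can be sketched in TypeScript as follows. This is a non-normative illustration under stated assumptions: the interface and function names are hypothetical, the rule parser is passed in as a callback so the sketch stays self-contained, and returning null for syntactically invalid JSON is an assumption, since the steps above do not say what happens in that case.

```typescript
// Illustrative shapes only; the field names are not defined by this specification.
interface SpeculationRule {
  urls: string[];
  requirements: string[];
  targetHint: string | null;
}

interface SpeculationRuleSet {
  prefetchRules: SpeculationRule[];
  prerenderRules: SpeculationRule[];
}

// Hypothetical stand-in for the "parse a speculation rule" algorithm.
type RuleParser = (
  input: Record<string, unknown>,
  baseURL: string
) => SpeculationRule | null;

// An Infra map corresponds to a JSON object: not null and not an array.
function isMap(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseSpeculationRules(
  input: string,
  baseURL: string,
  parseRule: RuleParser
): SpeculationRuleSet | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(input); // Step 1
  } catch {
    return null; // Assumption: invalid JSON yields null.
  }
  if (!isMap(parsed)) return null; // Step 2

  const result: SpeculationRuleSet = { prefetchRules: [], prerenderRules: [] }; // Step 3

  const prefetch = parsed["prefetch"];
  if (Array.isArray(prefetch)) { // Step 4
    for (const prefetchRule of prefetch) {
      if (!isMap(prefetchRule)) continue;
      const rule = parseRule(prefetchRule, baseURL);
      if (rule === null) continue;
      if (rule.targetHint !== null) continue; // A target hint is not accepted on prefetch rules.
      result.prefetchRules.push(rule);
    }
  }

  const prerender = parsed["prerender"];
  if (Array.isArray(prerender)) { // Step 5
    for (const prerenderRule of prerender) {
      if (!isMap(prerenderRule)) continue;
      const rule = parseRule(prerenderRule, baseURL);
      if (rule === null) continue;
      result.prerenderRules.push(rule);
    }
  }

  return result; // Step 6
}
```

Note that an individual rule that fails to parse is skipped rather than invalidating the whole rule set; only a non-map top-level value rejects the input outright.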

To parse a speculation rule given a map input and a URL baseURL, perform the following steps. They return a speculation rule or null.
  1. If input has any key other than "source", "urls", "requires", and "target_hint", then return null.

  2. If input["source"] does not exist or is not the string "list", then return null.

  3. Let urls be an empty list.

  4. If input["urls"] does not exist, is not a list, or has any element which is not a string, then return null.

  5. For each urlString of input["urls"]:

    1. Let parsedURL be the result of parsing urlString with baseURL.

    2. If parsedURL is failure, then continue.

    3. If parsedURL’s scheme is not an HTTP(S) scheme, then continue.

    4. Append parsedURL to urls.

  6. Let requirements be an empty ordered set.

  7. If input["requires"] exists, but is not a list, then return null.

  8. For each requirement of input["requires"]:

    1. If requirement is not the string "anonymous-client-ip-when-cross-origin", then return null.

    2. Append requirement to requirements.

  9. Let targetHint be null.

  10. If input["target_hint"] exists:

    1. If input["target_hint"] is not a valid browsing context name or keyword, then return null.

    2. Set targetHint to input["target_hint"].

  11. Return a speculation rule with URLs urls, requirements requirements, and target browsing context name hint targetHint.
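The rule-parsing steps above can be sketched in TypeScript as follows. This is a non-normative illustration: the names are hypothetical, the platform URL constructor stands in for the URL parser, URLs are stored as serialized strings, and isValidTargetHint is a simplified stand-in for HTML's "valid browsing context name or keyword" check.

```typescript
// Illustrative shape only; the field names are not defined by this specification.
interface SpeculationRule {
  urls: string[];
  requirements: string[];
  targetHint: string | null;
}

const ALLOWED_KEYS = ["source", "urls", "requires", "target_hint"];
const KNOWN_REQUIREMENT = "anonymous-client-ip-when-cross-origin";

// Simplified: a non-empty string that either does not start with "_" or is
// one of the keywords _blank, _self, _parent, _top (ASCII case-insensitive).
function isValidTargetHint(value: unknown): value is string {
  if (typeof value !== "string" || value.length === 0) return false;
  if (value.charAt(0) !== "_") return true;
  return ["_blank", "_self", "_parent", "_top"].indexOf(value.toLowerCase()) !== -1;
}

function parseSpeculationRule(
  input: Record<string, unknown>,
  baseURL: string
): SpeculationRule | null {
  // Step 1: any unrecognized key rejects the whole rule.
  for (const key of Object.keys(input)) {
    if (ALLOWED_KEYS.indexOf(key) === -1) return null;
  }
  // Step 2
  if (input["source"] !== "list") return null;

  // Steps 3-4: "urls" must be a list consisting only of strings.
  const rawUrls = input["urls"];
  if (!Array.isArray(rawUrls)) return null;
  for (const item of rawUrls) {
    if (typeof item !== "string") return null;
  }

  // Step 5: unparseable and non-HTTP(S) URLs are skipped, not fatal.
  const urls: string[] = [];
  for (const urlString of rawUrls as string[]) {
    let parsedURL: URL;
    try {
      parsedURL = new URL(urlString, baseURL);
    } catch {
      continue;
    }
    if (parsedURL.protocol !== "http:" && parsedURL.protocol !== "https:") continue;
    urls.push(parsedURL.href);
  }

  // Steps 6-8: an unknown requirement rejects the whole rule.
  const requirements: string[] = [];
  if ("requires" in input) {
    const rawRequires = input["requires"];
    if (!Array.isArray(rawRequires)) return null;
    for (const requirement of rawRequires) {
      if (requirement !== KNOWN_REQUIREMENT) return null;
      if (requirements.indexOf(KNOWN_REQUIREMENT) === -1) requirements.push(KNOWN_REQUIREMENT);
    }
  }

  // Steps 9-10
  let targetHint: string | null = null;
  if ("target_hint" in input) {
    const rawHint = input["target_hint"];
    if (!isValidTargetHint(rawHint)) return null;
    targetHint = rawHint;
  }

  // Step 11
  return { urls: urls, requirements: requirements, targetHint: targetHint };
}
```

The asymmetry is deliberate: an individual bad URL is dropped, whereas an unrecognized key or requirement discards the rule, in line with the principle that a user agent does not act on a rule it does not fully understand.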

1.5. Processing model

A document has a list of speculation rule sets, which is an initially empty list.

Periodically, for any document document, the user agent may queue a global task on the DOM manipulation task source with document’s relevant global object to consider speculation for document.

The user agent will likely do this when resources are idle and available, or when circumstances that informed a previous decision about whether to start a speculation could have changed.

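A prefetch candidate is a struct with the following items: URL (a URL) and anonymization policy (a cross-origin prefetch IP anonymization policy or null).

A prerender candidate is a struct with the following items: URL (a URL) and target browsing context name hint (a string or null).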

A prefetch candidate prefetchCandidate continues a prefetch record prefetchRecord if the following are all true:
To consider speculation for a document document:
  1. Await a stable state. Steps in the synchronous section are marked with ⌛.

  2. ⌛ If document is not fully active, then return.

    It’s likely that we should also handle prerendered and back-forward cached documents.

  3. ⌛ Let prefetchCandidates be an empty list.

  4. ⌛ Let prerenderCandidates be an empty list.

  5. ⌛ For each ruleSet of document’s list of speculation rule sets:

    1. For each rule of ruleSet’s prefetch rules:

      1. ⌛ Let anonymizationPolicy be null.

      2. ⌛ If rule’s requirements contains "anonymous-client-ip-when-cross-origin", set anonymizationPolicy to a cross-origin prefetch IP anonymization policy whose origin is document’s origin.

      3. For each url of rule’s URLs:

        1. Append a prefetch candidate with URL url and anonymization policy anonymizationPolicy to prefetchCandidates.

    2. For each rule of ruleSet’s prerender rules:

      1. For each url of rule’s URLs:

        1. ⌛ Let prerenderCandidate be a new prerender candidate whose URL is url and target browsing context name hint is rule’s target browsing context name hint.

        2. Append prerenderCandidate to prerenderCandidates.

  6. For each prefetchRecord of document’s prefetch records:

    1. ⌛ If prefetchRecord’s label is not "speculation-rules", then continue.

    2. Assert: prefetchRecord’s state is not "canceled".

    3. ⌛ If no element of prefetchCandidates continues prefetchRecord, then cancel and discard prefetchRecord given document.

  7. End the synchronous section, continuing the remaining steps in parallel.

  8. For each prefetchCandidate of prefetchCandidates:

    1. The user agent may run the following steps:

      1. Let prefetchRecord be a new prefetch record whose URL is prefetchCandidate’s URL, anonymization policy is prefetchCandidate’s anonymization policy, and label is "speculation-rules".

      2. Prefetch given document and prefetchRecord.

  9. For each prerenderCandidate of prerenderCandidates:

    1. The user agent may create a prerendering browsing context given prerenderCandidate’s URL and document.

      The user agent can use prerenderCandidate’s target browsing context name hint as a hint to its implementation of the create a prerendering browsing context algorithm. This hint indicates that the web developer expects the eventual activation of the created browsing context to be in place of a particular predecessor browsing context: the one that would be chosen by invoking the rules for choosing a browsing context given prerenderCandidate’s target browsing context name hint and document’s browsing context.

      This is just a hint. The target browsing context name hint has no normative implications after being parsed. It is still perfectly fine to activate in place of a different predecessor browsing context that was not hinted at.
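The candidate-gathering part of the algorithm above (steps 3 to 5) can be sketched in TypeScript as follows. This is a non-normative illustration with hypothetical names; origins and URLs are modeled as plain strings. Canceling stale prefetch records and starting the actual prefetches or prerenders are left out, since they depend on definitions outside this sketch.

```typescript
// Illustrative shapes only; the field names are not defined by this specification.
interface Rule {
  urls: string[];
  requirements: string[];
  targetHint: string | null;
}

interface RuleSet {
  prefetchRules: Rule[];
  prerenderRules: Rule[];
}

interface AnonymizationPolicy {
  origin: string; // models a cross-origin prefetch IP anonymization policy
}

interface PrefetchCandidate {
  url: string;
  anonymizationPolicy: AnonymizationPolicy | null;
}

interface PrerenderCandidate {
  url: string;
  targetHint: string | null;
}

interface Candidates {
  prefetch: PrefetchCandidate[];
  prerender: PrerenderCandidate[];
}

const ANONYMOUS_IP_REQUIREMENT = "anonymous-client-ip-when-cross-origin";

function collectCandidates(ruleSets: RuleSet[], documentOrigin: string): Candidates {
  const prefetch: PrefetchCandidate[] = [];
  const prerender: PrerenderCandidate[] = [];

  for (const ruleSet of ruleSets) {
    for (const rule of ruleSet.prefetchRules) {
      // The policy is decided once per rule and shared by all of its URLs.
      const anonymizationPolicy: AnonymizationPolicy | null =
        rule.requirements.indexOf(ANONYMOUS_IP_REQUIREMENT) !== -1
          ? { origin: documentOrigin }
          : null;
      for (const url of rule.urls) {
        prefetch.push({ url: url, anonymizationPolicy: anonymizationPolicy });
      }
    }
    for (const rule of ruleSet.prerenderRules) {
      for (const url of rule.urls) {
        prerender.push({ url: url, targetHint: rule.targetHint });
      }
    }
  }

  return { prefetch: prefetch, prerender: prerender };
}
```

Every candidate is only a permission: the user agent may act on any subset of the returned lists, or on none of them.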

We should also cancel speculated prerenders.

2. Security considerations

2.1. Cross-site request forgery

This specification allows documents to cause HTTP requests to be issued.

When a supported action acts on a URL which is same origin with the document, this does not constitute a risk of cross-site request forgery, since the request uses only the credentials available to the document.

Otherwise, requests are always issued without using any previously existing credentials. This limits the ambient authority available to any potentially forged request, and such requests can already be made through [FETCH], a subresource or frame, or various other means. Site operators are therefore already well-advised to use CSRF tokens or other mitigations for this threat.

2.2. Cross-site scripting

This specification causes activity in response to content found in the document, so it is worth considering the options open to an attacker able to inject unescaped HTML.

Such an attacker is otherwise able to inject JavaScript, frames or other elements. The activity possible with this specification (requesting fetches, etc.) is generally less dangerous than arbitrary script execution, and comparable to other elements. The same mitigations available to other features also apply here. In particular, the [CSP] script-src directive applies to the parsing of the speculation rules and the prefetch-src directive applies to prefetch requests arising from the rules.

2.3. Type confusion

In the case of speculation rules in an inline <script>, an application which erroneously parsed speculation rules as a JavaScript script (though user agents are instructed not to execute scripts whose "type" is unrecognized) would either interpret it as the empty block {} or produce a syntax error, since the U+003A COLON (:) after the first key is invalid JavaScript. In neither case would such an application execute harmful behavior.

Since the parsing behavior of the <script> element has long been part of HTML, any modern HTML parser would not construct any non-text children of the element. There is thus a low risk of other text hidden inside a <script> element with type="speculationrules" which is parsed as part of the script content by compliant HTML implementations but as HTML tags by others.

Authors should, however, still escape any potentially attacker-controlled content inserted into speculation rules. In particular, it may be necessary to escape JSON syntax and, if the speculation rules are in an inline <script> element, any closing </script> tag. [CSP] is a useful additional mitigation for vulnerabilities of this type.
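As a non-normative illustration of the escaping advice above, the following TypeScript sketch serializes a list of URLs into an inline speculation rules script. The function names are hypothetical. It relies on two assumptions: JSON.stringify handles the JSON-level escaping, and replacing every "<" with the JSON escape \u003c is one common way to keep a closing </script> tag out of the element's text. Other escaping strategies are possible.

```typescript
// Serializes prefetch rules so the output is safe to embed in an inline <script>.
function serializeSpeculationRules(urls: string[]): string {
  // JSON.stringify escapes quotes, backslashes, and control characters.
  const json = JSON.stringify({ prefetch: [{ source: "list", urls: urls }] });
  // "<" may only occur inside JSON strings, where \u003c is an equivalent
  // escape; after this step the text cannot contain "</script>" or "<!--".
  return json.replace(/</g, "\\u003c");
}

// Wraps the serialized rules in an inline script element.
function renderSpeculationRulesScript(urls: string[]): string {
  return (
    '<script type="speculationrules">' +
    serializeSpeculationRules(urls) +
    "</script>"
  );
}
```

Because \u003c is decoded back to "<" by any JSON parser, the rules seen by the user agent are unchanged; only the HTML-level representation differs.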

Expand this section once externally loaded (via "src") speculation rules are specified.

2.4. IP anonymization

This specification allows authors to request prefetch traffic using IP anonymization technology provided by the user agent. The details of this technology are not a part of this specification; nonetheless some general principles apply.

To the extent IP anonymization is implemented using a proxy service, it is advisable to minimize the information available to the service operator and other entities on the network path. This likely involves, at a minimum, the use of [TLS] for the connection.

Site operators should be aware that, similar to virtual private network (VPN) technology, the client IP address seen by the HTTP server may not exactly correspond to the user’s actual network provider or location, and traffic for multiple distinct subscribers may originate from a single client IP address. This may affect site operators' security and abuse prevention measures. IP anonymization measures may make an effort to use an egress IP address which has a similar geolocation or is located in the same jurisdiction as the user, but any such behavior is particular to the user agent and not guaranteed by this specification.

3. Privacy considerations

3.1. Heuristics

Because the candidate prefetches and other actions are not required, the user agent can use heuristics to determine which actions would be best to execute. Because it may be observable to the document whether actions were executed, user agents must take care to protect privacy when making such decisions — for instance by only using information which is already available to the origin. If these heuristics depend on any persistent state, that state must be erased whenever the user erases other site data. If the user agent automatically clears other site data from time to time, it must erase such persistent state at the same time.

The use of origin instead of site here is intentional. Origins generally form the basis for the web’s security boundary. Though same-site origins are generally allowed to coordinate if they wish, origins are generally not allowed access to data from other origins, even same-site ones.

Examples of inputs which would be already known to the document:

Examples of persistent data related to the origin (which the origin could have gathered itself) but which must be erased according to user intent:

Examples of device information which may be valuable in deciding whether prefetching is appropriate, but which must be considered as part of the user agent’s overall privacy posture because it may make the user more identifiable across origins:

3.2. Intent

While efforts have been made to minimize the privacy impact of prefetching, some users may nonetheless prefer that prefetching not occur, even though this may make loading slower. User agents are encouraged to provide a setting to disable prefetching features to accommodate such users.

3.3. Partitioning

Some user agents partition storage according to the site or origin of the top-level document. In order for prefetching and prerendering to be useful, it is therefore essential that prefetching or prerendering of a document either occur in the partition in which the navigation would occur (e.g., for a same-site URL) or in an isolated partition, so as to ensure that prefetching does not become a mechanism for bypassing the partitioning scheme.

Expand this section once more detail on prefetch and prerender partitioning mechanism is specified.

3.4. Identity joining

This specification describes a mechanism through which HTTP requests for later top-level navigation (in the case of prefetching) can be made without a user gesture. It is natural to ask whether it is possible for two coordinating sites to connect user identities.

Since existing credentials for the destination origin are not sent (assuming it is not same origin with the referrer), that site is limited in its ability to identify the user before navigation, in much the same way as if the referrer site had simply used [FETCH] to make an uncredentialed request. Upon navigation, this becomes similar to ordinary navigation (e.g., by clicking a link that was not prefetched).

To the extent that user agents attempt to mitigate identity joining for ordinary fetches and navigations, they can apply similar mitigations to prefetched navigations.

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/

Informative References

[CSP]
Mike West; Antonio Sartori. Content Security Policy Level 3. URL: https://w3c.github.io/webappsec-csp/
[TLS]
E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.3. August 2018. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc8446
