Pending Beacon

Unofficial Proposal Draft

This version:
http://wicg.github.io/pending-beacon/
Issue Tracking:
GitHub
Inline In Spec
Editor:
Ian Clelland (Google)

Abstract

This document introduces an API for registering data to be sent to a predetermined server at the point that a page is unloaded.

Status of this document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

GitHub Issues are preferred for discussion of this specification.

This document is governed by the 2 November 2021 W3C Process Document.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

This is an introduction.

Issue: This introduction needs to be more of an introduction.

2. Pending Beacon Framework

2.1. Concepts

A pending beacon represents a piece of data which has been registered with the user agent for later sending to an origin server.

A pending beacon has a url, which is a URL.

A pending beacon has a method, which is a string, which is initially "POST".

A pending beacon has a foreground timeout, which is either null or an integer, and which is initially null.

A pending beacon has a background timeout, which is either null or an integer, and which is initially null.

A pending beacon has an is_pending flag, which is a boolean, which is initially true.

A pending beacon has a payload, which is a byte sequence. It is initially empty.

A Document has a pending beacon set, which is an ordered set of pending beacons.

Issue: Add worker beacons as well?

Note: In this spec, the pending beacon set is associated with a Document. In an actual implementation, this set will likely need to be stored in the user agent, separate from the document itself, in order to be able to send beacons when the document is destroyed (either by being unloaded, or because of a crash).

Issue: Define these to be part of the user agent formally.
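Note: The concepts above can be pictured as a plain record plus an ordered set. The following is a minimal sketch, not part of the API; the names PendingBeaconRecord, createPendingBeacon and insertIntoSet are hypothetical, and the url is supplied at creation time because this section gives it no initial value.

```typescript
// Hypothetical in-memory model of the "pending beacon" concept.
// Field names mirror the terms above; none of this is web-exposed.
type BeaconMethodName = "POST" | "GET";

interface PendingBeaconRecord {
  url: URL;
  method: BeaconMethodName;          // initially "POST"
  foregroundTimeout: number | null;  // initially null
  backgroundTimeout: number | null;  // initially null
  isPending: boolean;                // initially true
  payload: Uint8Array;               // initially empty
}

// Creates a record holding the initial values listed above.
function createPendingBeacon(url: URL): PendingBeaconRecord {
  return {
    url: url,
    method: "POST",
    foregroundTimeout: null,
    backgroundTimeout: null,
    isPending: true,
    payload: new Uint8Array(0),
  };
}

// An ordered set keeps insertion order and ignores duplicate insertions.
function insertIntoSet(set: PendingBeaconRecord[], beacon: PendingBeaconRecord): void {
  if (set.indexOf(beacon) === -1) set.push(beacon);
}

const pendingBeaconSet: PendingBeaconRecord[] = [];
const exampleBeacon = createPendingBeacon(new URL("https://example.com/collect"));
insertIntoSet(pendingBeaconSet, exampleBeacon);
insertIntoSet(pendingBeaconSet, exampleBeacon); // no effect: already in the set
```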

2.2. Updating beacons

To set the url of a pending beacon beacon to a URL url:
  1. If beacon’s is_pending is false, return false.

  2. If url is not a valid URL, return false.

  3. If url is not a potentially trustworthy URL, return false.

  4. Set beacon’s url to url.

  5. Return true.

To set the foreground timeout of a pending beacon beacon to an integer timeout:
  1. If beacon’s is_pending is false, return false.

  2. If timeout is negative, return false.

  3. Set beacon’s foreground timeout to timeout.

  4. Return true.

Issue: This algorithm should also synchronously set or clear a timer to send the beacon.

To set the background timeout of a pending beacon beacon to an integer timeout:
  1. If beacon’s is_pending is false, return false.

  2. If timeout is negative, return false.

  3. Set beacon’s background timeout to timeout.

  4. Return true.

To set the payload of a pending beacon beacon to a byte sequence payload:
  1. If beacon’s is_pending is false, return false.

  2. Set beacon’s payload to payload.

  3. Return true.

To cancel a pending beacon beacon, set beacon’s is_pending to false.

Note: Once canceled, a pending beacon's payload will no longer be used, and it is safe for a user agent to discard it and to cancel any associated timers. However, other attributes may still be read, and so this algorithm does not destroy the beacon itself.
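Note: The update algorithms in this section all share one shape: refuse once is_pending is false, validate the argument, then store it. The sketch below illustrates that shape under stated assumptions. The BeaconState type and the function names are hypothetical, the "valid URL" step is taken as satisfied by passing an already-parsed URL object, and isPotentiallyTrustworthy is a rough approximation of the definition in [SECURE-CONTEXTS].

```typescript
// Hypothetical record type: only the fields the update algorithms touch.
interface BeaconState {
  url: URL | null;
  foregroundTimeout: number | null;
  backgroundTimeout: number | null;
  isPending: boolean;
  payload: Uint8Array;
}

// Rough stand-in for "potentially trustworthy URL" (assumption: the real
// definition covers more cases than https: and localhost).
function isPotentiallyTrustworthy(url: URL): boolean {
  return url.protocol === "https:" || url.hostname === "localhost";
}

function setUrl(beacon: BeaconState, url: URL): boolean {
  if (!beacon.isPending) return false;
  if (!isPotentiallyTrustworthy(url)) return false;
  beacon.url = url;
  return true;
}

// Shared by the foreground and background timeout algorithms.
function setTimeoutField(
  beacon: BeaconState,
  field: "foregroundTimeout" | "backgroundTimeout",
  timeout: number
): boolean {
  if (!beacon.isPending) return false;
  if (timeout < 0) return false;
  beacon[field] = timeout;
  return true;
}

function setPayload(beacon: BeaconState, payload: Uint8Array): boolean {
  if (!beacon.isPending) return false;
  beacon.payload = payload;
  return true;
}

// Cancelling only clears the flag; the record itself stays readable.
function cancelBeacon(beacon: BeaconState): void {
  beacon.isPending = false;
}
```

Returning a boolean rather than throwing keeps these helpers close to the algorithms above, which leave error reporting to their callers.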

2.3. Sending beacons

Note: This is written as though Fetch were used as the underlying mechanism. However, since these are sent out-of-band, an implementation might not use the actual web-exposed Fetch API, and may instead use the underlying HTTP primitives directly.

To send a document’s beacons, given a Document document, run these steps:
  1. For each pending beacon beacon in document’s pending beacon set:

    1. Call send a queued pending beacon with beacon.

To send a queued pending beacon beacon, run these steps:
  1. If beacon’s is_pending flag is false, then return.

  2. Set beacon’s is_pending flag to false.

  3. Check permission.

  4. If beacon’s method is "GET", then call send a pending beacon over GET with beacon.

  5. Else call send a pending beacon over POST with beacon.
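Note: Clearing is_pending before dispatching is what makes sending idempotent: a second call for the same beacon returns at step 1. The sketch below illustrates this. QueuedBeacon and BeaconTransport are hypothetical names, the two transport hooks stand in for the GET and POST algorithms, the boolean return value is added only for illustration, and the undefined "Check permission" step is omitted.

```typescript
type QueuedMethod = "GET" | "POST";

interface QueuedBeacon {
  method: QueuedMethod;
  isPending: boolean;
}

// Hypothetical hooks standing in for the GET and POST sending algorithms.
interface BeaconTransport {
  sendGet(beacon: QueuedBeacon): void;
  sendPost(beacon: QueuedBeacon): void;
}

// Returns true if this call dispatched the beacon.
function sendQueuedPendingBeacon(beacon: QueuedBeacon, transport: BeaconTransport): boolean {
  if (!beacon.isPending) return false; // step 1: already sent or cancelled
  beacon.isPending = false;            // step 2: clear the flag before sending
  // Step 3 ("Check permission") is not defined in this draft and is skipped.
  if (beacon.method === "GET") {
    transport.sendGet(beacon);         // step 4
  } else {
    transport.sendPost(beacon);        // step 5
  }
  return true;
}

// Sends every beacon in a document's set; returns how many were dispatched.
function sendDocumentBeacons(beacons: QueuedBeacon[], transport: BeaconTransport): number {
  let dispatched = 0;
  for (const beacon of beacons) {
    if (sendQueuedPendingBeacon(beacon, transport)) dispatched += 1;
  }
  return dispatched;
}
```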

"Check permission" is not defined. A specific permission should be used here, and this should integrate with the permissions API.

To send a pending beacon over GET, given a pending beacon beacon:
  1. Let pairs be the list « ("data", beacon’s payload) ».

  2. Let query be the result of running the urlencoded serializer with pairs.

  3. Let url be a clone of beacon’s url.

  4. Set url’s query component to query.

  5. Let req be a new request initialized as follows:

    method: GET
    client: The entry settings object
    url: url
    credentials mode: same-origin

  6. Fetch req.
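Note: Steps 1 to 4 replace the whole query of the beacon’s url with a single urlencoded data pair. A minimal sketch using the standard URL and URLSearchParams classes follows; buildGetBeaconUrl is a hypothetical helper, and the payload is assumed to be a string here although this draft stores it as a byte sequence.

```typescript
// Builds the request URL for a GET beacon.
// Assumption: the payload is already a string.
function buildGetBeaconUrl(beaconUrl: string, payload: string): string {
  const url = new URL(beaconUrl);   // a fresh object, so the caller's value is untouched
  const query = new URLSearchParams();
  query.append("data", payload);    // the single ("data", payload) pair
  url.search = query.toString();    // urlencoded serialization replaces any old query
  return url.toString();
}

const getBeaconUrl = buildGetBeaconUrl("https://example.com/collect?old=1", "a b&c");
// getBeaconUrl is "https://example.com/collect?data=a+b%26c"
```

Any query already present on the beacon’s url is discarded, which follows from step 4 setting the query component rather than appending to it.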

To send a pending beacon over POST, given a pending beacon beacon:
  1. Let transmittedData be the result of serializing beacon’s payload.

  2. Let req be a new request initialized as follows:

    method: POST
    client: The entry settings object
    url: beacon’s url
    header list: headerList
    origin: The entry settings object’s origin
    keep-alive flag: true
    body: transmittedData
    mode: cors
    credentials mode: same-origin

  3. Fetch req.

Issue: headerList is not defined.
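Note: For readers more familiar with the web-exposed fetch() options, the request fields above correspond roughly to the init object sketched below. This is an illustration only. BeaconPostInit and buildPostInit are hypothetical, the type is declared locally so the block stays self-contained, and headerList is assumed to carry at most a Content-Type header because this draft leaves it undefined. The client and origin fields have no counterpart here, since fetch() derives them from the calling context.

```typescript
// Hypothetical shape mirroring the request fields above, using the option
// names that fetch() accepts.
interface BeaconPostInit {
  method: "POST";
  body: Uint8Array;
  keepalive: boolean;
  mode: "cors";
  credentials: "same-origin";
  headers: { [headerName: string]: string };
}

function buildPostInit(payload: Uint8Array, contentType: string | null): BeaconPostInit {
  // Assumption: the only header is Content-Type, and only when one is known.
  const headers: { [headerName: string]: string } = {};
  if (contentType !== null) headers["Content-Type"] = contentType;
  return {
    method: "POST",
    body: payload,
    keepalive: true,             // the keep-alive flag
    mode: "cors",
    credentials: "same-origin",  // the credentials mode
    headers: headers,
  };
}

// The two bytes are "{}" in UTF-8.
const postInit = buildPostInit(new Uint8Array([123, 125]), "application/json");
```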

3. Integration with HTML

Note: The following sections modify the [HTML] standard to enable sending of beacons automatically by the user agent. These should be removed from this spec as appropriate changes are made to [HTML].

When a document with a non-empty pending beacon set is to be discarded, send the document’s beacons.

Issue: "discarded" is not well defined.

When a process hosting a document with a non-empty pending beacon set crashes, send the document’s beacons.

Issue: The concepts of "process" and "crashes" are not well defined.

When a Document document is to become hidden (visibility state change), run these steps:
  1. For each pending beacon beacon in document’s pending beacon set:

    1. Let timeout be beacon’s background timeout.

    2. If timeout is not null, start a timer to run a task in timeout ms.

      Note: The user agent may choose to coalesce multiple timers in order to send multiple beacons at the same time.

    3. When the timer expires, call send a queued pending beacon with beacon.

      Note: The pending beacons may have been sent before this time, in cases where the document is unloaded, or its hosting process crashes before the timer fires. In that case, if the user agent still reaches this step, then the beacons will not be sent again, as their is_pending flag will be false.

Issue: "visibility state change" should be more specific here, and should refer to specific steps in either [PAGE-VISIBILITY] or [HTML]

Issue: This should also disable any foreground timers for the document’s beacons, and there should be a step to reinstate them if the document becomes visible again before they are sent.
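Note: The hidden-document steps can be sketched as follows. This is an illustration under stated assumptions: TimedBeacon, TimerScheduler and onDocumentHidden are hypothetical names, and the timer is injected as a function so the logic can be exercised without a real clock (in a page it could wrap setTimeout). Foreground timers and timer coalescing are not modelled.

```typescript
interface TimedBeacon {
  backgroundTimeout: number | null;
  isPending: boolean;
}

// Hypothetical scheduler signature: run the callback after delayMs.
type TimerScheduler = (callback: () => void, delayMs: number) => void;

// Runs when the document becomes hidden: each beacon with a non-null
// background timeout gets a timer that sends it on expiry.
function onDocumentHidden(
  beacons: TimedBeacon[],
  schedule: TimerScheduler,
  send: (beacon: TimedBeacon) => void
): void {
  for (const beacon of beacons) {
    const timeout = beacon.backgroundTimeout;
    if (timeout === null) continue;
    schedule(() => {
      // The beacon may already have been sent (unload, crash) or cancelled,
      // in which case its flag is false and nothing happens.
      if (!beacon.isPending) return;
      beacon.isPending = false;
      send(beacon);
    }, timeout);
  }
}
```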

4. The PendingBeacon interface

enum BeaconMethod {
    "POST",
    "GET"
};

dictionary PendingBeaconOptions {
    unsigned long timeout;
    unsigned long backgroundTimeout;
};

[Exposed=(Window, Worker)]
interface PendingBeacon {
    readonly attribute USVString url;
    readonly attribute BeaconMethod method;
    attribute unsigned long timeout;
    attribute unsigned long backgroundTimeout;
    readonly attribute boolean pending;

    undefined deactivate();
    undefined sendNow();
};

[Exposed=(Window, Worker)]
interface PendingGetBeacon : PendingBeacon {
    constructor(USVString url, optional PendingBeaconOptions options = {});

    undefined setURL(USVString url);
};

[Exposed=(Window, Worker)]
interface PendingPostBeacon : PendingBeacon {
    constructor(USVString url, optional PendingBeaconOptions options = {});

    undefined setData(object data);
};

A PendingBeacon object has an associated beacon, which is a pending beacon.

The new PendingGetBeacon(url, options) constructor steps are:
  1. Let beacon be a new pending beacon.

  2. Set this's beacon to beacon.

  3. Call the common beacon initialization steps with this, "GET", url and options.

  4. Insert beacon into the user agent’s pending beacon set.

The new PendingPostBeacon(url, options) constructor steps are:
  1. Let beacon be a new pending beacon.

  2. Set this's beacon to beacon.

  3. Call the common beacon initialization steps with this, "POST", url and options.

  4. Insert beacon into the user agent’s pending beacon set.

The common beacon initialization steps, given a PendingBeacon pendingBeacon, a string method, a USVString url, and a PendingBeaconOptions options, are:
  1. Let beacon be pendingBeacon’s beacon.

  2. If url is not a valid URL string, throw a TypeError.

  3. Let base be the entry settings object’s API base URL.

  4. Let parsedUrl be the result of running the URL parser on url and base.

  5. If parsedUrl is failure, throw a TypeError.

  6. If the result of setting beacon’s url to parsedUrl is false, throw a TypeError.

  7. Set beacon’s method to method.

  8. If options has a timeout member, then set pendingBeacon’s timeout to options’s timeout.

  9. If options has a backgroundTimeout member, then set pendingBeacon’s backgroundTimeout to options’s backgroundTimeout.
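Note: Steps 3 to 6 amount to resolving the string against a base URL and rejecting anything that fails to parse or is not potentially trustworthy. A minimal sketch follows; parseBeaconUrl is a hypothetical helper, and the trustworthiness test is a rough approximation (https: or localhost) of the definition in [SECURE-CONTEXTS].

```typescript
// Resolves a beacon URL string against a base and validates the result.
// Throws a TypeError in the same situations as steps 5 and 6 above.
function parseBeaconUrl(url: string, base: string): URL {
  let parsed: URL;
  try {
    parsed = new URL(url, base); // the URL parser; throws on failure
  } catch (error) {
    throw new TypeError("Invalid beacon URL: " + url);
  }
  // Assumption: approximates "potentially trustworthy URL".
  const trustworthy = parsed.protocol === "https:" || parsed.hostname === "localhost";
  if (!trustworthy) {
    throw new TypeError("Beacon URL is not potentially trustworthy: " + parsed.href);
  }
  return parsed;
}

// A relative URL resolves against the base, as with other URL-taking APIs.
const resolvedBeaconUrl = parseBeaconUrl("/collect", "https://example.com/app/");
// resolvedBeaconUrl.href is "https://example.com/collect"
```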

The url getter steps are to return this's beacon's url.
The method getter steps are to return this's beacon's method.
The timeout getter steps are to return this's beacon's foreground timeout.
The timeout setter steps are:
  1. Let beacon be this's beacon.

  2. If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

  3. Let timeout be the argument to the setter.

  4. If timeout is not a non-negative integer, throw a TypeError.

  5. If the result of setting beacon’s foreground timeout to timeout is false, throw a TypeError.

The backgroundTimeout getter steps are to return this's beacon's background timeout.
The backgroundTimeout setter steps are:
  1. Let beacon be this's beacon.

  2. If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

  3. Let timeout be the argument to the setter.

  4. If timeout is not a non-negative integer, throw a TypeError.

  5. If the result of setting beacon’s background timeout to timeout is false, throw a TypeError.

The pending getter steps are to return this's beacon's is_pending flag.
The deactivate() steps are:
  1. Let beacon be this's beacon.

  2. If beacon’s is_pending is not true, throw an "InvalidStateError" DOMException.

  3. Cancel beacon.

The sendNow() steps are:
  1. Let beacon be this's beacon.

  2. If beacon’s is_pending is not true, throw an "InvalidStateError" DOMException.

  3. Call send a queued pending beacon with beacon.

The setURL(url) steps are:
  1. Let beacon be this's beacon.

  2. If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

  3. If url is not a valid URL string, throw a TypeError.

  4. Let base be the entry settings object’s API base URL.

  5. Let parsedUrl be the result of running the URL parser on url and base.

  6. If parsedUrl is failure, throw a TypeError.

  7. If the result of setting beacon’s url to parsedUrl is false, throw a TypeError.

The setData(data) steps are:
  1. Let beacon be this's beacon.

  2. If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

  3. Let (body, contentType) be the result of extracting a body with type from data with keepalive set to true.

  4. Let bytes be the byte sequence obtained by reading body’s stream.

  5. If the result of setting beacon’s payload to bytes is false, throw a TypeError.
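Note: The methods and setters in this section use two different errors once a beacon is no longer pending: attribute and data setters throw "NoModificationAllowedError", while deactivate() and sendNow() throw "InvalidStateError". The sketch below models that split. It is not an implementation of the interface: BeaconObjectState and the function names are hypothetical, and domLikeError is a stand-in for DOMException so the block runs outside a browser.

```typescript
interface BeaconObjectState {
  isPending: boolean;
  foregroundTimeout: number | null;
}

// Hypothetical stand-in for DOMException: an Error with the given name.
function domLikeError(errorName: string, message: string): Error {
  const error = new Error(message);
  error.name = errorName;
  return error;
}

// Models the timeout setter steps.
function setTimeoutAttribute(beaconState: BeaconObjectState, timeout: number): void {
  if (!beaconState.isPending) {
    throw domLikeError("NoModificationAllowedError", "beacon is no longer pending");
  }
  const isNonNegativeInteger = isFinite(timeout) && timeout >= 0 && Math.floor(timeout) === timeout;
  if (!isNonNegativeInteger) {
    throw new TypeError("timeout must be a non-negative integer");
  }
  beaconState.foregroundTimeout = timeout;
}

// Models deactivate(): cancelling twice is an error.
function deactivateBeacon(beaconState: BeaconObjectState): void {
  if (!beaconState.isPending) {
    throw domLikeError("InvalidStateError", "beacon is no longer pending");
  }
  beaconState.isPending = false;
}

// Models sendNow(): the send callback stands in for the sending algorithm.
function sendBeaconNow(beaconState: BeaconObjectState, send: () => void): void {
  if (!beaconState.isPending) {
    throw domLikeError("InvalidStateError", "beacon is no longer pending");
  }
  beaconState.isPending = false;
  send();
}
```

A caller that cannot know whether a beacon has already gone out would check the pending attribute first, or catch these errors.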

5. Privacy

Issue: This section is woefully incomplete. These all need to be fleshed out in enough detail to accurately describe the privacy issues and suggested or prescribed mitigations.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SECURE-CONTEXTS]
Mike West. Secure Contexts. URL: https://w3c.github.io/webappsec-secure-contexts/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/

Informative References

[PAGE-VISIBILITY]
Jatinder Mann; Arvind Jain. Page Visibility (Second Edition). 29 October 2013. REC. URL: https://www.w3.org/TR/page-visibility/
