Idle Detection API

Draft Community Group Report

This version:
https://wicg.github.io/idle-detection/
Issue Tracking:
GitHub
Editor:
(Google LLC)

Abstract

This document defines a web platform API for observing system-wide user presence signals.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

This section is non-normative.

Using existing capabilities, a page is able to determine when it is currently visible to the user (using the hidden property and onvisibilitychange event). It is also possible to know when the user has recently interacted with the page by observing onmousemove, onkeypress, and other events triggered by user input. While these events sufficiently reflect user engagement with a particular page, they give an incomplete picture of whether the user is still present at their device. For example, if hidden is true, then the device screensaver may have activated, or the user could have switched to a different application. If it is false but there have been no recent input events, then the user could have left their computer to grab a cup of coffee, or they could be editing a document in another window side-by-side with the page.

Making these distinctions is important for applications which have the option of delivering notifications across multiple devices, such as a desktop and smartphone. Users may find it frustrating when notifications are delivered to the wrong device or are disruptive. For example, suppose the user switches from a tab containing a messaging application to one for a document they are editing. The messaging application, unable to observe that the user is still interacting with their device, may assume that they have left to grab a coffee and start delivering notifications to their phone, causing it to buzz distractingly, instead of displaying notifications on their desktop or incrementing a badge count.

1.1. Alternatives Considered

An alternative design would protect this information by allowing a notification to be marked as "hide on active" or "hide on idle" and not allowing the page to observe whether or not the notification was actually shown. The problem with this approach is that the intelligent notification routing described previously requires observing these signals of user presence and making centralized decisions based on the state of all of the user’s devices.

For example, to route notifications to a user’s mobile device when they get up to grab a coffee, the messaging application could detect that it is no longer visible and start sending push messages to the mobile device while marking the desktop notifications as "hide on active". If the user were still at their desk but using a different application, they would start getting the distracting notifications from their mobile device that this proposal is attempting to avoid, whether or not the desktop is able to successfully suppress them. Successful suppression of duplicate and disruptive notifications requires multi-device coordination.

Allowing notifications to be hidden also breaks implementor mitigations for the [PUSH-API] being used to run silent background tasks.

2. Observing User Presence

2.1. Model

This specification defines a model for user presence on two dimensions: idle state and screen lock.

2.1.1. The UserIdleState enum

enum UserIdleState {
    "active",
    "idle"
};

"active"

Indicates that the user has interacted with the device in the last threshold milliseconds.

"idle"

Indicates that the user has not interacted with the device in at least threshold milliseconds.

2.1.2. The ScreenIdleState enum

enum ScreenIdleState {
    "locked",
    "unlocked"
};

"locked"

Indicates that the device has engaged a screensaver or lock screen which prevents content from being seen or interacted with.

"unlocked"

Indicates that the device is able to display content and be interacted with.
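
As an illustration of how the two dimensions combine, the following sketch maps a pair of states to the kind of notification routing decision described in the introduction. The function name chooseNotificationTarget and the return values "desktop", "mobile" and "unknown" are hypothetical and are not defined by this specification; the routing rule itself is only one plausible policy.

```javascript
// Hypothetical helper: decides where a notification might be delivered,
// based on the two presence dimensions defined above.
function chooseNotificationTarget(userState, screenState) {
  // null means no state is known yet, so no decision can be made.
  if (userState === null || screenState === null) {
    return 'unknown';
  }
  // A locked screen cannot show content, and an idle user is probably away.
  if (screenState === 'locked' || userState === 'idle') {
    return 'mobile';
  }
  // The user is active on an unlocked device.
  return 'desktop';
}

console.log(chooseNotificationTarget('active', 'unlocked')); // desktop
console.log(chooseNotificationTarget('idle', 'unlocked'));   // mobile
```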

2.2. Permissions

The "idle-detection" powerful feature is a boolean feature.

2.3. Permissions policy

This specification defines a policy-controlled feature identified by the string "idle-detection". Its default allowlist is 'self'.

The default allowlist of 'self' allows usage of this feature on same-origin nested frames by default but prevents access by third-party content.

Third-party usage can be selectively enabled by adding the allow="idle-detection" attribute to an iframe element:

<iframe src="https://example.com" allow="idle-detection"></iframe>

Alternatively, this feature can be disabled completely in first-party contexts by specifying the permissions policy in an HTTP response header:

Permissions-Policy: idle-detection=()

See [PERMISSIONS-POLICY] for more details.

2.4. The IdleDetector interface

dictionary IdleOptions {
  [EnforceRange] unsigned long long threshold;
  AbortSignal signal;
};

[
  SecureContext,
  Exposed=(Window,DedicatedWorker)
] interface IdleDetector : EventTarget {
  constructor();
  readonly attribute UserIdleState? userState;
  readonly attribute ScreenIdleState? screenState;
  attribute EventHandler onchange;
  [Exposed=Window] static Promise<PermissionState> requestPermission();
  Promise<undefined> start(optional IdleOptions options = {});
};

Instances of IdleDetector are created with the internal slots described in the following table:

Internal slot     Initial value   Description (non-normative)
[[state]]         "stopped"       Tracks the active state of the IdleDetector
[[threshold]]     undefined       The configured idle detection threshold
[[userState]]     null            The last known user idle state
[[screenState]]   null            The last known screen idle state

Methods on this interface typically complete asynchronously, queuing work on the idle detection task source.

2.4.1. userState attribute

The userState getter steps are:
  1. Return this.[[userState]].

2.4.2. screenState attribute

The screenState getter steps are:
  1. Return this.[[screenState]].

2.4.3. onchange attribute

onchange is an Event handler IDL attribute for the change event type.
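
A minimal sketch of using the handler attribute, assuming an object shaped like an IdleDetector is passed in. The names watchPresence and report are hypothetical. Because an event handler attribute holds a single handler, assigning to onchange replaces any handler previously set this way.

```javascript
// Hypothetical helper: reports the detector's state whenever "change" fires.
// `detector` is assumed to be an IdleDetector (or an object shaped like one).
function watchPresence(detector, report) {
  detector.onchange = () => {
    // Read the attributes at event time; they hold the last known state.
    report({ user: detector.userState, screen: detector.screenState });
  };
  // Return a function that removes the handler again.
  return () => {
    detector.onchange = null;
  };
}
```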

2.4.4. requestPermission() method

The requestPermission() method steps are:
  1. If the relevant global object of this does not have transient activation, return a new promise rejected with a "NotAllowedError" DOMException.

  2. Let result be a new promise.

  3. In parallel:

    1. Let permissionState be the result of requesting permission to use the powerful feature named "idle-detection".

    2. Queue a global task on the relevant global object of this using the permission task source to resolve result with permissionState.

  4. Return result.
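
Because step 1 rejects calls made without transient activation, pages would typically call requestPermission() from a user gesture such as a click. The sketch below shows one way to do this. The helper name requestOnClick and its parameters are hypothetical; the IdleDetector constructor is passed in as an argument (normally window.IdleDetector) only to keep the sketch self-contained.

```javascript
// Hypothetical helper: requests the "idle-detection" permission from inside
// a click handler, so the call happens while transient activation is present.
// `button` is any EventTarget that dispatches "click".
function requestOnClick(button, IdleDetectorCtor, onResult) {
  button.addEventListener('click', async () => {
    try {
      // Resolves with a PermissionState value such as "granted" or "denied".
      const state = await IdleDetectorCtor.requestPermission();
      onResult(state);
    } catch (err) {
      // A "NotAllowedError" is expected without transient activation.
      onResult(err.name);
    }
  }, { once: true });
}
```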

2.4.5. start() method

The start(options) method steps are:

  1. Let result be a new promise.

  2. If the relevant global object's associated Document is not allowed to use the policy-controlled feature named "idle-detection", reject result with a "NotAllowedError" DOMException and return result.

  3. If this.[[state]] is not "stopped", reject result with an "InvalidStateError" DOMException and return result.

  4. Set this.[[state]] to "starting".

  5. If options["threshold"] is less than 60,000, reject result with a TypeError and return result.

  6. If options["signal"] is present, then perform the following sub-steps:

    1. If options["signal"]'s aborted flag is set, then reject result with an "AbortError" DOMException and return result.

    2. Add the following abort steps to options["signal"]:

      1. Set this.[[state]] to "stopped".

      2. Reject result with an "AbortError" DOMException.

  7. Queue a global task on the relevant global object of this using the idle detection task source to perform the following steps, but abort when this.[[state]] becomes "stopped".

    1. Let permissionState be the permission state of the powerful feature named "idle-detection".

    2. If permissionState is "denied", reject result with a "NotAllowedError" DOMException, set this.[[state]] to "stopped" and abort these steps.

    3. Set this.[[state]] to "started".

    4. Set this.[[threshold]] to options["threshold"].

    5. Resolve result.

  8. Return result.
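
The steps above can be followed with a simplified, non-normative model. This is a sketch for reading the algorithm, not how a user agent implements it: the class name ModelIdleDetector and the policyAllowed and permissionState options are hypothetical stand-ins for user agent internals, and the queued task of step 7 is approximated with setTimeout. The model follows the steps literally, including leaving the state as "starting" when step 5 rejects.

```javascript
// Simplified model of the start() steps (illustrative only).
class ModelIdleDetector {
  constructor({ policyAllowed = true, permissionState = 'granted' } = {}) {
    this.state = 'stopped';      // models [[state]]
    this.threshold = undefined;  // models [[threshold]]
    this.policyAllowed = policyAllowed;
    this.permissionState = permissionState;
  }

  start(options = {}) {
    return new Promise((resolve, reject) => {
      // Step 2: permissions policy check.
      if (!this.policyAllowed) {
        return reject(new DOMException('Blocked by policy', 'NotAllowedError'));
      }
      // Step 3: a detector that is not stopped cannot be started.
      if (this.state !== 'stopped') {
        return reject(new DOMException('Not stopped', 'InvalidStateError'));
      }
      // Step 4.
      this.state = 'starting';
      // Step 5: thresholds under 60,000 ms are rejected.
      if (options.threshold < 60_000) {
        return reject(new TypeError('threshold is below 60,000 ms'));
      }
      // Step 6: honor the AbortSignal, if one was given.
      const { signal } = options;
      if (signal) {
        if (signal.aborted) {
          return reject(new DOMException('Aborted', 'AbortError'));
        }
        signal.addEventListener('abort', () => {
          this.state = 'stopped';
          reject(new DOMException('Aborted', 'AbortError'));
        });
      }
      // Step 7: the queued task, approximated with a timer.
      setTimeout(() => {
        if (this.state === 'stopped') return; // aborted in the meantime
        if (this.permissionState === 'denied') {
          this.state = 'stopped';
          return reject(new DOMException('Denied', 'NotAllowedError'));
        }
        this.state = 'started';
        this.threshold = options.threshold;
        resolve();
      }, 0);
    });
  }
}
```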

The availability of this API can be detected by looking for the IdleDetector constructor in the Window object.

if (!('IdleDetector' in window)) {
  console.log('Idle detection is not available.');
  return;
}

Calling start() will fail if the "idle-detection" permission has not been granted.

if ((await IdleDetector.requestPermission()) !== 'granted') {
  console.log('Idle detection permission not granted.');
  return;
}

A set of options can be configured to control the threshold the user agent uses to decide when the user has become idle.

const controller = new AbortController();
const signal = controller.signal;

const options = {
  threshold: 60_000,
  signal,
};
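
Since start() rejects thresholds below 60,000 milliseconds, a page that derives the threshold from a user setting may want to raise small values to that minimum before calling it. The sketch below shows one way to do so; the names buildIdleOptions and MIN_THRESHOLD_MS are hypothetical and not part of this specification.

```javascript
// Minimum threshold accepted by start(), in milliseconds.
const MIN_THRESHOLD_MS = 60_000;

// Hypothetical helper: builds an IdleOptions-shaped object, raising values
// that start() would reject to the minimum threshold.
function buildIdleOptions(requestedMs, signal) {
  // Fall back to the minimum for NaN, Infinity or non-numeric input.
  const usable = Number.isFinite(requestedMs)
    ? Math.floor(requestedMs)
    : MIN_THRESHOLD_MS;
  const options = { threshold: Math.max(MIN_THRESHOLD_MS, usable) };
  // Only include the signal member when one was supplied.
  if (signal !== undefined) {
    options.signal = signal;
  }
  return options;
}
```

For instance, buildIdleOptions(5_000) yields a threshold of 60,000 rather than a value that start() would reject.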

The IdleDetector can now be created and started. A listener for the "change" event is added and will be invoked whenever the userState or screenState attributes change.

try {
  const idleDetector = new IdleDetector();
  idleDetector.addEventListener('change', () => {
    console.log(`Idle change: ${idleDetector.userState}, ${idleDetector.screenState}.`);
  });
  await idleDetector.start(options);
  console.log('IdleDetector is active.');
} catch (err) {
  // Deal with initialization errors like permission denied,
  // running outside of top-level frame, etc.
  console.error(err.name, err.message);
}
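
The catch block above can distinguish failure causes by exception name, following the names used in the start() steps. The helper below is a sketch of that mapping; the function name explainStartFailure and the message strings are hypothetical.

```javascript
// Hypothetical helper: maps a rejection from start() to a short explanation,
// based on the exception names used in the start() steps.
function explainStartFailure(err) {
  switch (err.name) {
    case 'NotAllowedError':
      return 'Blocked by permissions policy, or the permission was denied.';
    case 'InvalidStateError':
      return 'This detector is not in the stopped state.';
    case 'TypeError':
      return 'The threshold is invalid or below the 60,000 ms minimum.';
    case 'AbortError':
      return 'The AbortSignal passed to start() was aborted.';
    default:
      return `Unexpected error: ${err.name}`;
  }
}
```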

At a later time, the page can cancel its interest in state change events by removing its event listeners or by aborting the AbortSignal that was passed to start().

controller.abort();
console.log('IdleDetector is stopped.');

2.4.6. Reacting to state changes

For each IdleDetector instance detector where detector.[[state]] is "started" the user agent MUST continuously monitor the following conditions:


3. Security and privacy considerations

This section is non-normative.

3.1. Cross-origin information leakage

This interface exposes the state of global system properties and so care must be taken to prevent them from being used as cross-origin communication or identification channels. Similar concerns are present in specifications such as [DEVICE-ORIENTATION] and [GEOLOCATION-API], which mitigate them by requiring a visible or focused context. This prevents multiple origins from observing the global state at the same time. These mitigations are unfortunately inappropriate here because the intent of this specification is precisely to allow a limited form of tracking in blurred and hidden contexts. A malicious page could notify a tracking server whenever the user is detected as idle or active. If multiple pages the user was visiting notified the same server it could use the timing of the events to guess which sessions corresponded to a single user as they would arrive roughly simultaneously.

To reduce the number of independent contexts with access to this interface, this specification restricts it to top-level and same-origin contexts. Access can be delegated to a cross-origin context through [PERMISSIONS-POLICY].

To further reduce the number of contexts, this specification requires a page to obtain the "idle-detection" permission. User agents should inform the user of the capability that this permission grants and encourage them to only grant it to trusted sites which have a legitimate purpose for this data.

Implementations that provide a "private browsing" mode should not allow this capability in contexts where this mode is enabled. Implementations should be careful, however, to prevent the lack of this capability from being used as a signal that this mode is enabled. This can be accomplished by refusing to allow the "idle-detection" permission to be granted but delaying the automatic dismissal of the permission request by a random interval so that it appears to have been a user action.

3.2. Behavior tracking

While this interface does not provide details of the user interaction which triggered an "idle" to "active" transition, with a sufficiently short threshold these events could be used to detect behavior such as typing. This specification therefore restricts the requested threshold to a minimum of 60 seconds.

The permission requirement described previously also helps to mitigate the general concern that this interface can be used to build a profile of when and for how long the user typically interacts with their device.

3.3. User coercion

Sites may require the user to grant them the "idle-detection" permission before unlocking some functionality. For example, a testing site could require this permission as part of an anti-cheating mechanism to detect the user consulting forbidden reference materials in another window. This type of "Contract of Adhesion" has been observed with other permissions such as notifications, FIDO attestation and DRM identifiers.

A potential mitigation for this concern is to design the interface so that it is not possible for a site to determine whether the user has granted or denied the permission. An implementation could refuse to acknowledge that the user is idle, reducing a site to only the signals currently available. This mitigation could be detectable, as it is unlikely that a user who has not interacted with a page for hours has nevertheless still been continuously interacting with something else. Implementations could instead insert fake idle transition events which correspond to plausible behavior given the other signals available to the page.

This specification does not mandate this type of mitigation as it could create a poor user experience when sites take action based on this false data. For example, the messaging application mentioned previously would not deliver notifications to the user’s mobile device because it believes the signals it has been given indicate that they are still at their desktop. As the site cannot detect that it is in this state, it cannot directly recommend an action for the user to take to get themselves out of it.

The harm done by such a site is limited as tracking is only possible while the user is visiting that page. Tracking across multiple origins requires permission to be requested on each participating site.

4. Accessibility considerations

This section is non-normative.

Users with physical or cognitive impairments may require more time to interact with user agents and content. Implementations should not allow distinguishing such users, or limiting their ability to interact with content any more than existing observation of UI events. For example, implementations should ensure that interactions from assistive technologies count towards considering the user active.

The use of a permission also requires that user agents provide a user interface element to support requesting and managing that permission. Any such user interface elements must be designed with accessibility tools in mind. For example, a user interface describing the capability being requested should provide the same description to tools such as screen readers.

5. Internationalization considerations

This section is non-normative.

The interface described by this specification has limited internationalization considerations; however, the use of a permission does require that user agents provide a user interface element to support requesting and managing that permission. Any content displayed by the user agent in this context should be translated into the user’s native language.

6. Acknowledgements

This section is non-normative.

Many thanks to Kenneth Christiansen, Samuel Goto, Ayu Ishii and Thomas Steiner for their help in crafting this proposal.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Tests

Tests relating to the content of this specification may be documented in “Tests” blocks like this one. Any such block is non-normative.


Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[ECMASCRIPT]
ECMAScript Language Specification. URL: https://tc39.es/ecma262/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[PAGE-VISIBILITY-2]
Ilya Grigorik; Arvind Jain; Jatinder Mann. Page Visibility Level 2. 17 October 2017. PR. URL: https://www.w3.org/TR/page-visibility-2/
[PERMISSIONS]
Mounir Lamouri; Marcos Caceres; Jeffrey Yasskin. Permissions. 20 July 2020. WD. URL: https://www.w3.org/TR/permissions/
[PERMISSIONS-POLICY]
Ian Clelland. Permissions Policy. 16 July 2020. WD. URL: https://www.w3.org/TR/permissions-policy-1/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[WebIDL]
Boris Zbarsky. Web IDL. 15 December 2016. ED. URL: https://heycam.github.io/webidl/

Informative References

[DEVICE-ORIENTATION]
Rich Tibbett; et al. DeviceOrientation Event Specification. 16 April 2019. WD. URL: https://www.w3.org/TR/orientation-event/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[GEOLOCATION-API]
Andrei Popescu. Geolocation API Specification 2nd Edition. 8 November 2016. REC. URL: https://www.w3.org/TR/geolocation-API/
[PUSH-API]
Peter Beverloo; Martin Thomson. Push API. 4 February 2020. WD. URL: https://www.w3.org/TR/push-api/

IDL Index

enum UserIdleState {
    "active",
    "idle"
};

enum ScreenIdleState {
    "locked",
    "unlocked"
};

dictionary IdleOptions {
  [EnforceRange] unsigned long long threshold;
  AbortSignal signal;
};

[
  SecureContext,
  Exposed=(Window,DedicatedWorker)
] interface IdleDetector : EventTarget {
  constructor();
  readonly attribute UserIdleState? userState;
  readonly attribute ScreenIdleState? screenState;
  attribute EventHandler onchange;
  [Exposed=Window] static Promise<PermissionState> requestPermission();
  Promise<undefined> start(optional IdleOptions options = {});
};