Abstract

This document defines a new API {{MediaDevices/getDisplayMediaSet}} for capturing multiple display surfaces following one user gesture. It is an extension to the Screen Capture API [[screen-capture]].

This document is a draft. It is subject to major changes; while early experimentation is encouraged, it is not intended for implementation.

Introduction

The Screen Capture API [[screen-capture]] enables the capturing of a single display surface in the form of a video track.

Users who want to capture multiple display surfaces at once through the Screen Capture API have to call getDisplayMedia several times and select each of the desired display surfaces separately. This requires multiple user gestures and repeated interaction with browser-level UX, which adds unnecessary friction and results in a degraded user experience.

This document describes {{MediaDevices/getDisplayMediaSet}}, an extension to the Screen Capture API [[screen-capture]]. {{MediaDevices/getDisplayMediaSet}} enables the capturing of several of the user's displays, or parts thereof, with a single user gesture. It returns a list of media streams, each containing a track corresponding to one of the captured display surfaces.

This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.

Implementations that use ECMAScript [[ECMA-262]] to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [[!WEBIDL]], as this specification uses that specification and terminology.

Example

The following example demonstrates a request for multiple display captures using the navigator.mediaDevices.getDisplayMediaSet method defined in this document.

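      try {
        // Resolves with one MediaStream per display surface the user selected.
        const mediaStreams = await navigator.mediaDevices.getDisplayMediaSet();
        // videoElements is assumed to be an array of <video> elements defined elsewhere.
        mediaStreams.forEach((mediaStream, index) => {
          videoElements[index].srcObject = mediaStream;
        });
      } catch (e) {
        console.error('Unable to acquire screen captures: ' + e);
      }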
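
Because this API is a draft and may be unavailable, a page might want to detect it before calling it. The following is a minimal sketch, not part of this document's requirements: the helper name captureDisplays is hypothetical, and the fallback to a single getDisplayMedia capture is an assumption about what an application might prefer.

```javascript
// Sketch: prefer getDisplayMediaSet when present, otherwise fall back to a
// single-surface capture. `mediaDevices` is passed in (normally
// navigator.mediaDevices) so the helper can also be exercised with a stub.
async function captureDisplays(mediaDevices) {
  if (typeof mediaDevices.getDisplayMediaSet === 'function') {
    // Resolves with a list of MediaStream objects.
    return mediaDevices.getDisplayMediaSet();
  }
  // Fallback (an assumption, not defined by this document): one capture,
  // wrapped in an array so callers always receive a list.
  const stream = await mediaDevices.getDisplayMedia({ video: true });
  return [stream];
}
```

In a page, this would be called as captureDisplays(navigator.mediaDevices) from within a user gesture handler.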
    

Terminology

This document uses the definitions of {{MediaStream}} and {{MediaStreamTrack}} from [[!GETUSERMEDIA]], and of display surface and active user consent from [[!screen-capture]].

Capturing multiple display surfaces

Capture of multiple display surfaces is enabled through the addition of a new {{MediaDevices/getDisplayMediaSet}} method on the {{MediaDevices}} interface.

MediaDevices Additions

          partial interface MediaDevices {
            Promise<sequence<MediaStream>> getDisplayMediaSet();
          };
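
Since the method resolves with a sequence of {{MediaStream}} objects, ending the capture means stopping every track of every returned stream. The following is a minimal sketch under that assumption. The helper name stopAllCaptures is hypothetical, while getTracks() and stop() are the existing {{MediaStream}} and {{MediaStreamTrack}} methods.

```javascript
// Sketch: stop every track of every stream returned by getDisplayMediaSet().
// Returns the number of tracks stopped, which is handy for logging.
function stopAllCaptures(streams) {
  let stopped = 0;
  for (const stream of streams) {
    for (const track of stream.getTracks()) {
      track.stop(); // ends capture of the underlying display surface
      stopped += 1;
    }
  }
  return stopped;
}
```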
        
getDisplayMediaSet

Prompts the user for permission to live-capture multiple display surfaces.

The user agent MUST let the end-user choose which display surfaces to share out of all available choices every time.

The user agent MUST return a list of {{MediaStream}} objects, where each stream belongs to exactly one of the display surfaces selected by the user. If not all of the display surfaces selected by the user can be captured, {{MediaDevices/getDisplayMediaSet}} MUST return a promise rejected with a new {{InvalidStateError}}.
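
Applications will typically branch on the name of the rejection. The sketch below maps the {{DOMException}} names that this document uses to short explanations. The helper name describeCaptureError and the message wording are illustrative assumptions only.

```javascript
// Sketch: translate the rejection names used in this document into
// human-readable text. The messages are illustrative, not normative.
function describeCaptureError(error) {
  switch (error && error.name) {
    case 'InvalidStateError':
      return 'Capture was requested in an invalid state, or a selected surface could not be captured.';
    case 'NotAllowedError':
      return 'Permission to capture was denied.';
    case 'NotReadableError':
      return 'A selected surface could not be read.';
    case 'AbortError':
      return 'Capture failed for another reason.';
    default:
      return 'Unexpected error: ' + String(error);
  }
}
```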

When prompting the user to choose which display surfaces to capture, the user agent MUST allow the user to choose any individual display surface type. However, the user agent MAY restrict the choice so that surfaces of different types are not mixed.

{{PermissionState/"granted"}} permissions cannot be persisted.

When the {{MediaDevices/getDisplayMediaSet()}} method is called, the user agent MUST run the following steps:

  1. If the [=relevant global object=] of [=this=] does not have [=transient activation=], return a promise rejected with a {{DOMException}} object whose {{DOMException/name}} attribute has the value {{InvalidStateError}}.

  2. If the current settings object's [=relevant global object=]'s [=associated `Document`=] is NOT [=Document/fully active=] or does NOT have focus, return a promise [=reject|rejected=] with a {{DOMException}} object whose {{DOMException/name}} attribute has the value {{InvalidStateError}}.

  3. Let p be a new promise.

  4. Run the following steps in parallel:

    1. Optionally, e.g., based on a previously-established user preference, for security reasons, or due to platform limitations, jump to the step labeled Permission Failure below.

    2. [=Prompt the user to choose=] one or more display surfaces, for a {{PermissionDescriptor}} with its {{PermissionDescriptor/name}} set to "display-capture", resulting in a set of provided media.

      Each provided media MUST include precisely one video track.

      Each provided media MUST NOT include any audio track.

      The devices chosen MUST be the ones determined by the user. Once selected, the source of each {{MediaStreamTrack}} MUST NOT change, unless the user permits it through their interaction with the user agent.

      User agents are encouraged to warn users against sharing browser display surfaces as well as monitor display surfaces where browser windows are visible, or otherwise try to discourage their selection on the basis that these represent a significantly higher risk when shared.

      If the result of the request is {{PermissionState/"granted"}}, then for each device that is sourcing the provided media, using a stable and private id for the device, deviceId, set [[\devicesLiveMap]][deviceId] to true, if it isn’t already true, and set [[\devicesAccessibleMap]][deviceId] to true, if it isn’t already true.

      The user agent MUST NOT store a {{PermissionState/"granted"}} permission entry.

      If the result is {{PermissionState/"denied"}}, jump to the step labeled Permission Failure below. If the user never responds, this algorithm stalls on this step.

      If the user grants permission but a hardware error such as an OS/program/webpage lock prevents access to at least one device, reject p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{NotReadableError}} and abort these steps.

      If the result is {{PermissionState/"granted"}} but device access fails for any reason other than those listed above, reject p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{AbortError}} and abort these steps.

    3. Let streams be the list of {{MediaStream}} objects for which the user granted permission.

    4. Resolve p with streams and abort these steps.

    5. Permission Failure: [=Reject=] p with a new {{DOMException}} object whose {{DOMException/name}} attribute has the value {{NotAllowedError}}.

  5. Return p.
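
The control flow of the steps above can be modelled in script, for example to reason about which rejection a caller sees in which situation. This is a simplified, non-normative sketch: the env object, its field names, and the makeError helper are all hypothetical, the device-map bookkeeping is omitted, and a real user agent performs these checks internally.

```javascript
// Stand-in for `new DOMException(message, name)` so the sketch also runs
// outside a browser.
function makeError(name) {
  const error = new Error(name);
  error.name = name;
  return error;
}

// Hypothetical model of the getDisplayMediaSet() steps. `env` is invented for
// illustration: { hasTransientActivation, isFullyActive, hasFocus,
// blockedByPolicy, promptUser }.
function getDisplayMediaSetModel(env) {
  // Step 1: transient activation is required.
  if (!env.hasTransientActivation) {
    return Promise.reject(makeError('InvalidStateError'));
  }
  // Step 2: the document must be fully active and have focus.
  if (!env.isFullyActive || !env.hasFocus) {
    return Promise.reject(makeError('InvalidStateError'));
  }
  // Steps 3 to 5: create p, continue asynchronously, and return p.
  return (async () => {
    // Step 4.1: optional early exit to Permission Failure.
    if (env.blockedByPolicy) throw makeError('NotAllowedError');
    // Step 4.2: prompt the user; never settles if the user never responds.
    const result = await env.promptUser();
    if (result.state === 'denied') throw makeError('NotAllowedError');
    if (result.hardwareError) throw makeError('NotReadableError');
    if (result.otherError) throw makeError('AbortError');
    // Steps 4.3 and 4.4: resolve with the granted streams.
    return result.streams;
  })();
}
```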

The user agent MUST NOT capture content that's behind a partially transparent captured display surface.

For the newly created {{MediaStreamTrack}}s, the user agent MUST NOT capture the prompt that was shown to the user.

Information that is not currently rendered to the screen SHOULD be obscured in captures unless the application has been specifically authorized to access that content (e.g. through means such as elevated permissions).

The user agent MUST NOT share audio.
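
An application or test can check that the returned streams have the shape required above, namely exactly one video track and no audio track per stream. The following is a minimal sketch. The helper name hasExpectedShape is hypothetical, while getVideoTracks() and getAudioTracks() are the existing {{MediaStream}} methods.

```javascript
// Sketch: returns true when every stream carries exactly one video track and
// no audio track, as required of the media provided by getDisplayMediaSet().
function hasExpectedShape(streams) {
  return streams.every(
    (stream) =>
      stream.getVideoTracks().length === 1 &&
      stream.getAudioTracks().length === 0
  );
}
```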

This specification extends the sections Closed and Minimized Display Surfaces, Unconstrained Display Surface Selection, Constrainable Properties for Captured Display Surfaces, and Device Identifiers of {{MediaDevices/getDisplayMedia()}} to include {{MediaDevices/getDisplayMediaSet()}}.

Permissions Integration

This specification extends the Permissions Integration of {{MediaDevices/getDisplayMedia()}} to include {{MediaDevices/getDisplayMediaSet()}}.

Privacy Indicator Requirements

This specification extends the Privacy Indicator Requirements of {{MediaDevices/getDisplayMedia()}} to include {{MediaDevices/getDisplayMediaSet()}}.

Security and Permissions

This specification refers to the informative text on Security and Permissions considerations of {{MediaDevices/getDisplayMedia()}}, which applies to {{MediaDevices/getDisplayMediaSet()}} as well.