Controlled Frame API

Draft Community Group Report

This version:
TBD
Issue Tracking:
GitHub
Editors:
(Google LLC)
(Google LLC)

Abstract

This document defines an API for embedding arbitrary web content only within the context of an Isolated Web Application (IWA). The embedded content is a new top-level browsing context within and controlled by the embedder.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

This specification describes a content embedding API that satisfies some critical use cases for IWAs that iframe does not support. This embedding environment should allow embedding all content without express permission from the embedded site, including content which iframe cannot embed, and provide embedding sites more control over that embedded content.

Because this API is particularly powerful, an app that uses it, or that merely has it available, becomes a target for various types of attack. As a result, this API is limited to use in Isolated Web Applications (IWAs), which have additional safeguards in place to protect users and developers. IWAs are not normal web applications and can exist only at the special 'isolated-app:' scheme. By design, this means that this API will not be available to normal web pages.

Note: This API is not intended to be a replacement or substitute for iframe. All iframe use cases are still valid and should continue to use iframe, including IWAs where possible.

2. Usage Overview

Lorem ipsum. Insert basic info and example here.

3. Motivating Applications

This section is non-normative.

3.1. Latency-sensitive applications in virtualized sessions

In virtualized environments, users typically have a local thin client that renders a full virtual desktop. The actual desktop execution environment runs on a remote virtualization server. If the user’s browser navigates to a latency-sensitive application (such as a video app), the rendered content will have additional latency ("lag") that makes the experience difficult or impossible for the user. This also applies to applications that record the user, such as video conferencing applications. For these latency-sensitive applications, the virtual desktop application can render the latency-sensitive content locally and overlay it on top of the rendered remote content to reduce this latency. This use case is also known as "browser content redirection."

3.2. Embedding third party web content without restriction

In a kiosk environment, applications must load content from third parties and display that content on screens within their applications. A teacher may trigger the navigation event, or it may be configured by an administrator such as a shopping mall manager. The content may prohibit embedding by iframe through the use of X-Frame-Options and CSP. A controlled frame, however, should be able to load all content, even content that prohibits embedding by iframe.

3.3. Remote display and manipulation of web content

In a kiosk environment, applications must ensure that content continues to display on screens and may need to interrupt content with their own supplied behaviors. This behavior should work without local attendance by an administrator, and ideally can be managed remotely over the network. If content were to crash, for example, these applications should observe and respond to the crash by reloading the content in a fresh embedded view.

3.4. Clearing user content after each session

In some environments, a person uses a shared device only briefly to complete a task, such as ordering in a restaurant. When the task is complete, the embedder application should be able to clear all of the local user data associated with the task and then restart the embedded instance.

3.5. Monitor for idle sessions

While interacting with embedded content, users may not explicitly end their session. The content may assume the user is present when they have actually finished or departed without completing the task. Embedder applications want to detect when a user has been idle for longer than a threshold suited to their use case and then begin a fresh session.

3.6. Arbitrarily blocking navigations

When displaying embedded web content that’s not authored by the embedder, pages may link to third party web content that’s disallowed. Allowing the embedder to edit elements in embedded content through arbitrary script injection into the web content can ensure navigation cannot occur to blocked pages. The embedder can also use the Controlled Frame API to capture navigation events and ensure that only pages from approved sites can be loaded within that controlled frame.

4. Security, Privacy, and Accessibility Considerations

This section is non-normative.

4.1. Security

Controlled Frame is based upon [Isolated-Web-Apps] (IWA) and integrates with core security specs.

Since Controlled Frame is a particularly powerful API, using it, or even having it available, makes an app a target for various types of attack. As a result, this API is limited to use in IWAs, which have additional safeguards in place to protect application developers and users. The Isolated Web App explainer has this to say:

"A user agent may also force an application to adopt this threat model if the developer needs access to APIs which would make the application an appealing target for XSS or server-side attacks."

Controlled Frame makes just such an appealing target, so to expose it with caution we’re opting into IWA to guard against certain attacks. Generally, IWAs provide strong security assurances that each of the resources in an application is secure both at rest and in transit. You can read more about IWA security and permissions in the IWA explainer and the IWA [High-Watermark-Permissions] explainer.

Controlled Frame integrates with [Permissions-Policy] and [Permissions]. You can read more about Permissions Policy § 12. Privacy and Security and Permissions § E Security considerations (note the entry is currently sparse).

Attacking web sites could display content that doesn’t otherwise allow itself to be embedded and trick users on non-IWAs.

Planned mitigation:

An IWA may embed another IWA (or itself) via Controlled Frame to manipulate our IWA policies somehow (e.g. a Controlled Frame embedded IWA may detect it’s being embedded due to the absence of the "controlledframe" policy-controlled feature).

Planned mitigation:

Controlled Frame could gain access to the powerful <controlledframe> element.

An IWA that’s not expected to use Controlled Frame may attempt to embed content.

Planned mitigation:

An IWA may attempt to embed content from non-https schemes, such as 'http:' or 'isolated-app:'

Planned mitigation:

Malicious Controlled Frame could access the embedder’s running process (e.g. via a Spectre attack)

Planned mitigation:

Controlled Frame for a given "https origin" could interact or interfere with the user’s own data stored by the user agent for that https origin

Planned mitigation:

Malicious Controlled Frame could overwrite embedder’s stored data

Planned mitigation:

Malicious Controlled Frame may detect it is embedded and attempt to attack the embedder application

Planned mitigation:

Ideas:

User may not be able to verify the origin of the page being viewed in the Controlled Frame

Ideas:

Controlled Frame may exploit vulnerabilities in out-of-date browser engine

Already addressed with:

4.2. Privacy

Controlled Frame integrates with Permissions Policy and Permissions. You can read more about Permissions Policy § 12. Privacy and Security and Permissions § E Security considerations.

For Controlled Frame specifically, we’ve identified the following privacy considerations:

4.3. Accessibility

For Controlled Frame, we’ve identified the following accessibility considerations:

5. Concepts

6. API

6.1. Controlled Frame HTML Element

[Exposed=Window, SecureContext]
interface ControlledFrame : HTMLElement {
    [HTMLConstructor] constructor();

    [CEReactions] attribute USVString src;
    [CEReactions] attribute DOMString name;
    [CEReactions] attribute boolean allowfullscreen;
    [CEReactions] attribute boolean allowscaling;
    [CEReactions] attribute boolean allowtransparency;
    [CEReactions] attribute boolean autosize;
    [CEReactions] attribute DOMString maxheight;
    [CEReactions] attribute DOMString maxwidth;
    [CEReactions] attribute DOMString minheight;
    [CEReactions] attribute DOMString minwidth;
    attribute DOMString partition;

    readonly attribute WindowProxy? contentWindow;
    readonly attribute ContextMenus contextMenus;

    // Navigation methods.
    Promise<undefined> back();
    boolean canGoBack();
    boolean canGoForward();
    Promise<undefined> forward();
    Promise<undefined> go(long relativeIndex);
    undefined reload();
    undefined stop();

    // Scripting methods.
    undefined addContentScripts(sequence<ContentScriptDetails> contentScriptList);
    Promise<any> executeScript(optional InjectDetails details = {});
    Promise<undefined> insertCSS(optional InjectDetails details = {});
    undefined removeContentScripts(sequence<DOMString>? scriptNameList);

    // Configuration methods.
    Promise<undefined> clearData(
      optional ClearDataOptions options = {},
      optional ClearDataTypeSet types = {});
    Promise<boolean> getAudioState();
    Promise<long> getZoom();
    Promise<boolean> isAudioMuted();
    undefined setAudioMuted(boolean mute);
    Promise<undefined> setZoom(long zoomFactor);

    // Capture methods.
    undefined captureVisibleRegion();
    undefined print();
};

If the "controlled-frame" feature is enabled for an IWA in its manifest, each IWA frame will have access to a ControlledFrame element.
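As a rough illustration of how an embedder might use the element, the sketch below creates a <controlledframe> and points it at a page. It assumes the element can be created by tag name like any other HTML element. `DocumentLike`, `ControlledFrameLike`, and `createControlledFrame` are hypothetical names introduced here so the sketch type-checks without DOM typings, and the URL and partition name in the usage note are made up.

```typescript
// Structural subset of the ControlledFrame attributes declared in the IDL above.
interface ControlledFrameLike {
  src: string;
  name: string;
  partition: string;
}

// Hypothetical stand-in for the part of Document this sketch needs.
interface DocumentLike {
  createElement(tagName: string): unknown;
}

// Creates a controlled frame and sets its partition and source.
// Assumption: the element is created by tag name, like other HTML elements.
function createControlledFrame(
  doc: DocumentLike,
  url: string,
  partition: string,
): ControlledFrameLike {
  const frame = doc.createElement("controlledframe") as ControlledFrameLike;
  frame.partition = partition;
  frame.src = url;
  return frame;
}
```

In an IWA page this might be called as `createControlledFrame(document, "https://example.com/", "kiosk-session")`.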

6.2. Navigation methods

go()

Reloads the current page.

go(relativeIndex)

Goes back or forward relativeIndex number of steps in the overall session history entries list for the current traversable navigable.

A zero relative index will reload the current page.

If the relative index is out of range, does nothing.

back()

Goes back one step in the overall session history entries list for the traversable navigable in the Controlled Frame.

If there is no previous page, does nothing.

forward()

Goes forward one step in the overall session history entries list for the traversable navigable in the Controlled Frame.

If there is no next page, does nothing.

canGoBack()

Returns true if the current session history entry is not the first one in the navigation history entry list. This means that there is a previous session history entry for this navigable.

reload()

Reloads the current page.

stop()

Cancels the document load.
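The index arithmetic described for the navigation methods above can be sketched as a small pure model. This is not the user agent's algorithm: `HistoryModel`, `GoResult`, `resolveGo`, `resolveBack`, `resolveForward`, and `canGoBack` are hypothetical names, and session history is reduced to an entry count plus a current index.

```typescript
// Hypothetical simplified view of a session history entries list.
interface HistoryModel {
  length: number;       // number of session history entries
  currentIndex: number; // 0-based index of the current entry
}

type GoResult =
  | { kind: "reload" }
  | { kind: "traverse"; targetIndex: number }
  | { kind: "none" };

// Mirrors the prose for go(relativeIndex): zero reloads, an out-of-range
// index does nothing, and anything else traverses by that many steps.
function resolveGo(history: HistoryModel, relativeIndex: number): GoResult {
  if (relativeIndex === 0) {
    return { kind: "reload" };
  }
  const targetIndex = history.currentIndex + relativeIndex;
  if (targetIndex < 0 || targetIndex >= history.length) {
    return { kind: "none" };
  }
  return { kind: "traverse", targetIndex };
}

// back() and forward() behave like single-step traversals in this model.
function resolveBack(history: HistoryModel): GoResult {
  return resolveGo(history, -1);
}

function resolveForward(history: HistoryModel): GoResult {
  return resolveGo(history, 1);
}

// canGoBack() is true when the current entry is not the first one.
function canGoBack(history: HistoryModel): boolean {
  return history.currentIndex > 0;
}
```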

6.3. Scripting methods

// One of |code| or |file| must be specified but not both.
dictionary InjectDetails {
  DOMString code;
  DOMString file;
};

dictionary InjectionItems {
  DOMString code;
  sequence<DOMString> files;
};

enum RunAt {
  "document_start",
  "document_end",
  "document_idle",
};

dictionary ContentScriptDetails {
  boolean all_frames;
  InjectionItems css;
  sequence<DOMString> exclude_globs;
  sequence<DOMString> exclude_matches;
  sequence<DOMString> include_globs;
  InjectionItems js;
  boolean match_about_blank;
  required sequence<DOMString> matches;
  required DOMString name;
  RunAt run_at;
};
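
To make the dictionaries above concrete, the sketch below checks the InjectDetails constraint stated in the IDL comment and builds one ContentScriptDetails entry. The TypeScript shapes are hand-written mirrors of the IDL. `isValidInjectDetails`, the script name, and the CSS rule are made-up illustrations, and the match-pattern syntax is an assumption, since this section does not define it.

```typescript
interface InjectDetails {
  code?: string;
  file?: string;
}

interface InjectionItems {
  code?: string;
  files?: string[];
}

type RunAt = "document_start" | "document_end" | "document_idle";

interface ContentScriptDetails {
  all_frames?: boolean;
  css?: InjectionItems;
  exclude_globs?: string[];
  exclude_matches?: string[];
  include_globs?: string[];
  js?: InjectionItems;
  match_about_blank?: boolean;
  matches: string[]; // required in the IDL
  name: string;      // required in the IDL
  run_at?: RunAt;
}

// True only when exactly one of `code` or `file` is present.
function isValidInjectDetails(details: InjectDetails): boolean {
  const hasCode = details.code !== undefined;
  const hasFile = details.file !== undefined;
  return hasCode !== hasFile;
}

// Example entry that could be passed to addContentScripts([...]).
// The pattern syntax in `matches` is assumed, not defined by this section.
const hideBanner: ContentScriptDetails = {
  name: "hide-banner",
  matches: ["https://example.com/*"],
  css: { code: "#banner { display: none; }" },
  run_at: "document_end",
};
```

The `name` given here is the value an embedder would later pass to removeContentScripts() to remove the script.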

6.4. Configuration methods

dictionary ClearDataOptions {
  long since;
};

dictionary ClearDataTypeSet {
  boolean appcache;
  boolean cache;
  boolean cookies;
  boolean fileSystems;
  boolean indexedDB;
  boolean localStorage;
  boolean persistentCookies;
  boolean sessionCookies;
  boolean webSQL;
};
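
For the session-reset use case in section 3.4, an embedder might assemble the two clearData() arguments as below. This is only a sketch: `ClearDataRequest` and `buildSessionReset` are hypothetical names, the choice of data types is illustrative, and `since` is left unset because this section does not define its units.

```typescript
interface ClearDataOptions {
  since?: number;
}

interface ClearDataTypeSet {
  appcache?: boolean;
  cache?: boolean;
  cookies?: boolean;
  fileSystems?: boolean;
  indexedDB?: boolean;
  localStorage?: boolean;
  persistentCookies?: boolean;
  sessionCookies?: boolean;
  webSQL?: boolean;
}

// Hypothetical pairing of the two clearData() arguments.
interface ClearDataRequest {
  options: ClearDataOptions;
  types: ClearDataTypeSet;
}

// Builds a request that asks for the listed data types to be cleared,
// with no `since` restriction.
function buildSessionReset(): ClearDataRequest {
  return {
    options: {},
    types: {
      cache: true,
      cookies: true,
      fileSystems: true,
      indexedDB: true,
      localStorage: true,
    },
  };
}
```

The result could then be passed as `frame.clearData(request.options, request.types)`, followed by a reload once the returned promise settles.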

6.5. Capture methods

6.6. Event listener API

7. Controlled Frame API

enum ContextType {
    "all",
    "page",
    "frame",
    "selection",
    "link",
    "editable",
    "image",
    "video",
    "audio",
};

enum ItemType {
    "normal",
    "checkbox",
    "radio",
    "separator",
};

dictionary OnClickData {
    boolean checked;
    required boolean editable;
    long frameId;
    USVString frameUrl;
    USVString linkUrl;
    DOMString mediaType;
    required (DOMString or long) menuItemId;
    USVString pageUrl;
    (DOMString or long) parentMenuId;
    DOMString selectionText;
    USVString srcUrl;
    boolean wasChecked;
};

callback ContextMenusEventListener = undefined (OnClickData data);

dictionary ContextMenusProperties {
    boolean checked;
    sequence<ContextType> context;
    DOMString documentUrlPatterns;
    boolean enabled;
    DOMString parentId;
    DOMString targetUrlPatterns;
    DOMString title;
    ItemType type;
    ContextMenusEventListener onclick;
};

dictionary ContextMenusCreateProperties : ContextMenusProperties {
    DOMString id;
};

callback ContextMenusCallback = undefined ();

[Exposed=Window, SecureContext]
interface ContextMenus {
    // TODO: Define the `onShow` property.

    // Returns the ID of the newly created menu item.
    (DOMString or long) create(
        ContextMenusCreateProperties properties,
        ContextMenusCallback? callback);

    undefined remove(
        (DOMString or long) menuItemId,
        ContextMenusCallback? callback);
    undefined removeAll(ContextMenusCallback? callback);
    undefined update(
        (DOMString or long) id,
        ContextMenusProperties properties,
        ContextMenusCallback? callback);
};
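
A minimal sketch of registering a menu item through the ContextMenus interface follows. The TypeScript shapes mirror only a subset of the IDL above. `ContextMenusLike`, `addOpenLinkItem`, the item ID, and the title are hypothetical. Because `callback` is declared nullable rather than optional in the IDL, the sketch passes `null` explicitly.

```typescript
type ContextType =
  | "all" | "page" | "frame" | "selection" | "link"
  | "editable" | "image" | "video" | "audio";
type ItemType = "normal" | "checkbox" | "radio" | "separator";

// Subset of OnClickData used by this sketch.
interface OnClickData {
  editable: boolean;
  menuItemId: string | number;
  linkUrl?: string;
}

// Subset of ContextMenusCreateProperties used by this sketch.
interface ContextMenusCreateProperties {
  id?: string;
  title?: string;
  type?: ItemType;
  context?: ContextType[];
  onclick?: (data: OnClickData) => void;
}

// Structural subset of the ContextMenus interface.
interface ContextMenusLike {
  create(
    properties: ContextMenusCreateProperties,
    callback: (() => void) | null,
  ): string | number;
}

// Registers an item shown only for links and forwards the clicked link
// URL to the embedder-supplied handler.
function addOpenLinkItem(
  menus: ContextMenusLike,
  onLink: (url: string) => void,
): string | number {
  return menus.create(
    {
      id: "open-link",
      title: "Open link in kiosk view",
      type: "normal",
      context: ["link"],
      onclick: (data: OnClickData): void => {
        if (data.linkUrl !== undefined) {
          onLink(data.linkUrl);
        }
      },
    },
    null,
  );
}
```

With a real element this might be invoked as `addOpenLinkItem(frame.contextMenus, handler)`, where `handler` is the embedder's own function.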

8. Acknowledgements

The following people contributed to the development of this document.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[Permissions]
Marcos Caceres; Mike Taylor. Permissions. URL: https://w3c.github.io/permissions/
[Permissions-Policy]
Ian Clelland. Permissions Policy. URL: https://w3c.github.io/webappsec-permissions-policy/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/

Informative References

[High-Watermark-Permissions]
Robbie McElrath. Isolated Web Apps High Watermark Permissions Explainer. URL: https://github.com/WICG/isolated-web-apps/blob/main/Permissions.md
[Isolated-Web-Apps]
Reilly Grant. Isolated Web Apps Explainer. URL: https://github.com/WICG/isolated-web-apps/blob/main/README.md

IDL Index

[Exposed=Window, SecureContext]
interface ControlledFrame : HTMLElement {
    [HTMLConstructor] constructor();

    [CEReactions] attribute USVString src;
    [CEReactions] attribute DOMString name;
    [CEReactions] attribute boolean allowfullscreen;
    [CEReactions] attribute boolean allowscaling;
    [CEReactions] attribute boolean allowtransparency;
    [CEReactions] attribute boolean autosize;
    [CEReactions] attribute DOMString maxheight;
    [CEReactions] attribute DOMString maxwidth;
    [CEReactions] attribute DOMString minheight;
    [CEReactions] attribute DOMString minwidth;
    attribute DOMString partition;

    readonly attribute WindowProxy? contentWindow;
    readonly attribute ContextMenus contextMenus;

    // Navigation methods.
    Promise<undefined> back();
    boolean canGoBack();
    boolean canGoForward();
    Promise<undefined> forward();
    Promise<undefined> go(long relativeIndex);
    undefined reload();
    undefined stop();

    // Scripting methods.
    undefined addContentScripts(sequence<ContentScriptDetails> contentScriptList);
    Promise<any> executeScript(optional InjectDetails details = {});
    Promise<undefined> insertCSS(optional InjectDetails details = {});
    undefined removeContentScripts(sequence<DOMString>? scriptNameList);

    // Configuration methods.
    Promise<undefined> clearData(
      optional ClearDataOptions options = {},
      optional ClearDataTypeSet types = {});
    Promise<boolean> getAudioState();
    Promise<long> getZoom();
    Promise<boolean> isAudioMuted();
    undefined setAudioMuted(boolean mute);
    Promise<undefined> setZoom(long zoomFactor);

    // Capture methods.
    undefined captureVisibleRegion();
    undefined print();
};

// One of |code| or |file| must be specified but not both.
dictionary InjectDetails {
  DOMString code;
  DOMString file;
};

dictionary InjectionItems {
  DOMString code;
  sequence<DOMString> files;
};

enum RunAt {
  "document_start",
  "document_end",
  "document_idle",
};

dictionary ContentScriptDetails {
  boolean all_frames;
  InjectionItems css;
  sequence<DOMString> exclude_globs;
  sequence<DOMString> exclude_matches;
  sequence<DOMString> include_globs;
  InjectionItems js;
  boolean match_about_blank;
  required sequence<DOMString> matches;
  required DOMString name;
  RunAt run_at;
};

dictionary ClearDataOptions {
  long since;
};

dictionary ClearDataTypeSet {
  boolean appcache;
  boolean cache;
  boolean cookies;
  boolean fileSystems;
  boolean indexedDB;
  boolean localStorage;
  boolean persistentCookies;
  boolean sessionCookies;
  boolean webSQL;
};

enum ContextType {
    "all",
    "page",
    "frame",
    "selection",
    "link",
    "editable",
    "image",
    "video",
    "audio",
};

enum ItemType {
    "normal",
    "checkbox",
    "radio",
    "separator",
};

dictionary OnClickData {
    boolean checked;
    required boolean editable;
    long frameId;
    USVString frameUrl;
    USVString linkUrl;
    DOMString mediaType;
    required (DOMString or long) menuItemId;
    USVString pageUrl;
    (DOMString or long) parentMenuId;
    DOMString selectionText;
    USVString srcUrl;
    boolean wasChecked;
};

callback ContextMenusEventListener = undefined (OnClickData data);

dictionary ContextMenusProperties {
    boolean checked;
    sequence<ContextType> context;
    DOMString documentUrlPatterns;
    boolean enabled;
    DOMString parentId;
    DOMString targetUrlPatterns;
    DOMString title;
    ItemType type;
    ContextMenusEventListener onclick;
};

dictionary ContextMenusCreateProperties : ContextMenusProperties {
    DOMString id;
};

callback ContextMenusCallback = undefined ();

[Exposed=Window, SecureContext]
interface ContextMenus {
    // TODO: Define the `onShow` property.

    // Returns the ID of the newly created menu item.
    (DOMString or long) create(
        ContextMenusCreateProperties properties,
        ContextMenusCallback? callback);

    undefined remove(
        (DOMString or long) menuItemId,
        ContextMenusCallback? callback);
    undefined removeAll(ContextMenusCallback? callback);
    undefined update(
        (DOMString or long) id,
        ContextMenusProperties properties,
        ContextMenusCallback? callback);
};