Text Fragments

Draft Community Group Report

This version:
https://wicg.github.io/scroll-to-text-fragment/
Issue Tracking:
GitHub
Editors:
(Google)
(Google)

Abstract

Text Fragments adds support for specifying a text snippet in the URL fragment. When navigating to a URL with such a fragment, the user agent can quickly emphasise and/or bring it to the user’s attention.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Infrastructure

This specification depends on the Infra Standard. [INFRA]

2. Introduction

This section is non-normative

2.1. Use cases

2.1.1. Web text references

The core use case for text fragments is to allow URLs to serve as an exact text reference across the web. For example, Wikipedia references could link to the exact text they are quoting from a page. Similarly, search engines can serve URLs that direct the user to the answer they are looking for in the page rather than linking to the top of the page.

2.1.2. User sharing

With text fragments, browsers may implement an option to 'Copy URL to here' when the user opens the context menu on a text selection. The browser can then generate a URL with the text selection appropriately specified, and the recipient of the URL will have the specified text conveniently indicated. Without text fragments, if a user wants to share a passage of text from a page, they would likely just copy and paste the passage, in which case the receiver loses the context of the page.

3. Description

3.1. Indication

This section is non-normative

This specification intentionally doesn’t define what actions a user agent should or could take to "indicate" a text match. There are different experiences and trade-offs a user agent could make. Some examples of possible actions:

The choice of action can have implications for user security and privacy. See the § 3.4 Security and Privacy section for details.

3.2. Syntax

This section is non-normative

A text fragment directive is specified in the fragment directive (see § 3.3 The Fragment Directive) with the following format:

#:~:text=[prefix-,]textStart[,textEnd][,-suffix]
          context  |-------match-----|  context

(Square brackets indicate an optional parameter)

The text parameters are percent-decoded before matching. Dash (-), ampersand (&), and comma (,) characters in text parameters must be percent-encoded to avoid being interpreted as part of the text directive syntax.

The only required parameter is textStart. If only textStart is specified, the first instance of this exact text string is the target text.

#:~:text=an%20example%20text%20fragment indicates that the exact text "an example text fragment" is the target text.

If the textEnd parameter is also specified, then the text directive refers to a range of text in the page. The target text range starts at the first instance of textStart and extends to the first instance of textEnd that appears after textStart. This is equivalent to specifying the entire text range in the textStart parameter, but keeps the URL from being bloated by a long text directive.

#:~:text=an%20example,text%20fragment indicates that the first instance of "an example" until the following first instance of "text fragment" is the target text.

3.2.1. Context Terms

This section is non-normative

The other two optional parameters are context terms. They are specified by the dash (-) character succeeding the prefix and preceding the suffix, to differentiate them from the textStart and textEnd parameters, as any combination of optional parameters may be specified.

Context terms are used to disambiguate the target text fragment. The context terms can specify the text immediately before (prefix) and immediately after (suffix) the text fragment, allowing for whitespace.

While the context terms must be the immediate text surrounding the target text fragment, any amount of whitespace is allowed between context terms and the text fragment. This allows context terms to span element boundaries, for example if the target text fragment is at the beginning of a paragraph and must be disambiguated by the previous element’s text as a prefix.

The context terms are not part of the targeted text fragment and must not be visually indicated.

#:~:text=this%20is-,an%20example,-text%20fragment would match to "an example" in "this is an example text fragment", but not match to "an example" in "here is an example text".
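The syntax described above can be exercised with a small helper that assembles a text directive from its parts. This is a minimal sketch and not part of this specification: the names encode_term and build_text_directive are hypothetical, and Python's urllib.parse.quote is assumed as the percent-encoder. Because quote leaves the dash unescaped, the sketch encodes it by hand.

```python
from typing import Optional
from urllib.parse import quote


def encode_term(term: str) -> str:
    """Percent-encode one text parameter (hypothetical helper)."""
    # quote() with safe="" encodes "&" and "," but leaves "-" alone,
    # so the dash is encoded explicitly to keep it out of the syntax.
    return quote(term, safe="").replace("-", "%2D")


def build_text_directive(
    text_start: str,
    text_end: Optional[str] = None,
    prefix: Optional[str] = None,
    suffix: Optional[str] = None,
) -> str:
    """Assemble text=[prefix-,]textStart[,textEnd][,-suffix]."""
    parts = []
    if prefix is not None:
        parts.append(encode_term(prefix) + "-")
    parts.append(encode_term(text_start))
    if text_end is not None:
        parts.append(encode_term(text_end))
    if suffix is not None:
        parts.append("-" + encode_term(suffix))
    return "text=" + ",".join(parts)


# Reproduces the context-term example given above.
print("#:~:" + build_text_directive(
    "an example", prefix="this is", suffix="text fragment"))
```

The result is appended to a URL after the "#:~:" marker, as in the examples in this section.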

3.3. The Fragment Directive

To avoid compatibility issues with usage of existing URL fragments, this spec introduces the fragment directive. The fragment directive is a portion of the URL fragment delimited by the code sequence :~:. It is reserved for UA instructions, such as text=, and is stripped from the URL during loading so that author scripts can’t directly interact with it.

The fragment directive is a mechanism for URLs to specify instructions meant for the UA rather than the document. It’s meant to avoid direct interaction with author script so that future UA instructions can be added without fear of introducing breaking changes to existing content. Potential examples could be: translation-hints or enabling accessibility features.

3.3.1. Parsing the fragment directive

To the definition of Document, add:

Each document has an associated fragment directive which is either null or an ASCII string holding data used by the UA to process the resource. It is initially null.

The fragment directive delimiter is the string ":~:", that is the three consecutive code points U+003A (:), U+007E (~), U+003A (:).

The fragment directive is part of the URL fragment. This means it must always appear after a U+0023 (#) code point in a URL.
To add a fragment directive to a URL like https://example.com, a fragment must first be appended to the URL: https://example.com#:~:text=foo.

Amend the create and initialize a Document object steps to parse and remove the fragment directive from the Document’s URL.

Replace steps 7 and 8 of this algorithm with:

  1. Let url be null.

  2. If request is non-null, then set url to request’s current URL.

  3. Otherwise, set url to response’s URL.

  4. Let raw fragment be equal to url’s fragment.

  5. Let fragmentDirectivePosition be an integer initialized to 0.

  6. While the substring of raw fragment starting at position fragmentDirectivePosition does not begin with the fragment directive delimiter and fragmentDirectivePosition does not point past the end of raw fragment:

    1. Increment fragmentDirectivePosition by 1.

  7. If fragmentDirectivePosition does not point past the end of raw fragment:

    1. Let fragment be the substring of raw fragment starting at 0 of count fragmentDirectivePosition.

    2. Advance fragmentDirectivePosition by the length of fragment directive delimiter.

    3. Let fragment directive be the substring of raw fragment starting at fragmentDirectivePosition.

    4. Set url’s fragment to fragment.

    5. Set document’s fragment directive to fragment directive. (Note: this is stored on the document but not web-exposed)

  8. Set document’s URL to be url.

These changes make a URL’s fragment end at the fragment directive delimiter. The fragment directive consists of all characters that follow the delimiter, not including the delimiter itself.
https://example.org/#test:~:text=foo will be parsed such that the fragment is the string "test" and the fragment directive is the string "text=foo".
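The splitting performed by the amended steps can be sketched on a plain string. This is an illustration under stated assumptions rather than the normative algorithm: split_fragment is a hypothetical name, and str.find stands in for the position-by-position scan.

```python
from typing import Optional, Tuple

FRAGMENT_DIRECTIVE_DELIMITER = ":~:"


def split_fragment(raw_fragment: str) -> Tuple[str, Optional[str]]:
    """Return (fragment, fragment directive) for a raw URL fragment.

    The directive is None when the delimiter does not occur.
    """
    position = raw_fragment.find(FRAGMENT_DIRECTIVE_DELIMITER)
    if position == -1:
        # No delimiter: the whole string is the ordinary fragment.
        return raw_fragment, None
    fragment = raw_fragment[:position]
    directive = raw_fragment[position + len(FRAGMENT_DIRECTIVE_DELIMITER):]
    return fragment, directive


print(split_fragment("test:~:text=foo"))
```

Only the first occurrence of the delimiter splits the string; any later ":~:" stays inside the fragment directive.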

To parse a text directive, on a string text directive input, run these steps:

This algorithm takes a single text directive string as input (e.g. "text=prefix-,foo,bar") and attempts to parse the string into the components of the directive (e.g. ("prefix", "foo", "bar", null)). See § 3.2 Syntax for what each of these components means and how they’re used.

Returns null if the input is invalid or fails to parse in any way. Otherwise, returns a ParsedTextDirective.

  1. Assert: text directive input matches the production TextDirective.

  2. Let textDirectiveString be the substring of text directive input starting at index 5.

    This is the remainder of the text directive input following, but not including, the "text=" prefix.
  3. Let tokens be a list of strings that is the result of splitting textDirectiveString on commas.

  4. If tokens has size less than 1 or greater than 4, return null.

  5. If any of tokens’s items are the empty string, return null.

  6. Let retVal be a ParsedTextDirective with each of its items initialized to null.

  7. Let potential prefix be the first item of tokens.

  8. If the last character of potential prefix is U+002D (-), then:

    1. Set retVal’s prefix to the result of removing the last character from potential prefix.

    2. Remove the first item of the list tokens.

  9. Let potential suffix be the last item of tokens, if one exists, null otherwise.

  10. If potential suffix is non-null and its first character is U+002D (-), then:

    1. Set retVal’s suffix to the result of removing the first character from potential suffix.

    2. Remove the last item of the list tokens.

  11. If tokens has a size other than 1 or 2, return null.

  12. Set retVal’s textStart to the first item of tokens.

  13. If tokens has size 2, then set retVal’s textEnd to the last item of tokens.

  14. Return retVal.

A ParsedTextDirective is a struct that consists of four strings: textStart, textEnd, prefix, and suffix. textStart is required to be non-null. The other three items may be set to null, indicating they weren’t provided. The empty string is not a valid value for any of these items.

See § 3.2 Syntax for what each of these components means and how they’re used.
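The parse steps above translate fairly directly into code. The sketch below is illustrative only: the dataclass and function names are hypothetical, the initial assertion is replaced by a simple "text=" check that returns None, and percent-decoding of the terms is not shown because these steps do not perform it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedTextDirective:
    text_start: Optional[str] = None
    text_end: Optional[str] = None
    prefix: Optional[str] = None
    suffix: Optional[str] = None


def parse_text_directive(text_directive_input: str) -> Optional[ParsedTextDirective]:
    # Assumption: reject instead of asserting on a non-"text=" input.
    if not text_directive_input.startswith("text="):
        return None
    tokens = text_directive_input[5:].split(",")
    if len(tokens) < 1 or len(tokens) > 4:
        return None
    if any(token == "" for token in tokens):
        return None
    result = ParsedTextDirective()
    # A leading token ending in "-" is the prefix context term.
    if tokens[0].endswith("-"):
        result.prefix = tokens[0][:-1]
        tokens.pop(0)
    # A trailing token starting with "-" is the suffix context term.
    if tokens and tokens[-1].startswith("-"):
        result.suffix = tokens[-1][1:]
        tokens.pop()
    if len(tokens) not in (1, 2):
        return None
    result.text_start = tokens[0]
    if len(tokens) == 2:
        result.text_end = tokens[1]
    return result


print(parse_text_directive("text=prefix-,foo,bar"))
```

For the example input used above, the sketch yields a prefix of "prefix", a textStart of "foo", a textEnd of "bar" and a null suffix.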

3.3.2. Fragment directive grammar

A valid fragment directive is a sequence of characters that appears in the fragment directive that matches the production:

FragmentDirective ::=
(TextDirective | UnknownDirective) ("&" FragmentDirective)?
UnknownDirective ::=
CharacterString
CharacterString ::=
(ExplicitChar | PercentEncodedChar)+
ExplicitChar ::=
[a-zA-Z0-9] | "!" | "$" | "'" | "(" | ")" | "*" | "+" | "." | "/" | ":" | ";" | "=" | "?" | "@" | "_" | "~" | "&" | "," | "-"
An ExplicitChar may be any URL code point.
The FragmentDirective may contain multiple directives split by the "&" character. Currently this means we allow multiple text directives to enable multiple indicated strings in the page, but this also allows for future directive types to be added and combined. For extensibility, we do not fail to parse if an unknown directive is in the &-separated list of directives.

The text fragment directive is one such fragment directive that enables specifying a piece of text on the page, that matches the production:

TextDirective ::=
"text=" TextDirectiveParameters

TextDirectiveParameters ::=
(TextDirectivePrefix ",")? TextDirectiveString ("," TextDirectiveString)? ("," TextDirectiveSuffix)?

TextDirectivePrefix ::=
TextDirectiveString"-"

TextDirectiveSuffix ::=
"-"TextDirectiveString

TextDirectiveString ::=
(TextDirectiveExplicitChar | PercentEncodedChar)+

TextDirectiveExplicitChar ::=
[a-zA-Z0-9] | "!" | "$" | "'" | "(" | ")" | "*" | "+" | "." | "/" | ":" | ";" | "=" | "?" | "@" | "_" | "~"
A TextDirectiveExplicitChar may be any URL code point that is not explicitly used in the TextDirective syntax, that is "&", "-", and ",", which must be percent-encoded.

PercentEncodedChar ::=
"%" [a-zA-Z0-9]+
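The TextDirective production can be approximated with a regular expression, which is convenient for quick validity checks. This is a sketch under stated assumptions: is_text_directive is a hypothetical name, and PercentEncodedChar is written as "%" followed by a single alphanumeric, which accepts the same strings as the production above because any further alphanumerics also match TextDirectiveExplicitChar.

```python
import re

# TextDirectiveString: explicit characters or percent-encoded characters.
# "&", "-" and "," are deliberately absent from the character class.
_STRING = r"(?:[a-zA-Z0-9!$'()*+./:;=?@_~]|%[a-zA-Z0-9])+"

TEXT_DIRECTIVE = re.compile(
    rf"text=(?:{_STRING}-,)?{_STRING}(?:,{_STRING})?(?:,-{_STRING})?"
)


def is_text_directive(candidate: str) -> bool:
    """True if candidate matches the TextDirective production as a whole."""
    return TEXT_DIRECTIVE.fullmatch(candidate) is not None


print(is_text_directive("text=prefix-,foo,bar"))
print(is_text_directive("text=foo-bar"))
```

An unencoded dash inside a term, as in the second call, fails the check because the dash is reserved for marking context terms.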

3.4. Security and Privacy

3.4.1. Motivation

This section is non-normative

Care must be taken when implementing the text fragment directive so that it cannot be used to exfiltrate information across origins. Scripts can navigate a page to a cross-origin URL with a text fragment directive. If a malicious actor can determine that the text fragment was successfully found in a victim page as a result of such a navigation, they can infer the existence of any text on the page.

In addition, the user’s privacy should be ensured even from the destination origin. Although scripts on that page can already learn a lot about a user’s actions, a text fragment directive can still contain sensitive information. For this reason, this specification provides no way for a page to extract the content of the text fragment anchor. User agents must not expose this information to the page.

A user visiting a page listing dozens of medical conditions may have gotten there via a link with a text fragment directive containing a specific condition. This information must not be shared with the page.
TODO: This last paragraph and example are probably not necessary - the page can already determine what the user is looking at based on the viewport rect. It may not be desirable since it would prevent use cases like marginalia, allowing pages to provide UA and linking based on the text fragment.

The following subsections restrict the feature to mitigate the expected attack vectors. In summary, the text fragment directives are invoked only on full (non-same-page) navigations that are the result of a user activation. Additionally, navigations originating from a different origin than the destination will require the navigation to take place in a "noopener" context, such that the destination page is known to be sufficiently isolated.

3.4.2. Scroll On Navigation

A UA may choose to automatically scroll a matched text passage into view. This can be a convenient experience for the user but does present some risks that implementing UAs should be aware of.

There are known (and potentially unknown) ways a scroll on navigation might be detectable and distinguished from natural user scrolls.

An origin embedded in an iframe in the target page registers an IntersectionObserver and determines in the first 500ms of page load whether a scroll has occurred. This scroll can be indicative of whether the text fragment was successfully found on the page.
Two users share the same network on which traffic is visible between them. A malicious user sends the victim a link with a text fragment to a page. The searched-for text appears nearby to a resource located on a unique (on the page) domain. The attacker may be able to infer the success or failure of the fragment search based on the order of requests for DNS lookup.
A malicious page embeds a cross-origin victim in an iframe. The victim page contains information sensitive to the user. The malicious page navigates the victim to a text fragment. Since a successful fragment match will cause focus, the malicious page can determine if the text appears in the victim by listening for a blur event in its own document.
An attacker sends a link to a victim, sending them to a page that displays a private token. The attacker asks the victim to read back the token. Using a text fragment, the attacker gets the page to load for the victim such that warnings about keeping the token secret are scrolled out of view.

All known cases like this rely on specific circumstances about the target page, so they don’t apply generally. With additional restrictions on when the text fragment can be invoked, an attacker is further restricted. Nonetheless, different UAs can come to different conclusions about whether these risks are acceptable. UAs should consider these factors when determining whether to scroll as part of navigating to a text fragment.

Conforming UAs may choose not to scroll automatically on navigation. Such UAs may, instead, provide UI to initiate the scroll ("click to scroll") or none at all. In these cases the UA should provide some indication to the user that an indicated passage exists further down on the page.

The examples above illustrate that in specific circumstances, it may be possible for an attacker to extract 1 bit of information about content on the page. However, care must be taken so that such opportunities cannot be exploited to extract arbitrary content from the page by repeating the attack. For this reason, restrictions based on user activation and browsing context isolation are very important and must be implemented.

Browsing context isolation ensures that no other document can script the target document which helps reduce the attack surface.

However, it also ensures any malicious use is difficult to hide. A browsing context that’s the only one in a group must be a top level browsing context (i.e. a full tab/window).

If a UA does choose to scroll automatically, it must ensure no scrolling is performed while the document is in the background (for example, in an inactive tab). This ensures any malicious usage is visible to the user and prevents attackers from trying to secretly automate a search in background documents.

3.4.3. Search Timing

A naive implementation of the text search algorithm could allow information exfiltration based on runtime duration differences between a matching and non-matching query. If an attacker could find a way to synchronously navigate to a text fragment directive-invoking URL, they would be able to determine the existence of a text snippet by measuring how long the navigation call takes.

The restrictions in § 3.4.4 Restricting the Text Fragment should prevent this specific case; in particular, the no-same-document-navigation restriction. However, these restrictions are provided as multiple layers of defence.

For this reason, the implementation must ensure the runtime of § 3.5 Navigating to a Text Fragment steps does not differ based on whether a match has been successfully found.

This specification does not specify exactly how a UA achieves this as there are multiple solutions with differing tradeoffs. For example, a UA may continue to walk the tree even after a match is found in find a range from a text directive. Alternatively, it may schedule an asynchronous task to find and set the indicated part of the document.

3.4.4. Restricting the Text Fragment

To determine whether a navigation should allow a text fragment, given as input a boolean is user triggered, an origin incumbentNavigationOrigin, and Document document; follow these steps:

TODO: This should really only prevent potentially observable side-effects like automatic scrolling. Unobservable effects like a highlight should be safely allowed in all cases.
  1. If incumbentNavigationOrigin is null, return true.

    If a navigation originates from browser UI, it’s always ok to allow it since it’ll be user triggered and the page/script isn’t providing the text snippet.

    Note: Depending on the UA, there may be cases where the incumbentNavigationOrigin is null but it’s not clear that the navigation should be considered as initiated from browser UI. E.g. an "open in new window" context menu item when right clicking on a link. The intent in this item is to distinguish cases where the app/page is able to set the URL from those that are fully under the user’s control. In the former we want to prevent activation of the text fragment unless the destination is loaded in a separate browsing context group (so that the source cannot both control the text snippet and observe side-effects in the navigation).

    TODO: This seems to be very similar to sec-fetch-site so we may wish to integrate this with how that’s specified.

  2. If is user triggered is false, return false.

  3. If the document of the latest entry in document’s browsing context's session history is equal to document, return false.

    i.e. Forbidden on a same-document navigation.
  4. If incumbentNavigationOrigin is equal to the origin of document return true.

  5. If document’s browsing context is a top-level browsing context and its group’s browsing context set has length 1 return true.

    i.e. Only allow navigation from a cross-process element/script if the document is loaded in a noopener context. That is, a new top level browsing context group to which the navigator does not have script access and which may be placed into a separate process.
  6. Otherwise, return false.
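The decision steps above can be modelled as a pure function over a handful of flags. This is a simplified sketch: the parameter names are hypothetical, origins are compared as plain strings, and the session-history and browsing-context checks are assumed to have been reduced to booleans and a count by the caller.

```python
from typing import Optional


def should_allow_text_fragment(
    is_user_triggered: bool,
    incumbent_navigation_origin: Optional[str],
    document_origin: str,
    is_same_document_navigation: bool,
    is_top_level: bool,
    browsing_context_group_size: int,
) -> bool:
    # Step 1: navigations from browser UI have no incumbent origin.
    if incumbent_navigation_origin is None:
        return True
    # Step 2: script-initiated navigations need a user activation.
    if not is_user_triggered:
        return False
    # Step 3: forbidden on a same-document navigation.
    if is_same_document_navigation:
        return False
    # Step 4: same-origin navigations are allowed.
    if incumbent_navigation_origin == document_origin:
        return True
    # Step 5: cross-origin is allowed only in an isolated ("noopener")
    # top-level browsing context that is alone in its group.
    if is_top_level and browsing_context_group_size == 1:
        return True
    # Step 6.
    return False


print(should_allow_text_fragment(
    True, "https://a.example", "https://b.example", False, True, 1))
```

The order of the checks matters: a missing user activation rejects the navigation before the same-origin and isolation checks are considered.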

To set the allowTextFragmentDirective flag, follow these steps:

The algorithm that determines whether a text fragment directive is allowed to invoke must be run during document navigation and creation, and its result must be stored as a flag. The algorithm relies on properties of the navigation, whereas the invocation occurs as part of the scroll to the fragment steps, which can happen outside the context of a navigation.

Amend the page load processing model for HTML files to insert these steps after step 1:

  1. Let is user activated be true if the current navigation was initiated from a window that had a transient activation at the time the navigation was initiated.

    TODO: This requires tracking the user activation state through a navigation which is currently unspecified. Something along these lines is being specified in Sec-Fetch-User. Perhaps we could generalize that for use here and elsewhere?
  2. Set document’s allowTextFragmentDirective flag to the result of running should allow a text fragment with is user activated, incumbentNavigationOrigin, and the document.

Amend the try to scroll to the fragment steps by replacing the steps of the task queued in step 2:

  1. If document has no parser, or its parser has stopped parsing, or the user agent has reason to believe the user is no longer interested in scrolling to the fragment, then clear document’s allowTextFragmentDirective flag and abort these steps.

  2. Scroll to the fragment given in document’s URL. If this does not find an indicated part of the document, then try to scroll to the fragment for document.

  3. Clear document’s allowTextFragmentDirective flag.

3.5. Navigating to a Text Fragment

The text fragment specification proposes an amendment to HTML 5 §7.10.9 Navigating to a fragment. In summary, if a text fragment directive is present and a match is found in the page, the text fragment takes precedence over the element fragment as the indicated part of the document. We amend the indicated part of the document to optionally include a range that may be scrolled into view instead of the containing element.

Replace step 3.1 of the scroll to the fragment algorithm with the following:

  1. Otherwise:

    1. Let target, range be the element and range that is the indicated part of the document.

Replace step 3.3 of the scroll to the fragment algorithm with the following:

  1. Otherwise:

    1. If range is non-null:

      1. If the UA supports scrolling of text fragments on navigation, invoke Scroll range into view, with containingElement target, behavior set to "auto", block set to "center", and inline set to "nearest".

    2. Otherwise:

      1. Scroll target into view, with behavior set to "auto", block set to "start", and inline set to "nearest".

        This otherwise case is the same as the current step 3.3.

Add the following steps to the beginning of the processing model for the indicated part of the document:

  1. Let fragment directive string be the document’s fragment directive.

  2. If the document’s allowTextFragmentDirective flag is true then:

    1. Let ranges be a list that is the result of running the process a fragment directive steps with fragment directive string and the document.

    2. If ranges is non-empty, then:

      1. Let range be the first item of ranges.

        The first range in ranges is specifically scrolled into view. This range, along with the remaining ranges should be visually indicated in a way that is not revealed to script, which is left as UA-defined behavior.
      2. Let node be the first common ancestor of range’s start node and end node.

      3. While node is not an element, set node to node’s parent.

      4. The indicated part of the document is node and range; return.

To find the first common ancestor of two nodes nodeA and nodeB, follow these steps:

  1. Let commonAncestor be nodeA.

  2. While commonAncestor is not a shadow-including inclusive ancestor of nodeB, let commonAncestor be commonAncestor’s shadow-including parent.

  3. Return commonAncestor.

To find the shadow-including parent of node follow these steps:

  1. If node is a shadow root, return node’s host.

  2. Otherwise, return node’s parent.
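The two helper algorithms above can be tried out on a toy tree. The Node class below is a hypothetical stand-in for DOM nodes (a shadow root is modelled as a node with a host and no parent), and the sketch returns None when the two nodes share no ancestor, a case the steps above do not address.

```python
class Node:
    """Toy node: 'host' is set only for shadow roots (assumption)."""

    def __init__(self, name, parent=None, host=None):
        self.name = name
        self.parent = parent
        self.host = host


def shadow_including_parent(node):
    # A shadow root's shadow-including parent is its host.
    if node.host is not None:
        return node.host
    return node.parent


def is_shadow_including_inclusive_ancestor(ancestor, node):
    current = node
    while current is not None:
        if current is ancestor:
            return True
        current = shadow_including_parent(current)
    return False


def first_common_ancestor(node_a, node_b):
    common = node_a
    # Walk upward from node_a until the candidate also contains node_b.
    while common is not None and not is_shadow_including_inclusive_ancestor(common, node_b):
        common = shadow_including_parent(common)
    return common


root = Node("root")
host = Node("host", parent=root)
shadow = Node("shadow-root", host=host)
inner = Node("inner", parent=shadow)
sibling = Node("sibling", parent=root)
print(first_common_ancestor(inner, sibling).name)
```

Because the walk crosses from a shadow root to its host, a range that starts inside a shadow tree and ends in the light tree still resolves to an ancestor in the light tree.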

3.5.1. Scroll a DOMRect into view

This section describes a refactoring of the CSSOMVIEW’s scroll an element into view algorithm to separate the steps for scrolling a DOMRect into view, so it can be used to scroll a Range into view.

Move the scroll an element into view algorithm’s steps 3-14 into a new algorithm scroll a DOMRect into view, with input DOMRect bounding box, ScrollIntoViewOptions dictionary options, and element startingElement.

Also move the recursive behavior described at the top of the scroll an element into view algorithm to the scroll a DOMRect into view algorithm: "run these steps for each ancestor element or viewport of startingElement that establishes a scrolling box scrolling box, in order of innermost to outermost scrolling box".

bounding box is renamed from element bounding border box.

Replace steps 3-14 of the scroll an element into view algorithm with a call to scroll a DOMRect into view:

  1. Perform scroll a DOMRect into view given element bounding border box, options and element.

Define a new algorithm scroll a Range into view, with input range range, element containingElement, and a ScrollIntoViewOptions dictionary options:

  1. Let bounding rect be the DOMRect that is the return value of invoking getBoundingClientRect() on range.

  2. Perform scroll a DOMRect into view given bounding rect, options, and containingElement.

3.5.2. Finding Ranges in a Document

This section outlines several algorithms and definitions that specify how to turn a full fragment directive string into a list of Ranges in the document.

At a high level, we take a fragment directive string that looks like this:

text=prefix-,foo&unknown&text=bar,baz

We break this up into the individual text directives:

text=prefix-,foo
text=bar,baz

For each text directive, we perform a search in the document for the first instance of rendered text that matches the restrictions in the directive. Each search is independent of any others; that is, the result is the same regardless of how many other directives are provided or their match result.

If a directive successfully matches to text in the document, it returns a range indicating that match in the document. The process a fragment directive steps are the high level API provided by this section. These return a list of ranges that were matched by the individual directive matching steps, in the order the directives were specified in the fragment directive string.

If a directive was not matched, it does not add an item to the returned list.

To process a fragment directive, given as input a string fragment directive input and a Document document, run these steps:

This algorithm takes as input the fragment directive input, that is, the raw text of the fragment directive, and the document over which it operates. It returns a list of ranges that are to be visually indicated, the first of which may be scrolled into view (if the UA scrolls automatically).
  1. If fragment directive input is not a valid fragment directive, then return an empty list.

  2. Let directives be a list of strings that is the result of strictly splitting the string fragment directive input on "&".

  3. Let ranges be a list of ranges, initially empty.

  4. For each string directive of directives:

    1. If directive does not match the production TextDirective, then continue.

    2. Let parsedValues be the result of running the parse a text directive steps on directive.

    3. If parsedValues is null then continue.

    4. If the result of running find a range from a text directive given parsedValues and document is non-null, then append it to ranges.

  5. Return ranges.
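The directive-splitting part of these steps can be shown in a few lines. This is a simplified sketch: text_directives is a hypothetical name, a "text=" prefix check stands in for matching the TextDirective production, and the per-directive parsing and range search are omitted.

```python
from typing import List


def text_directives(fragment_directive_input: str) -> List[str]:
    """Split on "&" and keep only directives that look like text directives.

    Unknown directives are skipped rather than treated as errors.
    """
    return [
        directive
        for directive in fragment_directive_input.split("&")
        if directive.startswith("text=")
    ]


print(text_directives("text=prefix-,foo&unknown&text=bar,baz"))
```

The result preserves the order in which the directives appear, which matters because the first matched range is the one that may be scrolled into view.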

To find a range from a text directive, given a ParsedTextDirective parsedValues and Document document, run the following steps:

This algorithm takes as input a successfully parsed text directive and a document in which to search. It returns a range that points to the first text passage within the document that matches the searched-for text and satisfies the surrounding context. Returns null if no such passage exists.

textEnd may be null. If omitted, this is an "exact" search and the returned range must contain a string exactly matching textStart. If textEnd is provided, this is a "range" search; the returned range must start with textStart and end with textEnd. In the normative text below, we’ll call a text passage that matches the provided textStart and textEnd, regardless of which mode we’re in, the "matching text".

Either or both of prefix and suffix may be null, in which case context on that side of a match is not checked. E.g. If prefix is null, text is matched without any requirement on what text precedes it.

While the matching text and its prefix/suffix can span across block-boundaries, the individual parameters to these steps cannot. That is, each of prefix, textStart, textEnd, and suffix will only match text within a single block.
:~:text=The quick,lazy dog
will fail to match in
    <div>The<div> </div>quick brown fox</div>
    <div>jumped over the lazy dog</div>

because the starting string "The quick" does not appear within a single, uninterrupted block. The instance of "The quick" in the document has a block element between "The" and "quick".

It does, however, match in this example:

    <div>The quick brown fox</div>
    <div>jumped over the lazy dog</div>
  1. Let searchRange be a range with start (document, 0) and end (document, document’s length)

  2. While searchRange is not collapsed:

    1. Let potentialMatch be null.

    2. If parsedValues’s prefix is not null:

      1. Let prefixMatch be the result of running the find a string in range steps given parsedValues’s prefix and searchRange.

      2. If prefixMatch is null, return null.

      3. Set searchRange’s start to the first boundary point after prefixMatch’s start.

      4. Let matchRange be a range whose start is prefixMatch’s end and end is searchRange’s end.

      5. Advance matchRange’s start to the next non-whitespace position.

      6. If matchRange is collapsed return null.

        This can happen if prefixMatch’s end or its subsequent non-whitespace position is at the end of the document.
      7. Assert: matchRange’s start node is a Text node.

        matchRange’s start now points to the next non-whitespace text data following a matched prefix.
      8. Set potentialMatch to the result of running the find a string in range steps given parsedValues’s textStart and matchRange.

      9. If potentialMatch is null, return null.

      10. If potentialMatch’s start is not matchRange’s start, then continue.

        In this case, we found a prefix but it was followed by something other than a matching text so we’ll continue searching for the next instance of prefix.
      11. If parsedValues’s textEnd item is non-null, then:

        1. Let textEndRange be a range whose start is potentialMatch’s end and whose end is searchRange’s end.

        2. Let textEndMatch be the result of running the find a string in range steps given parsedValues’s textEnd and textEndRange.

        3. If textEndMatch is null then return null.

        4. Set potentialMatch’s end to textEndMatch’s end.

    3. Otherwise:

      1. Set potentialMatch to the result of running the find a string in range steps given parsedValues’s textStart and searchRange.

      2. If potentialMatch is null, return null.

      3. Set searchRange’s start to the first boundary point after potentialMatch’s start

      4. If parsedValues’s textEnd item is non-null, then:

        1. Let textEndRange be a range whose start is potentialMatch’s end and whose end is searchRange’s end.

        2. Let textEndMatch be the result of running the find a string in range steps given parsedValues’s textEnd and textEndRange.

        3. If textEndMatch is null then return null.

        4. Set potentialMatch’s end to textEndMatch’s end.

    4. Assert: potentialMatch is non-null, not collapsed and represents a range exactly containing an instance of matching text.

    5. If parsedValues’s suffix is null, return potentialMatch.

    6. Let suffixRange be a range with start equal to potentialMatch’s end and end equal to searchRange’s end.

    7. Advance suffixRange’s start to the next non-whitespace position.

    8. Let suffixMatch be the result of running the find a string in range steps given parsedValues’s suffix and suffixRange.

    9. If suffixMatch is null then return null.

      If the suffix doesn’t appear in the remaining text of the document, there’s no possible way to make a match.
    10. If suffixMatch’s start is suffixRange’s start, return potentialMatch.

To advance a range range’s start to the next non-whitespace position follow the steps:

  1. While range is not collapsed:

    1. Let node be range’s start node.

    2. Let offset be range’s start offset.

    3. If node is part of a non-searchable subtree then:

      1. Set range’s start node to the next node, in shadow-including tree order, that isn’t a shadow-including descendant of node.

      2. Continue.

    4. If node is not a visible text node:

      1. Set range’s start node to the next node, in shadow-including tree order.

      2. Continue.

    5. If the substring data of node at offset offset and count 6 is equal to the string "&nbsp;" then:

      1. Add 6 to range’s start offset.

    6. Otherwise, if the substring data of node at offset offset and count 5 is equal to the string "&nbsp" then:

      1. Add 5 to range’s start offset.

    7. Otherwise:

      1. Let cp be the code point at the offset index in node’s data.

      2. If cp does not have the White_Space property set, return.

      3. Add 1 to range’s start offset.

    8. If range’s start offset is equal to node’s length, set range’s start node to the next node in shadow-including tree order.
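
The steps above can be sketched on a simplified model. This is a minimal sketch, not the normative algorithm: it assumes the document is already reduced to a list of visible, searchable Text node strings (so steps 3 and 4 are skipped), it represents a boundary point as a hypothetical `(node_index, offset)` tuple, and it uses Python's `str.isspace()` as an approximation of the Unicode White_Space property.

```python
from typing import List, Tuple

# Hypothetical model of a boundary point: (index into nodes, offset in that node's data).
Position = Tuple[int, int]

def advance_to_non_whitespace(nodes: List[str], start: Position, end: Position) -> Position:
    """Move start forward past whitespace, never beyond end."""
    node_i, offset = start
    while (node_i, offset) < end:              # "while range is not collapsed"
        data = nodes[node_i]
        if offset >= len(data):                # nothing left in this node
            node_i, offset = node_i + 1, 0
            continue
        if data.startswith("&nbsp;", offset):  # literal "&nbsp;" text, as in step 5
            offset += 6
        elif data.startswith("&nbsp", offset): # literal "&nbsp" text, as in step 6
            offset += 5
        elif data[offset].isspace():           # approximation of White_Space
            offset += 1
        else:
            return (node_i, offset)            # found non-whitespace text
        if offset == len(data):                # step 8: move on to the next node
            node_i, offset = node_i + 1, 0
    return min((node_i, offset), end)          # collapsed: clamp to the range end

nodes = ["prefix  ", " \n", "match text"]
print(advance_to_non_whitespace(nodes, (0, 6), (2, 10)))
```

In this model a returned position equal to `end` corresponds to a collapsed range, which is the case where the calling steps return null.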

To find a string in range for a string query in a given range searchRange, run these steps:

This algorithm will return a range that represents the first word bounded instance fully contained within range of the query text. Returns null if none is found.

The basic premise of this algorithm is to walk all searchable text nodes within a block, collecting them into a list. The list is then concatenated into a single string in which we can search, using the node list to determine offsets within a node so we can return a range.

Collection breaks when we hit a block node, e.g. searching over this tree:

      <div>
        a<em>b</em>c<div>d</div>e
      </div>

Will perform a search on "abc", then on "d", then on "e".

Thus, query will only match text that is continuous (i.e. uninterrupted by a block-level container) within a single block-level container.

  1. While searchRange is not collapsed:

    1. Let curNode be searchRange’s start node.

    2. If curNode is part of a non-searchable subtree:

      1. Set searchRange’s start node to the next node, in shadow-including tree order, that isn’t a shadow-including descendant of curNode.

      2. Continue.

    3. If curNode is not a visible text node:

      1. Set searchRange’s start node to the next node, in shadow-including tree order.

      2. Continue.

    4. Otherwise:

      1. Let blockAncestor be the nearest block ancestor of curNode.

      2. Let textNodeList be a list of Text nodes, initially empty.

      3. While curNode is a shadow-including descendant of blockAncestor and it does not follow searchRange’s end node:

        1. If curNode has block-level display then break.

        2. If curNode is search invisible:

          1. Set curNode to the next node in shadow-including tree order whose ancestor is not curNode.

          2. Continue.

        3. If curNode is a visible text node then append it to textNodeList.

        4. Set curNode to the next node in shadow-including tree order.

      4. Run the find a range from a node list steps given query, searchRange, and textNodeList, as input. If the resulting range is not null, then return it.

      5. Assert: curNode follows searchRange’s start node.

      6. Set searchRange’s start to the boundary point (curNode, 0).

  2. Return null.
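
The way text collection breaks at block boundaries can be illustrated with a small sketch. It is a simplification under stated assumptions: `Element` is a hypothetical stand-in for a DOM element, a `block` flag replaces the computed display check, every text node is treated as visible, and the range bookkeeping is left out. It reproduces only the grouping of text into searchable runs.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Element:
    """Hypothetical stand-in for a DOM element; strings stand in for Text nodes."""
    tag: str
    block: bool = False
    children: List[Union["Element", str]] = field(default_factory=list)

def text_runs(root: Element) -> List[str]:
    """Return the strings that would each be searched as one continuous buffer."""
    runs: List[str] = []
    current: List[str] = []

    def flush() -> None:
        if current:
            runs.append("".join(current))
            current.clear()

    def walk(node: Union[Element, str]) -> None:
        if isinstance(node, str):
            current.append(node)       # text joins the current run
            return
        if node.block:
            flush()                    # entering a block ends the run in progress
        for child in node.children:
            walk(child)
        if node.block:
            flush()                    # leaving a block ends its last run

    walk(root)
    flush()
    return runs

# The tree from the example above: <div>a<em>b</em>c<div>d</div>e</div>
tree = Element("div", True, [
    "a", Element("em", False, ["b"]), "c",
    Element("div", True, ["d"]),
    "e",
])
print(text_runs(tree))
```

Because the inline `em` does not end a run while the nested `div` does, the query can match "abc" but never "cd" or "de".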

A node is search invisible if it is in the HTML namespace and meets any of the following conditions:

  1. The computed value of its display property is none.

  2. It serializes as void.

  3. It is any of the following types: HTMLIFrameElement, HTMLImageElement, HTMLMeterElement, HTMLObjectElement, HTMLProgressElement, HTMLStyleElement, HTMLScriptElement, HTMLVideoElement, HTMLAudioElement.

  4. It is a select element whose multiple content attribute is absent.

A node is part of a non-searchable subtree if it is or has an ancestor that is search invisible.

A node is a visible text node if it is a Text node, the computed value of its visibility property is visible, and it is being rendered.

A node has block-level display if the computed value of its display property is any of block, table, flow-root, grid, flex, list-item.

To find the nearest block ancestor of a node follow the steps:

  1. While node is non-null:

    1. If node is not a Text node and it has block-level display then return node.

    2. Otherwise, set node to node’s parent.

  2. Return node’s node document's document element.

To find a range from a node list given a search string queryString, a range searchRange, and a list of nodes nodes, follow these steps:

This will only return a match if the matched text falls on word boundaries. That is, the text match must begin and end on a word boundary. For example:
“range” will match in “mountain range” but not in “color orange” nor “forest ranger”.

See § 3.5.3 Word Boundaries for details and more examples.

  1. Assert: each item in nodes is a Text node.

  2. Let searchBuffer be the concatenation of the data of each item in nodes.

  3. Let searchStart be 0.

  4. If the first item in nodes is searchRange’s start node then set searchStart to searchRange’s start offset.

  5. Let start and end be boundary points, initially null.

  6. Let matchIndex be null.

  7. While matchIndex is null:

    1. Set matchIndex to the index of the first instance of queryString in searchBuffer, starting at searchStart. The string search must be performed using a base character comparison, or the primary level, as defined in [UTS10].

      Intuitively, this is a case-insensitive search also ignoring accents and other marks.
    2. Let endIx be matchIndex + queryString’s length.

      endIx is the index of the last character in the match + 1.
    3. Set start to the boundary point result of get boundary point at index matchIndex run over nodes with isEnd false.

    4. Set end to the boundary point result of get boundary point at index endIx run over nodes with isEnd true.

    5. If the substring of searchBuffer starting at matchIndex and of length queryString’s length is not word bounded, given the language from each of start and end’s nodes as the startLocale and endLocale:

      1. Set searchStart to matchIndex + 1.

      2. Set matchIndex to null.

  8. Let endInset be 0.

  9. If the last item in nodes is searchRange’s end node then set endInset to (searchRange’s end node's length − searchRange’s end offset).

    endInset is the offset from the last position in the last node in the reverse direction. Alternatively, it is the length of the node that’s not included in the range.
  10. If matchIndex + queryString’s length is greater than or equal to searchBuffer’s length − endInset return null.

    If the match runs past the end of the search range, return null.
  11. Assert: start and end are non-null, valid boundary points in searchRange.

  12. Return a range with start start and end end.
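
The search loop in step 7 can be sketched as follows. This is an approximation under stated assumptions, not the normative behaviour: the hypothetical `fold` helper stands in for the [UTS10] primary-level comparison by stripping combining marks and lowercasing one character at a time, the word-boundary test only checks whether the neighbouring characters are alphanumeric, and the end-of-range check of steps 8 to 10 is omitted. The function returns indices into the concatenated buffer rather than boundary points, and returns `None` when the query does not occur at all.

```python
import unicodedata
from typing import List, Optional, Tuple

def fold(text: str) -> str:
    """Length-preserving stand-in for a base-character (primary level) comparison."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        lowered = base.lower()
        # Keep the original character if folding would change the length.
        out.append(lowered if len(lowered) == 1 else ch)
    return "".join(out)

def find_word_bounded(query: str, nodes: List[str],
                      search_start: int = 0) -> Optional[Tuple[int, int]]:
    """Return (matchIndex, endIx) of the first word-bounded match, or None."""
    buffer = fold("".join(nodes))          # searchBuffer
    needle = fold(query)
    while True:
        match_index = buffer.find(needle, search_start)
        if match_index == -1:
            return None
        end_index = match_index + len(needle)
        before_ok = match_index == 0 or not buffer[match_index - 1].isalnum()
        after_ok = end_index == len(buffer) or not buffer[end_index].isalnum()
        if before_ok and after_ok:
            return (match_index, end_index)
        search_start = match_index + 1     # not word bounded: retry one index later

print(find_word_bounded("range", ["color orange, forest ranger, ", "mountain RANGE"]))
```

The candidates inside "orange" and "ranger" are rejected and the search resumes one index later, so only the final "RANGE" is reported.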

To get boundary point at index, given an integer index, list of Text nodes nodes, and a boolean isEnd, follow these steps:

This is a small helper routine used by the steps above to determine which node a given index in the concatenated string belongs to.

isEnd is used to differentiate start and end indices. An end index points to the "one-past-last" character of the matching string. If the match ends at a node boundary, we want the end offset to remain within that node, rather than the start of the next node.

  1. Let counted be 0.

  2. For each curNode of nodes:

    1. Let nodeEnd be counted + curNode’s length.

    2. If isEnd is true, add 1 to nodeEnd.

    3. If nodeEnd is greater than index then:

      1. Return the boundary point (curNode, index − counted).

    4. Increment counted by curNode’s length.

  3. Return null.
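
A direct transcription of these steps may help show the effect of isEnd. The sketch assumes Text nodes are modelled as plain strings and a boundary point as a hypothetical `(node_index, offset)` tuple.

```python
from typing import List, Optional, Tuple

def get_boundary_point_at_index(index: int, nodes: List[str],
                                is_end: bool) -> Optional[Tuple[int, int]]:
    """Map an index in the concatenated text back to (node_index, offset)."""
    counted = 0
    for node_i, data in enumerate(nodes):
        node_end = counted + len(data)
        if is_end:
            node_end += 1          # lets an end index sit at the very end of a node
        if node_end > index:
            return (node_i, index - counted)
        counted += len(data)
    return None

nodes = ["abc", "def"]
# Index 3 is both "one past c" and "at d"; isEnd decides which node owns it.
print(get_boundary_point_at_index(3, nodes, is_end=False))
print(get_boundary_point_at_index(3, nodes, is_end=True))
```

With isEnd false the index resolves to the start of the second node, and with isEnd true it stays at the end of the first node.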

3.5.3. Word Boundaries

Limiting matching to word boundaries is one of the mitigations to limit cross-origin information leakage.
See Intl.Segmenter, a proposal to specify Unicode segmentation, including word segmentation. Once specified, this algorithm may be improved by making use of the Intl.Segmenter API for word boundary matching.

A word boundary is defined in [UAX29] in Unicode Text Segmentation §Word_Boundaries. Unicode Text Segmentation §Default_Word_Boundaries defines a default set of what constitutes a word boundary, but as the specification mentions, a more sophisticated algorithm should be used based on the locale.

Dictionary-based word bounding should take specific care in locales without a word-separating character. For example, in English, words are separated by the space character (' '); however, in Japanese there is no character that separates one word from the next. In such cases, and where the alphabet contains fewer than 100 characters, the dictionary must not contain more than 20% of the alphabet as valid, one-letter words.

To determine if a substring of a larger string is word bounded, given a string text, an integer startPosition, number count, and locales startLocale and endLocale, follow these steps:

startLocale and endLocale must be a valid [BCP47] language tag, or the empty string. An empty string indicates that the primary language is unknown.

startPosition and count represent a substring in text. startLocale and endLocale specify the language of the string at each end of the match.

startPosition and count are assumed to be valid in that they represent a substring within the bounds of text.

Intuitively, a substring is word bounded if it neither begins nor ends in the middle of a word.

In languages with a word separator (e.g. " " space) this is (mostly) straightforward; though there are details covered by the above technical reports such as new lines, hyphenations, quotes, etc.

Some languages do not have such a separator (notably, Chinese/Japanese/Korean). Languages such as these require dictionaries to determine what a valid word in the given locale is.

  1. Using locale startLocale, let left bound be the last word boundary in text that precedes the startPositionth code point of text.

    A string will always contain at least 2 word boundaries: one before the first code point and one after the last code point of the string.
  2. If the first code point of text following left bound is not at position startPosition return false.

  3. Let endPosition be (startPosition + count − 1).

  4. Using locale endLocale, let right bound be the first word boundary in text after the endPositionth code point.

  5. If the first code point of text preceding right bound is not at position endPosition return false.

  6. Return true.

The substring "mountain range" is word bounded within the string "An impressive mountain range" but not within "An impressive mountain ranger".
In the Japanese string "ウィキペディアへようこそ" (Welcome to Wikipedia), "ようこそ" (Welcome) is considered word-bounded but "ようこ" is not.
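
The check can be sketched for languages that use a word separator. This is only an illustration under a strong assumption: the hypothetical `word_boundaries` helper places a boundary wherever two adjacent characters are not both alphanumeric, which is a crude stand-in for [UAX29] and ignores the locales. It does not perform the dictionary-based segmentation needed for the Japanese example above.

```python
from typing import Set

def word_boundaries(text: str) -> Set[int]:
    """Approximate boundary positions; real UAs should follow UAX #29 and the locale."""
    bounds = {0, len(text)}            # a string always starts and ends on a boundary
    for i in range(1, len(text)):
        if not (text[i - 1].isalnum() and text[i].isalnum()):
            bounds.add(i)
    return bounds

def is_word_bounded(text: str, start_position: int, count: int) -> bool:
    """True if the substring neither begins nor ends in the middle of a word."""
    bounds = word_boundaries(text)
    return start_position in bounds and (start_position + count) in bounds

query = "mountain range"
for text in ("An impressive mountain range", "An impressive mountain ranger"):
    print(text, is_word_bounded(text, text.index(query), len(query)))
```

Requiring a boundary exactly at both ends of the substring corresponds to steps 2 and 5, which reject a match whose nearest boundary lies elsewhere.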

3.6. Indicating The Text Match

The UA may choose to scroll the text fragment into view as part of the try to scroll to the fragment steps or by some other mechanism; however, it is not required to scroll the match into view.

The UA should visually indicate the matched text in some way such that the user is made aware of the text match, such as with a high-contrast highlight.

The UA should provide to the user some method of dismissing the match, such that the matched text no longer appears visually indicated.

The exact appearance and mechanics of the indication are left as UA-defined. However, the UA must not use the Document’s selection to indicate the text match as doing so could allow attack vectors for content exfiltration.

The UA must not visually indicate any provided context terms.

3.7. Feature Detectability

For feature detectability, we propose adding a new FragmentDirective interface that is exposed via window.location.fragmentDirective if the UA supports the feature.

[Exposed=Window]
interface FragmentDirective {
};

We amend the Location interface to include a fragmentDirective property:

partial interface Location {
    [SameObject] readonly attribute FragmentDirective fragmentDirective;
};

4. Generating Text Fragment Directives

This section is non-normative.

This section contains recommendations for UAs automatically generating URLs with a text fragment directive. These recommendations aren’t normative but are provided to help ensure that generated URLs are maximally stable and usable.

4.1. Prefer Exact Matching To Range-based

The match text can be provided either as an exact string "text=foo%20bar%20baz" or as a range "text=foo,bar".

UAs should prefer to specify the entire string where practical. This ensures that if the destination page is removed or changed, the intended destination can still be derived from the URL itself.

Suppose we wish to craft a URL to https://en.wikipedia.org/wiki/History_of_computing quoting the sentence:
The first recorded idea of using digital electronics for computing was the
1931 paper "The Use of Thyratrons for High Speed Automatic Counting of
Physical Phenomena" by C. E. Wynn-Williams.

We could create a range-based match like so:

https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded,Williams

Or we could encode the entire sentence using an exact match term:

https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded%20idea%20of%20using%20digital%20electronics%20for%20computing%20was%20the%201931%20paper%20%22The%20Use%20of%20Thyratrons%20for%20High%20Speed%20Automatic%20Counting%20of%20Physical%20Phenomena%22%20by%20C.%20E.%20Wynn-Williams

The range-based match is less stable, meaning that if the page is changed to include another instance of "The first recorded" somewhere earlier in the page, the link will now target an unintended text snippet.

The range-based match is also less useful semantically. If the page is changed to remove the sentence, the user won’t know what the intended target was. In the exact match case, the user can read, or the UA can surface, the text that was being searched for but not found.

Range-based matches can be helpful when the quoted text is excessively long and encoding the entire string would produce an unwieldy URL.

It is recommended that text snippets shorter than 300 characters always be encoded using an exact match. Above this limit, the UA should encode the string as a range-based match.

TODO: Can we determine the above limit in some less arbitrary way?
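
A generator following this recommendation might look like the sketch below. Several details are assumptions of the sketch rather than recommendations of this document: the function names, the choice of the first and last three words as the range terms, and the decision to percent-encode "-" in addition to what `urllib.parse.quote` escapes, made here as a precaution because "-" also marks context terms in the directive.

```python
from urllib.parse import quote

EXACT_MATCH_LIMIT = 300  # the character limit recommended above

def encode_term(term: str) -> str:
    """Percent-encode one term; safe='' also escapes ',' and '&'."""
    # quote() never escapes '-', so this sketch escapes it explicitly (assumption).
    return quote(term, safe="").replace("-", "%2D")

def build_text_directive(snippet: str, edge_words: int = 3) -> str:
    """Exact match for short snippets, range-based match for long ones."""
    if len(snippet) < EXACT_MATCH_LIMIT:
        return "text=" + encode_term(snippet)
    words = snippet.split()
    start = " ".join(words[:edge_words])   # hypothetical choice of range start
    end = " ".join(words[-edge_words:])    # hypothetical choice of range end
    return f"text={encode_term(start)},{encode_term(end)}"

print(build_text_directive("foo bar baz"))
print(build_text_directive("word " * 100))
```

A real generator would also need to confirm that the chosen terms identify the intended passage unambiguously on the page, which this sketch does not attempt.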

4.2. Use Context Only When Necessary

Context terms allow the text fragment directive to disambiguate text snippets on a page. However, their use can make the URL more brittle in some cases. Often, the desired string will start or end at an element boundary. The context will therefore exist in an adjacent element. Changes to the page structure could invalidate the text fragment directive since the context and match text may no longer appear to be adjacent.

Suppose we wish to craft a URL for the following text:
<div class="section">HEADER</div>
<div class="content">Text to quote</div>

We could craft the text fragment directive as follows:

text=HEADER-,Text%20to%20quote

However, suppose the page changes to add an "[edit]" link beside all section headers. This would now break the URL.

Where a text snippet is long enough and unique, a UA should prefer to avoid adding superfluous context terms.

It is recommended that context should be used only if one of the following is true:

TODO: Determine the numeric limit above in less arbitrary way.

4.3. Determine If Fragment Id Is Needed

When the UA navigates to a URL containing a text fragment directive, it will fall back to scrolling into view a regular element-id based fragment if it exists and the text fragment isn’t found.

This can be useful to provide a fallback, in case the text in the document changes, invalidating the text fragment directive.

Suppose we wish to craft a URL to https://en.wikipedia.org/wiki/History_of_computing quoting the sentence:
The earliest known tool for use in computation is the Sumerian abacus

By specifying the section that the text appears in, we ensure that, if the text is changed or removed, the user will still be pointed to the relevant section:

https://en.wikipedia.org/wiki/History_of_computing#Early_computation:~:text=The%20earliest%20known%20tool%20for%20use%20in%20computation%20is%20the%20Sumerian%20abacus

However, UAs should take care that the fallback element-id fragment is the correct one:

Suppose the user navigates to https://en.wikipedia.org/wiki/History_of_computing#Early_computation. They now scroll down to the Symbolic Computations section. There, they select a text snippet and choose to create a URL to it:
By the late 1960s, computer systems could perform symbolic algebraic
manipulations

The UA should note that, even though the current URL of the page is: https://en.wikipedia.org/wiki/History_of_computing#Early_computation, using #Early_computation as a fallback is inappropriate. If the above sentence is changed or removed, the page will load in the #Early_computation section which could be quite confusing to the user.

If the UA cannot reliably determine an appropriate fragment to fall back to, it should remove the fragment id from the URL:

https://en.wikipedia.org/wiki/History_of_computing#:~:text=By%20the%20late%201960s%2C%20computer%20systems%20could%20perform%20symbolic%20algebraic%20manipulations
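
Both URL shapes in this section, with and without a fallback element id, can be composed by one helper. This is a minimal sketch: `build_url` and its parameters are hypothetical names, and deciding whether a given element id is an appropriate fallback is left to the caller. As a conservative choice of this sketch, "-" is percent-encoded in addition to what `urllib.parse.quote` escapes.

```python
from typing import Optional
from urllib.parse import quote

def build_url(page_url: str, snippet: str, fallback_id: Optional[str] = None) -> str:
    """Append a text fragment directive, optionally after an element-id fragment."""
    # safe='' escapes ',' and '&'; '-' is escaped explicitly (assumption of this sketch).
    encoded = quote(snippet, safe="").replace("-", "%2D")
    return f"{page_url}#{fallback_id or ''}:~:text={encoded}"

page = "https://en.wikipedia.org/wiki/History_of_computing"

# The selection lies inside the section it falls back to, so the id is kept.
print(build_url(page,
                "The earliest known tool for use in computation is the Sumerian abacus",
                fallback_id="Early_computation"))

# No reliable fallback is known for this selection, so the id is dropped.
print(build_url(page, "By the late 1960s"))
```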

If a UA chooses not to scroll text fragments into view on navigation (reasons why a UA may make this choice are discussed in § 3.4 Security and Privacy), it must scroll the element-id into view, if provided, regardless of whether a text fragment was matched. Not doing so would allow detecting the text fragment match based on whether the element-id was scrolled.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CSS-CASCADE-4]
Elika Etemad; Tab Atkins Jr.. CSS Cascading and Inheritance Level 4. 28 August 2018. CR. URL: https://www.w3.org/TR/css-cascade-4/
[CSS-DISPLAY-3]
Tab Atkins Jr.; Elika Etemad. CSS Display Module Level 3. 19 May 2020. CR. URL: https://www.w3.org/TR/css-display-3/
[CSS2]
Bert Bos; et al. Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification. 7 June 2011. REC. URL: https://www.w3.org/TR/CSS2/
[CSSOM-VIEW-1]
Simon Pieters. CSSOM View Module. 17 March 2016. WD. URL: https://www.w3.org/TR/cssom-view-1/
[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[GEOMETRY-1]
Simon Pieters; Chris Harrelson. Geometry Interfaces Module Level 1. 4 December 2018. CR. URL: https://www.w3.org/TR/geometry-1/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[UAX29]
Mark Davis; Christopher Chapman. Unicode Text Segmentation. 19 February 2020. Unicode Standard Annex #29. URL: https://www.unicode.org/reports/tr29/tr29-37.html
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[UTS10]
Ken Whistler; Markus Scherer. Unicode Collation Algorithm. 7 February 2020. Unicode Technical Standard #10. URL: https://www.unicode.org/reports/tr10/tr10-43.html
[WebIDL]
Boris Zbarsky. Web IDL. 15 December 2016. ED. URL: https://heycam.github.io/webidl/

Informative References

[BCP47]
A. Phillips; M. Davis. Tags for Identifying Languages. September 2009. IETF Best Current Practice. URL: https://tools.ietf.org/html/bcp47

IDL Index

[Exposed=Window]
interface FragmentDirective {
};

partial interface Location {
    [SameObject] readonly attribute FragmentDirective fragmentDirective;
};