Scroll To Text Fragment

Draft Community Group Report

This version:
wicg.github.io/ScrollToTextFragment/draftspec.html
Issue Tracking:
GitHub
Editors:
(Google)
(Google)

Abstract

Scroll To Text adds support for specifying a text snippet in the URL fragment. When navigating to a URL with such a fragment, the browser will find the first instance of the text snippet and scroll it into view.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

This section is non-normative.

1.1. Use cases

1.1.1. Web text references

The core use case for scroll to text is to allow URLs to serve as an exact text reference across the web. For example, Wikipedia references could link to the exact text they are quoting from a page. Similarly, search engines can serve URLs that direct the user to the answer they are looking for in the page rather than linking to the top of the page.

1.1.2. User sharing

With scroll to text, browsers may implement an option to 'Copy URL to here' when the user opens the context menu on a text selection. The browser can then generate a URL with the text selection appropriately specified, and the recipient of the URL will have the text scrolled into view and visually indicated. Without scroll to text, if a user wants to share a passage of text from a page, they would likely just copy and paste the passage, in which case the receiver loses the context of the page.

2. Description

2.1. Syntax

This section is non-normative.

A text fragment is specified in the fragment directive (see § 2.2 The Fragment Directive) with the following format:

#:~:text=[prefix-,]textStart[,textEnd][,-suffix]
          context  |-------match-----|  context

(Square brackets indicate an optional parameter)

The text parameters are percent-decoded before matching. Dash (-), ampersand (&), and comma (,) characters in text parameters must be percent-encoded to avoid being interpreted as part of the text directive syntax.

The only required parameter is textStart. If only textStart is specified, the first instance of this exact text string is the target text.

#:~:text=an%20example%20text%20fragment indicates that the exact text "an example text fragment" is the target text.

If the textEnd parameter is also specified, then the text directive refers to a range of text in the page. The target text range starts at the first instance of textStart and extends to the end of the first instance of textEnd that appears after textStart. This is equivalent to specifying the entire text range in the textStart parameter, but avoids bloating the URL with a long text directive.

#:~:text=an%20example,text%20fragment indicates that the first instance of "an example" until the following first instance of "text fragment" is the target text.

2.1.1. Context Terms

This section is non-normative.

The other two optional parameters are context terms. They are marked by a dash (-) character that follows the prefix and precedes the suffix. The dash distinguishes them from the textStart and textEnd parameters, which is necessary because any combination of optional parameters may be specified.

Context terms are used to disambiguate the target text fragment. The context terms can specify the text immediately before (prefix) and immediately after (suffix) the text fragment, allowing for whitespace.

While the context terms must be the immediate text surrounding the target text fragment, any amount of whitespace is allowed between context terms and the text fragment. This allows context terms to span element boundaries, for example when the target text fragment is at the beginning of a paragraph and must be disambiguated by using the previous element’s text as a prefix.

The context terms are not part of the target text fragment and must not be visually indicated or affect the scroll position.

#:~:text=this%20is-,an%20example,-text%20fragment would match to "an example" in "this is an example text fragment", but not match to "an example" in "here is an example text".
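This example is non-normative. The syntax above can be sketched as a small builder that percent-encodes each parameter and joins them in the documented order. The names TextDirectiveParts, encodeTextParam, and buildTextDirective are hypothetical and do not appear in this specification. The sketch relies on encodeURIComponent, which escapes comma and ampersand but not dash, so the dash is escaped in a second pass.

```typescript
// Hypothetical shape for the four text directive parameters.
interface TextDirectiveParts {
  textStart: string;
  textEnd?: string;
  prefix?: string;
  suffix?: string;
}

// Percent-encode one parameter. encodeURIComponent already escapes "," and
// "&" but leaves "-" untouched, so dashes are escaped separately.
function encodeTextParam(value: string): string {
  return encodeURIComponent(value).replace(/-/g, "%2D");
}

// Assemble "text=[prefix-,]textStart[,textEnd][,-suffix]".
function buildTextDirective(parts: TextDirectiveParts): string {
  const tokens: string[] = [];
  if (parts.prefix) tokens.push(encodeTextParam(parts.prefix) + "-");
  tokens.push(encodeTextParam(parts.textStart));
  if (parts.textEnd) tokens.push(encodeTextParam(parts.textEnd));
  if (parts.suffix) tokens.push("-" + encodeTextParam(parts.suffix));
  return "text=" + tokens.join(",");
}

// Reproduces the context-term example from this section.
const contextDirective = buildTextDirective({
  prefix: "this is",
  textStart: "an example",
  suffix: "text fragment",
});
```

Appending ":~:" and contextDirective to a URL fragment yields the directive shown in the example above.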

2.2. The Fragment Directive

To avoid compatibility issues with usage of existing URL fragments, this spec introduces the fragment directive. The fragment directive is a portion of the URL fragment delimited by the code sequence :~:. It is reserved for UA instructions, such as text=, and is stripped from the URL during loading so that author scripts can’t directly interact with it.

The fragment-directive is a mechanism for URLs to specify instructions meant for the UA rather than the document. It is kept away from author script so that future UA instructions can be added without fear of introducing breaking changes to existing content. Potential examples include translation hints and enabling accessibility features.

2.2.1. Parsing the fragment directive

To the definition of a URL record, add:

A URL’s fragment-directive is either null or an ASCII string holding data used by the UA to process the resource. It is initially null.

Let the fragment-directive delimiter be the string ":~:", that is the three consecutive code points U+003A (:), U+007E (~), U+003A (:).

The fragment-directive is part of the URL fragment. This means it must always appear after a U+0023 (#) code point in a URL.
To add a fragment-directive to a URL like https://example.com, a fragment must first be appended to the URL: https://example.com#:~:text=foo.

Amend the basic URL parser steps to parse fragment directives in a URL:

These changes make a URL’s fragment end at the fragment-directive delimiter. The fragment-directive consists of all characters that follow the delimiter, not including the delimiter itself.
https://example.org/#test:~:text=foo will be parsed such that the fragment is the string "test" and the fragment-directive is the string "text=foo".
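This example is non-normative. The split described above can be sketched as a function over the raw fragment string, that is, everything after the first U+0023 (#). The names SplitFragment and splitFragment are hypothetical. The sketch returns an empty string for the fragment when the delimiter appears first, which is an assumption: the text above does not say whether the fragment is empty or null in that case.

```typescript
const FRAGMENT_DIRECTIVE_DELIMITER = ":~:";

// Hypothetical result type: the fragment and the optional fragment-directive.
interface SplitFragment {
  fragment: string;
  fragmentDirective: string | null;
}

// rawFragment is everything after the first "#" in the URL.
function splitFragment(rawFragment: string): SplitFragment {
  const index = rawFragment.indexOf(FRAGMENT_DIRECTIVE_DELIMITER);
  if (index === -1) {
    // No delimiter: the whole string is an ordinary fragment.
    return { fragment: rawFragment, fragmentDirective: null };
  }
  return {
    // The fragment ends at the delimiter.
    fragment: rawFragment.slice(0, index),
    // The directive is everything after the delimiter.
    fragmentDirective: rawFragment.slice(index + FRAGMENT_DIRECTIVE_DELIMITER.length),
  };
}

const splitExample = splitFragment("test:~:text=foo");
```

Here splitExample.fragment is "test" and splitExample.fragmentDirective is "text=foo", matching the example above.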

2.2.2. Serializing the fragment directive

Amend the URL serializer steps by inserting a step after step 7:

  1. If the exclude fragment flag is unset and url’s fragment-directive is non-null:

    1. If url’s fragment is null, append U+0023 (#) to output.

    2. Append ":~:", followed by url’s fragment-directive, to output.
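This example is non-normative. A minimal sketch of the amended serializer tail, assuming that the preceding serializer step appends U+0023 (#) followed by the fragment when the fragment is non-null. The function name serializeFragmentPart and its parameters are hypothetical. Only the portion of the output from the fragment onward is produced.

```typescript
// Serializes the fragment and fragment-directive portion of a URL.
function serializeFragmentPart(
  fragment: string | null,
  fragmentDirective: string | null,
  excludeFragment: boolean = false,
): string {
  let output = "";
  if (excludeFragment) return output;

  // Assumed existing step: append "#" and the fragment when it is non-null.
  if (fragment !== null) output += "#" + fragment;

  // Inserted step: append the delimiter and the directive.
  if (fragmentDirective !== null) {
    // A "#" is still required when there is no ordinary fragment.
    if (fragment === null) output += "#";
    output += ":~:" + fragmentDirective;
  }
  return output;
}
```

With a null fragment and the directive "text=foo", the sketch produces "#:~:text=foo"; with the fragment "test" it produces "#test:~:text=foo".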

2.2.3. Processing the fragment directive

To the definition of Document, add:

Each document has an associated fragment directive.

Amend the create and initialize a Document object steps to store the fragment directive on the Document and remove it from the Document’s URL.

Replace steps 7 and 8 of this algorithm with:

  1. Let url be null.

  2. If request is non-null, then set url to request’s current URL.

  3. Otherwise, set url to response’s URL.

  4. Set document’s fragment-directive to url’s fragment-directive. (Note: this is stored on the document but not web-exposed.)

  5. Set url’s fragment-directive to null.

  6. Set the document’s url to be url.
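This example is non-normative. The replacement steps above can be sketched with simplified stand-ins for a URL record and a Document. The types UrlRecord and DocumentState and the function initializeDocumentUrl are hypothetical. The sketch copies the URL rather than mutating it in place, which is a choice of the sketch and not something the steps require.

```typescript
// Hypothetical, simplified stand-in for a URL record.
interface UrlRecord {
  href: string;
  fragmentDirective: string | null;
}

// Hypothetical, simplified stand-in for the relevant Document state.
interface DocumentState {
  url: UrlRecord;
  // Stored on the document but never exposed to author script.
  fragmentDirective: string | null;
}

function initializeDocumentUrl(
  requestUrl: UrlRecord | null,
  responseUrl: UrlRecord,
): DocumentState {
  // Steps 1 to 3: prefer the request's current URL, else the response's URL.
  const source = requestUrl !== null ? requestUrl : responseUrl;
  // Step 4: remember the directive on the document.
  const fragmentDirective = source.fragmentDirective;
  // Steps 5 and 6: the document's URL carries no directive.
  const url: UrlRecord = { href: source.href, fragmentDirective: null };
  return { url, fragmentDirective };
}
```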

2.3. Security and Privacy

2.3.1. Motivation

This section is non-normative.

Care must be taken when implementing the text fragment directive so that it cannot be used to exfiltrate information across origins. Scripts can navigate a page to a cross-origin URL with a text fragment directive. If a malicious actor can determine that a victim page scrolled after such a navigation, they can infer the existence of any text on the page.

In addition, the user’s privacy should be ensured even from the destination origin. Although scripts on that page can already learn a lot about a user’s actions, a text fragment directive can still contain sensitive information. For this reason, this specification provides no way for a page to extract the content of the text fragment anchor. User agents must not expose this information to the page.

A user visiting a page listing dozens of medical conditions may have gotten there via a link with a text fragment directive containing a specific condition. This information must not be shared with the page.

2.3.2. Should Allow Text Fragment

This algorithm takes as input window and is user triggered, and returns a boolean indicating whether a text fragment directive should be allowed to invoke.
  1. If any of the following conditions are true, return false.

    • window’s parent field is non-null.

    • window’s opener field is non-null.

    • The document of the previous entry in window’s browsing context’s session history is equal to window’s document.

      That is, this is the result of a same document navigation
    • is user triggered is false.

  2. Otherwise, return true.
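This example is non-normative. The checks above can be sketched as a predicate over a simplified window model. The WindowLike interface and its isSameDocumentNavigation flag are hypothetical: the flag stands in for the session history comparison in the third condition, and the parent and opener fields are modeled as null when absent, as the conditions above describe them.

```typescript
// Hypothetical model of the state the algorithm inspects.
interface WindowLike {
  parent: object | null;
  opener: object | null;
  // True when the previous session history entry has the same document.
  isSameDocumentNavigation: boolean;
}

function shouldAllowTextFragment(win: WindowLike, isUserTriggered: boolean): boolean {
  if (win.parent !== null) return false; // nested browsing context
  if (win.opener !== null) return false; // opened by another window
  if (win.isSameDocumentNavigation) return false;
  if (!isUserTriggered) return false;
  return true;
}
```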

The scroll to text specification proposes an amendment to HTML 5 §7.8.9 Navigating to a fragment. In summary, if a text fragment directive is present and a match is found in the page, the text fragment takes precedence over the element fragment as the indicated part of the document.

Add the following steps to the beginning of the processing model for The indicated part of the document.

  1. Let fragment directive be the document URL’s fragment directive.

  2. Let is user activated be true if the current navigation was triggered by user activation

    TODO: This might need an additional flag somewhere to track the user activation triggering
  3. If the result of § 2.3.2 Should Allow Text Fragment with the window of the document’s browsing context and is user activated is true then:

    1. If § 2.4.1 Find a target text with fragment directive returns non-null, then the return value is the indicated part of the document; return.

2.4.1. Find a target text

To find the target text for a given string fragment directive, the user agent must run these steps:

  1. If fragment directive does not begin with the string "text=", then return null.

  2. Let raw target text be the substring of fragment directive starting at index 5.

    This is the remainder of the fragment directive following, but not including, the "text=" prefix.
  3. If raw target text is the empty string, return null.

  4. Let tokens be a list of strings that is the result of splitting the string raw target text on commas.

  5. Let prefix, suffix, and textEnd each be the empty string.

    prefix, suffix, and textEnd are the optional parameters of the text directive.
  6. Let potential prefix be the first item of tokens.

  7. If the last character of potential prefix is U+002D (-), then:

    1. Set prefix to the result of removing any U+002D (-) from potential prefix.

    2. Remove the first item of the list tokens.

  8. Let potential suffix be the last item of tokens.

  9. If the first character of potential suffix is U+002D (-), then:

    1. Set suffix to the result of removing any U+002D (-) from potential suffix.

    2. Remove the last item of the list tokens.

  10. Assert: tokens has size 1 or tokens has size 2.

    Once the prefix and suffix are removed from tokens, tokens may either contain one item (textStart) or two items (textStart and textEnd).
  11. Let textStart be the first item of tokens.

  12. If tokens has size 2, then let textEnd be the last item of tokens.

    The strings prefix, textStart, textEnd, and suffix now contain the text directive parameters as defined in § 2.1 Syntax.
  13. Let walker be a TreeWalker equal to Document.createTreeWalker().

  14. Let position be a position variable that indicates a text offset in walker.currentNode.innerText.

  15. If textEnd is the empty string, then:

    1. Let match position be the result of § 2.4.2 Find an exact match with context with input walker walker, search position position, prefix prefix, query textStart, and suffix suffix.

    2. If match position is null, then return null.

    3. Let match be a Range in walker.currentNode with position match position and length equal to the length of textStart.

    4. Return match.

  16. Otherwise, let potential start position be the result of § 2.4.2 Find an exact match with context with input walker walker, search position position, prefix prefix, query textStart, and suffix the empty string.

  17. If potential start position is null, then return null.

  18. Let end position be the result of § 2.4.2 Find an exact match with context with input walker walker, search position potential start position, prefix the empty string, query textEnd, and suffix suffix.

  19. If end position is null, then return null.

  20. Advance end position by the length of textEnd.

  21. Let match be a Range in walker.currentNode with start position potential start position and length equal to end position minus potential start position.

  22. Return match.
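This example is non-normative. Steps 1 through 12 above, which split the directive into its parameters, can be sketched as follows. The names ParsedTextDirective and parseTextDirective are hypothetical. Two choices are assumptions of the sketch: it returns null where step 10 asserts, and it percent-decodes each parameter after splitting, since § 2.1 Syntax says parameters are percent-decoded before matching but the steps do not say when.

```typescript
interface ParsedTextDirective {
  prefix: string;
  textStart: string;
  textEnd: string;
  suffix: string;
}

// Note: decodeURIComponent throws a URIError on malformed percent-encoding.
function parseTextDirective(fragmentDirective: string): ParsedTextDirective | null {
  const marker = "text=";
  if (fragmentDirective.indexOf(marker) !== 0) return null; // step 1
  const rawTargetText = fragmentDirective.slice(marker.length); // step 2
  if (rawTargetText === "") return null; // step 3
  const tokens = rawTargetText.split(","); // step 4

  let prefix = ""; // step 5
  let suffix = "";
  let textEnd = "";

  const potentialPrefix = tokens[0] ?? ""; // step 6
  if (potentialPrefix.charAt(potentialPrefix.length - 1) === "-") { // step 7
    prefix = potentialPrefix.replace(/-/g, "");
    tokens.shift();
  }

  const potentialSuffix = tokens[tokens.length - 1] ?? ""; // step 8
  if (potentialSuffix.charAt(0) === "-") { // step 9
    suffix = potentialSuffix.replace(/-/g, "");
    tokens.pop();
  }

  // Step 10 is an assertion; this sketch rejects the input instead.
  if (tokens.length !== 1 && tokens.length !== 2) return null;

  const textStart = tokens[0] ?? ""; // step 11
  if (tokens.length === 2) textEnd = tokens[1] ?? ""; // step 12

  return {
    prefix: decodeURIComponent(prefix),
    textStart: decodeURIComponent(textStart),
    textEnd: decodeURIComponent(textEnd),
    suffix: decodeURIComponent(suffix),
  };
}
```

Because literal dashes are removed from context terms before decoding, a dash that is part of the context text survives only if it was percent-encoded as %2D.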

2.4.2. Find an exact match with context

This algorithm has input walker, search position, prefix, query, and suffix and returns a text position that is the start of the match.
The input walker is a TreeWalker reference, not a copy, i.e. any modifications are performed on the caller’s instance of walker.
  1. While walker.currentNode is not null:

    1. Assert: walker.currentNode is a text node.

    2. Let text be equal to walker.currentNode.innerText.

    3. While search position does not point past the end of text:

      1. If prefix is not the empty string, then:

        1. Advance search position to the position after the result of § 2.4.4 Find the next word bounded instance of prefix in text from search position with current locale.

        2. If search position is null, then break.

        3. Advance search position past any whitespace.

        4. If search position is at the end of text, then:

          1. Perform § 2.4.3 Advance a TreeWalker to the next text node on walker.

          2. If walker.currentNode is null, then return null.

          3. Set text to walker.currentNode.innerText.

          4. Set search position to the beginning of text.

          5. Advance search position past any whitespace.

        5. If the result of § 2.4.4 Find the next word bounded instance of query in text from search position with current locale does not start at search position, then continue.

      2. Advance search position to the position after the result of § 2.4.4 Find the next word bounded instance of query in text from search position with current locale.

        If a prefix was specified, the search position is at the beginning of query and this will advance it to the end of the query to search for a potential suffix. Otherwise, this will find the next instance of query.
      3. If search position is null, then break.

      4. Let potential match position be a position variable equal to search position minus the length of query.

      5. If suffix is the empty string, then return potential match position.

      6. Advance search position past any whitespace.

      7. If search position is at the end of text, then:

        1. Let suffix_walker be a TreeWalker that is a copy of walker.

        2. Perform § 2.4.3 Advance a TreeWalker to the next text node on suffix_walker.

        3. If suffix_walker.currentNode is null, then return null.

        4. Set text to suffix_walker.currentNode.innerText.

        5. Set search position to the beginning of text.

        6. Advance search position past any whitespace.

      8. If the result of § 2.4.4 Find the next word bounded instance of suffix in text from search position with current locale starts at search position, then return potential match position.

    4. Perform § 2.4.3 Advance a TreeWalker to the next text node on walker.

  2. Return null.

The current locale is the language of the currentNode.
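This example is non-normative. A simplified sketch of the matching logic above, assuming the whole document text is available as one flat string so that no TreeWalker or node crossing is needed. All function names are hypothetical. Word boundaries are approximated with the ASCII \w class rather than locale-aware segmentation, and the sketch retries from one position past a failed candidate, which differs from how the steps above advance the search position.

```typescript
// Assumption: ASCII-only notion of a word character.
function isWordChar(ch: string): boolean {
  return /^\w$/.test(ch);
}

// A boundary exists at i unless word characters sit on both sides of it.
function isBoundary(text: string, i: number): boolean {
  if (i <= 0 || i >= text.length) return true;
  return !(isWordChar(text.charAt(i - 1)) && isWordChar(text.charAt(i)));
}

// Index of the next word-bounded instance of query at or after from.
function wordBoundedIndexOf(query: string, text: string, from: number): number | null {
  if (query === "") return null;
  let index = text.indexOf(query, from);
  while (index !== -1) {
    if (isBoundary(text, index) && isBoundary(text, index + query.length)) return index;
    index = text.indexOf(query, index + 1);
  }
  return null;
}

function skipWhitespace(text: string, position: number): number {
  let i = position;
  while (i < text.length && /\s/.test(text.charAt(i))) i++;
  return i;
}

// Returns the start offset of query, honoring optional prefix and suffix.
function findExactMatchWithContext(
  text: string,
  searchPosition: number,
  prefix: string,
  query: string,
  suffix: string,
): number | null {
  let pos = searchPosition;
  while (pos <= text.length) {
    let matchStart: number;
    let resume: number;
    if (prefix !== "") {
      const prefixStart = wordBoundedIndexOf(prefix, text, pos);
      if (prefixStart === null) return null;
      resume = prefixStart + 1;
      // The query must start right after the prefix, ignoring whitespace.
      const afterPrefix = skipWhitespace(text, prefixStart + prefix.length);
      if (wordBoundedIndexOf(query, text, afterPrefix) !== afterPrefix) {
        pos = resume;
        continue;
      }
      matchStart = afterPrefix;
    } else {
      const queryStart = wordBoundedIndexOf(query, text, pos);
      if (queryStart === null) return null;
      matchStart = queryStart;
      resume = queryStart + 1;
    }
    if (suffix === "") return matchStart;
    // The suffix must start right after the query, ignoring whitespace.
    const afterQuery = skipWhitespace(text, matchStart + query.length);
    if (wordBoundedIndexOf(suffix, text, afterQuery) === afterQuery) return matchStart;
    pos = resume;
  }
  return null;
}
```

For the text "this is an example text fragment" with prefix "this is", query "an example", and suffix "text fragment", the sketch returns the offset of "an example"; for "here is an example text" it returns null, in line with the context-term example in § 2.1.1.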

2.4.3. Advance a TreeWalker to the next text node

The input walker is a TreeWalker reference, not a copy, i.e. any modifications are performed on the caller’s instance of walker.
  1. While the input walker.currentNode is not null and walker.currentNode is not a text node:

    1. Advance the current node by calling walker.nextNode()

2.4.4. Find the next word bounded instance

This algorithm has input query, text, start position, and locale and returns a Range that specifies the word bounded text instance if it is found.
See Intl.Segmenter, a proposal to specify Unicode segmentation, including word segmentation. Once specified, this algorithm may be improved by making use of the Intl.Segmenter API for word boundary matching.
  1. While start position does not point past the end of text:

    1. Advance start position to the next instance of query in text.

    2. Let range be a Range with position start position and length equal to the length of query.

    3. Using locale locale, let left bound be the last word boundary in text before range.

    4. Using locale locale, let right bound be the first word boundary in text after range.

      Limiting matching to word boundaries is one of the mitigations to limit cross-origin information leakage. A word boundary is as defined in the Unicode text segmentation annex. The Default Word Boundary Specification defines a default set of what constitutes a word boundary, but as the specification mentions, a more sophisticated algorithm should be used based on the locale.

      Dictionary-based word bounding should take specific care in locales without a word-separating character (e.g. space). In those cases, and where the alphabet contains fewer than 100 characters, the dictionary must not contain more than 20% of the alphabet as valid, one-letter words.

    5. If left bound immediately precedes range and right bound immediately follows range, then return range.

  2. Return null.
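This example is non-normative. The loop above can be sketched over a plain string, returning the start offset of the match instead of a Range. The function names are hypothetical, and the locale input is omitted: word boundaries are approximated with the ASCII \w class, which is an assumption of the sketch and much weaker than the locale-aware segmentation the algorithm calls for.

```typescript
// Assumption: ASCII-only notion of a word character.
function isAsciiWordChar(ch: string): boolean {
  return /^\w$/.test(ch);
}

// A boundary exists at i unless word characters sit on both sides of it.
function hasWordBoundaryAt(text: string, i: number): boolean {
  if (i <= 0 || i >= text.length) return true;
  return !(isAsciiWordChar(text.charAt(i - 1)) && isAsciiWordChar(text.charAt(i)));
}

// Returns the offset of the next instance of query, at or after start,
// that begins and ends on a word boundary, or null if there is none.
function findNextWordBoundedInstance(
  query: string,
  text: string,
  start: number,
): number | null {
  if (query === "") return null;
  let index = text.indexOf(query, start);
  while (index !== -1) {
    const end = index + query.length;
    if (hasWordBoundaryAt(text, index) && hasWordBoundaryAt(text, end)) {
      return index;
    }
    // Not word bounded: keep looking from the next character.
    index = text.indexOf(query, index + 1);
  }
  return null;
}
```

With this sketch, searching "an example text" for "example" finds offset 3, while searching for "exam" finds nothing because the candidate ends in the middle of a word.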

2.5. Indicating The Text Match

In addition to scrolling the text fragment into view as part of the Try To Scroll To The Fragment steps, the UA should visually indicate the matched text in some way such that the user is made aware of the text match.

The UA should provide to the user some method of dismissing the match, such that the matched text no longer appears visually indicated.

The exact appearance and mechanics of the indication are left as UA-defined. However, the UA must not use the Document’s selection to indicate the text match as doing so could allow attack vectors for content exfiltration.

The UA must not visually indicate any provided context terms.

2.6. Feature Detectability

For feature detectability, we propose adding a new FragmentDirective interface that is exposed via window.location.fragmentDirective if the UA supports the feature.

interface FragmentDirective {
};

We amend The Location Interface to include a fragmentDirective property:

partial interface Location {
    readonly attribute FragmentDirective fragmentDirective;
};
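This example is non-normative. Since the interface has no members, detection amounts to checking whether the property exists. A minimal sketch, assuming the proposed fragmentDirective property on Location; the function name supportsFragmentDirective is hypothetical, and it accepts any object so that it can be exercised outside a browser.

```typescript
// Returns true when the given Location-like object exposes the proposed
// fragmentDirective property.
function supportsFragmentDirective(loc: object): boolean {
  return "fragmentDirective" in loc;
}

// In a page, the check would be: supportsFragmentDirective(window.location)
```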

3. Generating Text Fragment Directives

This section is non-normative.

This section contains recommendations for UAs automatically generating URLs with text fragment directives. These recommendations aren’t normative but are provided to help ensure that generated URLs are as stable and usable as possible.

3.1. Prefer Exact Matching To Range-based

The match text can be provided either as an exact string "text=foo%20bar%20baz" or as a range "text=foo,bar".

UAs should prefer to specify the entire string where practical. This ensures that if the destination page is removed or changed, the intended destination can still be derived from the URL itself.

Suppose we wish to craft a URL to https://en.wikipedia.org/wiki/History_of_computing quoting the sentence:
The first recorded idea of using digital electronics for computing was the
1931 paper "The Use of Thyratrons for High Speed Automatic Counting of
Physical Phenomena" by C. E. Wynn-Williams.

We could create a range-based match like so:

https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded,Williams

Or we could encode the entire sentence using an exact match term:

https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded%20idea%20of%20using%20digital%20electronics%20for%20computing%20was%20the%201931%20paper%20%22The%20Use%20of%20Thyratrons%20for%20High%20Speed%20Automatic%20Counting%20of%20Physical%20Phenomena%22%20by%20C.%20E.%20Wynn%2DWilliams

The range-based match is less stable, meaning that if the page is changed to include another instance of "The first recorded" somewhere earlier in the page, the link will now target an unintended text snippet.

The range-based match is also less useful semantically. If the page is changed to remove the sentence, the user won’t know what the intended target was. In the exact match case, the user can read, or the UA can surface, the text that was being searched for but not found.

Range-based matches can be helpful when the quoted text is excessively long and encoding the entire string would produce an unwieldy URL.

It is recommended that text snippets shorter than 300 characters always be encoded using an exact match. Above this limit, the UA should encode the string as a range-based match.
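This example is non-normative. The recommendation above can be sketched as a generator that picks between the two forms based on the 300 character limit. The names EXACT_MATCH_LIMIT, RANGE_WORD_COUNT, and directiveForSelection are hypothetical, and taking three words from each end of the selection for the range form is an arbitrary assumption, not something this specification recommends.

```typescript
const EXACT_MATCH_LIMIT = 300; // from the recommendation above
const RANGE_WORD_COUNT = 3; // assumption: words taken from each end

// Percent-encode one parameter, including the dash that
// encodeURIComponent leaves untouched.
function encodeDirectiveParam(value: string): string {
  return encodeURIComponent(value).replace(/-/g, "%2D");
}

function directiveForSelection(selection: string): string {
  const text = selection.trim();
  if (text.length < EXACT_MATCH_LIMIT) {
    // Short enough: encode the whole string as an exact match.
    return "text=" + encodeDirectiveParam(text);
  }
  // Too long: encode only the first and last few words as a range.
  const words = text.split(/\s+/);
  const textStart = words.slice(0, RANGE_WORD_COUNT).join(" ");
  const textEnd = words.slice(-RANGE_WORD_COUNT).join(" ");
  return "text=" + encodeDirectiveParam(textStart) + "," + encodeDirectiveParam(textEnd);
}
```

A real generator would also need to confirm that the chosen textStart and textEnd identify the intended range on the page; the sketch does not attempt this.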

TODO: Can we determine the above limit in some more objective way?

3.2. Use Context Only When Necessary

Context terms allow the text fragment directive to disambiguate text snippets on a page. However, their use can make the URL more brittle in some cases. Often, the desired string will start or end at an element boundary. The context will therefore exist in an adjacent element. Changes to the page structure could invalidate the text fragment directive since the context and match text may no longer appear to be adjacent.

Suppose we wish to craft a URL for the following text:
<div class="section">HEADER</div>
<div class="content">Text to quote</div>

We could craft the text fragment directive as follows:

text=HEADER-,Text%20to%20quote

However, suppose the page changes to add an "[edit]" link beside all section headers. This would now break the URL, because "HEADER" would no longer be the text immediately preceding the target text.

Where a text snippet is long enough and unique, a UA should prefer to avoid adding superfluous context terms.

It is recommended that context should be used only if one of the following is true:

TODO: Determine the numeric limit above in a more objective way

3.3. Determine If Fragment Id Is Needed

When the UA navigates to a URL containing a text fragment directive, it will fall back to scrolling a regular element-id based fragment into view if one exists and the text fragment isn’t found.

This can be useful to provide a fallback, in case the text in the document changes, invalidating the text fragment directive.

Suppose we wish to craft a URL to https://en.wikipedia.org/wiki/History_of_computing quoting the sentence:
The earliest known tool for use in computation is the Sumerian abacus

By specifying the section that the text appears in, we ensure that, if the text is changed or removed, the user will still be pointed to the relevant section:

https://en.wikipedia.org/wiki/History_of_computing#Early_computation:~:text=The%20earliest%20known%20tool%20for%20use%20in%20computation%20is%20the%20Sumerian%20abacus

However, UAs should take care that the fallback element-id fragment is the correct one:

Suppose the user navigates to https://en.wikipedia.org/wiki/History_of_computing#Early_computation. They now scroll down to the Symbolic Computations section. There, they select a text snippet and choose to create a URL to it:
By the late 1960s, computer systems could perform symbolic algebraic
manipulations

The UA should note that, even though the current URL of the page is: https://en.wikipedia.org/wiki/History_of_computing#Early_computation, using #Early_computation as a fallback is inappropriate. If the above sentence is changed or removed, the page will load in the #Early_computation section which could be quite confusing to the user.

If the UA cannot reliably determine an appropriate fragment to fall back to, it should remove the fragment id from the URL:

https://en.wikipedia.org/wiki/History_of_computing#:~:text=By%20the%20late%201960s%2C%20computer%20systems%20could%20perform%20symbolic%20algebraic%20manipulations
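This example is non-normative. Assembling the final URL with or without a fallback can be sketched as follows. The function buildTextFragmentUrl and its parameters are hypothetical. The sketch discards whatever fragment the current page URL carries and uses a fallback id only when the caller supplies one, leaving the decision about whether an id is appropriate to the caller.

```typescript
// pageUrl may already carry a fragment; it is dropped, not reused.
// fallbackId is the element id to fall back to, or null for none.
function buildTextFragmentUrl(
  pageUrl: string,
  textDirective: string,
  fallbackId: string | null,
): string {
  const hashIndex = pageUrl.indexOf("#");
  const base = hashIndex === -1 ? pageUrl : pageUrl.slice(0, hashIndex);
  const fragment = fallbackId !== null ? fallbackId : "";
  return base + "#" + fragment + ":~:" + textDirective;
}
```

Passing null for fallbackId when the current URL ends in #Early_computation yields a URL of the form shown above, with the stale fragment id removed.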

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119

IDL Index

interface FragmentDirective {
};

partial interface Location {
    readonly attribute FragmentDirective fragmentDirective;
};