User-Agent Client Hints

Draft Community Group Report

This version:
https://wicg.github.io/ua-client-hints/
Editors:
(Google Inc.)
(Google Inc.)
Participate:
File an issue (open issues)

Abstract

This document defines a set of Client Hints that aim to provide developers with the ability to perform agent-based content negotiation when necessary, while avoiding the historical baggage and passive fingerprinting surface exposed by the venerable `User-Agent` header.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

Today, user agents generally identify themselves to servers by sending a User-Agent HTTP request header field along with each request (defined in Section 5.5.3 of [RFC7231]). Ideally, this header would give servers the ability to perform content negotiation, sending down exactly those bits that best represent the requested resource in a given user agent, optimizing both bandwidth and user experience. In practice, however, this header’s value exposes far more information about the user’s device than seems appropriate as a default, on the one hand, and intentionally obscures the true user agent in order to bypass misguided server-side heuristics, on the other.

For example, a recent version of Chrome on iOS identifies itself as:

  User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 12_0 like Mac OS X)
              AppleWebKit/605.1.15 (KHTML, like Gecko)
              CriOS/69.0.3497.105 Mobile/15E148 Safari/605.1

While a recent version of Edge identifies itself as:

  User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
              AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.2704.79
              Safari/537.36 Edge/18.014

There’s quite a bit of information packed into those strings (along with a fair number of lies). Version numbers, platform details, model information, etc. are all broadcast along with every request, and form the basis for fingerprinting schemes of all sorts. Individual vendors have taken stabs at altering their user agent strings, and have run into a few categories of feedback from developers that have stymied historical approaches:

  1. Brand and version information (e.g. "Chrome 69") allows websites to work around known bugs in specific releases that aren’t otherwise detectable. For example, implementations of Content Security Policy have varied wildly between vendors, and it’s difficult to know what policy to send in an HTTP response without knowing what browser is responsible for its parsing and execution.

  2. Developers will often negotiate what content to send based on the user agent and platform. Some application frameworks, for instance, will style an application on iOS differently from the same application on Android in order to match each platform’s aesthetic and design patterns.

  3. Similarly to #1, OS revisions and architecture can be responsible for specific bugs which can be worked around in a website’s code, and are narrowly useful for things like selecting appropriate executables for download (32-bit vs. 64-bit, ARM vs. Intel, etc.).

  4. Sophisticated developers use model/make to tailor their sites to the capabilities of the device (e.g. [FacebookYearClass]) and to pinpoint performance bugs and regressions which sometimes are specific to model/make.

This document proposes a mechanism which might allow user agents to be a bit more aggressive about removing entropy from the User-Agent string generally, by giving servers that really need some specific details about the client the ability to opt in to receiving them. It introduces seven new Client Hints ([I-D.ietf-httpbis-client-hints]) that can provide the client’s branding and version information, the underlying operating system’s branding and major version, as well as details about the underlying device. Rather than broadcasting this data to everyone, all the time, user agents can make reasonable decisions about how to respond to given sites' requests for more granular data, reducing the passive fingerprinting surface area exposed to the network.

1.1. Examples

A user navigates to https://example.com/ for the first time. Their user agent sends the following header along with the HTTP request:

  Sec-CH-UA: "Examplary Browser"; v="73"

The server is interested in rendering content consistent with the user’s underlying platform, and asks for a little more information by sending an Accept-CH header (Section 2.2.1 of [I-D.ietf-httpbis-client-hints]) along with the initial response:

  Accept-CH: UA-Full-Version, UA-Platform

In response, the user agent includes more detailed version information, as well as information about the underlying platform in the next request:

  Sec-CH-UA: "Examplary Browser"; v="73"
  Sec-CH-UA-Full-Version: "73.3R8.2H.1"
  Sec-CH-UA-Platform: "Windows"
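
The exchange above can be sketched from the server's side. This is a minimal illustration, not part of the specification: the helper names are hypothetical, and it assumes that each Accept-CH token maps to a request header named "Sec-CH-" followed by that token, as the example suggests.

```python
# Hypothetical server-side helpers; the names are illustrative only.
WANTED_HINTS = ["UA-Full-Version", "UA-Platform"]


def accept_ch_value(wanted):
    """Build the Accept-CH response header value from a list of hint tokens."""
    return ", ".join(wanted)


def missing_hints(request_headers, wanted):
    """Return the wanted hints that the request did not carry.

    Header names are compared case-insensitively. Each Accept-CH token is
    assumed to correspond to a request header named "Sec-CH-" + token.
    """
    present = {name.lower() for name in request_headers}
    return [hint for hint in wanted if ("sec-ch-" + hint).lower() not in present]


# The two requests from the example above.
first_request = {"Sec-CH-UA": '"Examplary Browser"; v="73"'}
second_request = {
    "Sec-CH-UA": '"Examplary Browser"; v="73"',
    "Sec-CH-UA-Full-Version": '"73.3R8.2H.1"',
    "Sec-CH-UA-Platform": '"Windows"',
}
```

A server could send accept_ch_value(WANTED_HINTS) on any response where missing_hints() is non-empty, and fall back to generic content until the hints arrive.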

2. User Agent Hints

The following sections define a number of HTTP request header fields that expose detail about a given user agent, which servers can opt in to receiving via the Client Hints infrastructure defined in [I-D.ietf-httpbis-client-hints]. The definitions below assume that each user agent has defined a number of properties for itself: brand, significant version, full version, platform brand, platform version, platform architecture, model, and mobileness.

User agents SHOULD keep these strings short and to the point, but servers MUST accept arbitrary values for each, as they are all values constructed at the user agent's whim.

User agents MUST map higher-entropy platform architecture values to the following buckets:

Other CPU architectures could be mapped into one of these values in case that makes sense, or be mapped to the empty string.

ISSUE 105: There might be use-cases for higher-entropy, more specific CPU architectures (e.g. 32 vs. 64 bit architectures, or specific instruction sets for the download of highly optimized executable binaries). If necessary, we could support those use-cases through one or more separate hints.

User agents SHOULD return the empty string or a fictitious value for platform architecture unless the user’s platform is one where both the following conditions apply:

User agents MUST return the empty string for model if mobileness is false. User agents MUST return the empty string for model even if mobileness is true, except on platforms where the model is typically exposed.

User agents MAY return the empty string or a fictitious value for full version, platform architecture or model, for privacy, compatibility, or other reasons.

2.1. The 'Sec-CH-UA-Arch' Header Field

The Sec-CH-UA-Arch request header field gives a server information about the architecture of the platform on which a given user agent is executing. It is a Structured Header whose value MUST be a string [I-D.ietf-httpbis-header-structure].

The header’s ABNF is:

  Sec-CH-UA-Arch = sh-string

2.2. The 'Sec-CH-UA-Model' Header Field

The Sec-CH-UA-Model request header field gives a server information about the device on which a given user agent is executing. It is a Structured Header whose value MUST be a string [I-D.ietf-httpbis-header-structure].

The header’s ABNF is:

  Sec-CH-UA-Model = sh-string

ISSUE: Perhaps Sec-CH-UA-Mobile is enough, and we don’t need to expose the model?

2.3. The 'Sec-CH-UA-Platform' Header Field

The Sec-CH-UA-Platform request header field gives a server information about the platform on which a given user agent is executing. It is a Structured Header whose value MUST be a string [I-D.ietf-httpbis-header-structure].

The header’s ABNF is:

  Sec-CH-UA-Platform = sh-string

2.4. The 'Sec-CH-UA-Platform-Version' Header Field

The Sec-CH-UA-Platform-Version request header field gives a server information about the platform version on which a given user agent is executing. It is a Structured Header whose value MUST be a string [I-D.ietf-httpbis-header-structure].

The header’s ABNF is:

  Sec-CH-UA-Platform-Version = sh-string
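
Since servers MUST accept arbitrary values for the string-valued fields above, a receiver may want a strict parser rather than simply stripping quotes. The following is a minimal sketch, assuming the string syntax of [I-D.ietf-httpbis-header-structure] in which only the double quote and the backslash can be escaped; the function name is hypothetical, and the check that the content is printable ASCII is omitted for brevity.

```python
def parse_sh_string(raw):
    """Parse a Structured Header string such as '"Windows"' into its value.

    Raises ValueError when the input is not a well-formed quoted string.
    """
    raw = raw.strip(" ")
    if len(raw) < 2 or raw[0] != '"' or raw[-1] != '"':
        raise ValueError("expected a double-quoted string")
    out = []
    i = 1
    end = len(raw) - 1  # index of the closing quote
    while i < end:
        ch = raw[i]
        if ch == "\\":
            # Assumption: only \" and \\ are valid escape sequences.
            if i + 1 >= end or raw[i + 1] not in '"\\':
                raise ValueError("invalid escape sequence")
            out.append(raw[i + 1])
            i += 2
        elif ch == '"':
            raise ValueError("unescaped double quote inside string")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Note that the empty string is a valid value here, which matters because user agents are allowed to return it for several of these properties.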

2.5. The 'Sec-CH-UA' Header Field

The Sec-CH-UA request header field gives a server information about a user agent's branding and version. It is a Structured Header whose value MUST be a list [I-D.ietf-httpbis-header-structure]. The list’s items MUST be strings. The value of each item SHOULD include a "v" parameter, indicating the user agent's version.

The header’s ABNF is:

  Sec-CH-UA = sh-list

Unlike most Client Hints, since it’s included in the low-entropy table, the Sec-CH-UA header will be sent with all requests, whether or not the server opted in to receiving the header via an Accept-CH header. Therefore, it includes only the user agent's branding information, and the significant version number (both of which are fairly clearly sniffable by "examining the structure of other headers and by testing for the availability and semantics of the features introduced or modified between releases of a particular browser" [Janc2014]).
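
Because Sec-CH-UA arrives on every request and brands may contain commas, semicolons and escaped quotes, splitting the value on "," is not a safe way to read it. The sketch below shows one way a receiver might parse the list into (brand, version) pairs. It is a simplification under stated assumptions: the function name is hypothetical, every parameter value is assumed to be a quoted string, and only the "v" parameter is kept.

```python
def parse_sec_ch_ua(value):
    """Parse a Sec-CH-UA value into a list of (brand, version) tuples.

    version is None when an item carries no "v" parameter.
    Raises ValueError on malformed input.
    """
    n = len(value)

    def skip_spaces(p):
        while p < n and value[p] == " ":
            p += 1
        return p

    def read_string(p):
        # Reads a quoted string starting at p; returns (text, next position).
        if p >= n or value[p] != '"':
            raise ValueError("expected a quoted string at offset %d" % p)
        p += 1
        out = []
        while p < n:
            ch = value[p]
            if ch == "\\":
                if p + 1 >= n or value[p + 1] not in '"\\':
                    raise ValueError("invalid escape at offset %d" % p)
                out.append(value[p + 1])
                p += 2
            elif ch == '"':
                return "".join(out), p + 1
            else:
                out.append(ch)
                p += 1
        raise ValueError("unterminated string")

    items = []
    pos = 0
    while True:
        pos = skip_spaces(pos)
        brand, pos = read_string(pos)
        version = None
        pos = skip_spaces(pos)
        while pos < n and value[pos] == ";":
            pos = skip_spaces(pos + 1)
            key_start = pos
            while pos < n and value[pos] not in "=;, ":
                pos += 1
            key = value[key_start:pos]
            param = None
            if pos < n and value[pos] == "=":
                param, pos = read_string(pos + 1)
            if key == "v":
                version = param
            pos = skip_spaces(pos)
        items.append((brand, version))
        if pos >= n:
            return items
        if value[pos] != ",":
            raise ValueError("expected ',' at offset %d" % pos)
        pos += 1
```

A production receiver would more likely rely on a complete Structured Headers parser; the point of the sketch is only that quoting and escaping have to be honored before any brand matching takes place.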

To return the Sec-CH-UA value for a request, user agents MUST:

  1. Let list be a list, initially empty.

  2. For each brandVersion in brands:

    1. Let parameter be a dictionary, initially empty.

    2. Set parameter["param_name"] to "v".

    3. Set parameter["param_value"] to brandVersion’s version.

    4. Let pair be a tuple consisting of brandVersion’s brand and parameter.

    5. Append pair to list.

  3. Return the output of running the "serializing a list" algorithm with list as input.
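
The steps above can be sketched as follows. This is an illustration rather than a normative serializer: the function names are hypothetical, the brands are modeled as plain dictionaries with "brand" and "version" keys, and the output spacing ("; " before the parameter, ", " between items) is chosen to match the example in § 1.1 rather than taken from the Structured Headers draft.

```python
def serialize_sh_string(text):
    """Serialize text as a quoted string, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def sec_ch_ua_value(brands):
    """Build a Sec-CH-UA value from a list of brand/version dictionaries."""
    members = []
    for brand_version in brands:
        # Each list member is the brand with a single "v" parameter.
        item = serialize_sh_string(brand_version["brand"])
        parameter = "v=" + serialize_sh_string(brand_version["version"])
        members.append(item + "; " + parameter)
    return ", ".join(members)
```

For a single brand "Examplary Browser" with version "73", this yields the header value shown in the introduction's example.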

2.6. The 'Sec-CH-UA-Full-Version' Header Field

The Sec-CH-UA-Full-Version request header field gives a server information about the user agent’s full version. It is a Structured Header whose value MUST be a string [I-D.ietf-httpbis-header-structure].

The header’s ABNF is:

  Sec-CH-UA-Full-Version = sh-string

2.7. The 'Sec-CH-UA-Mobile' Header Field

The Sec-CH-UA-Mobile request header field gives a server information about whether or not a user agent prefers a "mobile" user experience. It is a Structured Header whose value MUST be a boolean [I-D.ietf-httpbis-header-structure].

The header’s ABNF is:

  Sec-CH-UA-Mobile = sh-boolean
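
As a small illustration, the boolean can be read and written as below. This sketch assumes the "?1" / "?0" encoding of booleans used by later revisions of [I-D.ietf-httpbis-header-structure]; earlier revisions may differ, and the function names are hypothetical.

```python
def parse_sh_boolean(raw):
    """Parse a Structured Header boolean; assumes the ?1 / ?0 encoding."""
    raw = raw.strip(" ")
    if raw == "?1":
        return True
    if raw == "?0":
        return False
    raise ValueError("not a Structured Header boolean: %r" % raw)


def serialize_sh_boolean(flag):
    """Serialize a Python bool using the same assumed encoding."""
    return "?1" if flag else "?0"
```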

2.8. Integration with Fetch

Fetch integration of this specification is defined as part of the Client Hints infrastructure specification.

3. Interface

dictionary NavigatorUABrandVersion {
  DOMString brand;
  DOMString version;
};

dictionary UADataValues {
  DOMString platform; 
  DOMString platformVersion;
  DOMString architecture;
  DOMString model;
  DOMString uaFullVersion;
};

[Exposed=(Window,Worker)]
interface NavigatorUAData {
  readonly attribute FrozenArray<NavigatorUABrandVersion> brands;
  readonly attribute boolean mobile;
  Promise<UADataValues> getHighEntropyValues(sequence<DOMString> hints);
};

interface mixin NavigatorUA {
  [SecureContext] readonly attribute NavigatorUAData userAgentData;
};

Navigator includes NavigatorUA;
WorkerNavigator includes NavigatorUA;

Note: The high-entropy portions of the user agent information are retrieved through a Promise, in order to give user agents the opportunity to gate their exposure behind potentially time-consuming checks (e.g. by asking the user for their permission).

3.1. Processing model

3.2. WindowOrWorkerGlobalScope

Each user agent has an associated brands, which is a list created by running create brands.

Every WindowOrWorkerGlobalScope object has an associated brands frozen array, which is a FrozenArray<NavigatorUABrandVersion>. It is initially the result of creating a frozen array from the user agent's brands.

3.3. Create brands

When asked to run the create brands algorithm, the user agent MUST run the following steps:

  1. Let list be a list.

  2. Collect pairs of brand and significant version which represent the user agent, its equivalence class and/or its rendering engine.

  3. For each pair:

    1. Let dict be a new NavigatorUABrandVersion dictionary, with brand as brand and significant version as version.

    2. Append dict to list.

  4. The user agent SHOULD execute the following steps:

    1. Append additional items to list containing NavigatorUABrandVersion objects, initialized with arbitrary brand and version combinations.

    2. Randomize the order of the items in list.

    Note: See § 5.2 GREASE-like UA Strings for more details on why these steps might be appropriate.

  5. Return list.
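
A minimal sketch of the create brands steps follows. The function name, the tuple-based inputs and the optional rng argument are assumptions made for illustration; dictionaries stand in for NavigatorUABrandVersion, and the arbitrary entries are supplied by the caller rather than generated.

```python
import random


def create_brands(real_pairs, arbitrary_pairs, rng=None):
    """Return a shuffled list of brand/version dictionaries.

    real_pairs: (brand, significant version) tuples that represent the user
        agent, its equivalence class and/or its rendering engine.
    arbitrary_pairs: extra (brand, version) tuples appended in step 4.1.
    rng: optional random.Random instance, so callers can control the order.
    """
    if rng is None:
        rng = random.Random()
    brands = [{"brand": brand, "version": version} for brand, version in real_pairs]
    brands.extend(
        {"brand": brand, "version": version} for brand, version in arbitrary_pairs
    )
    # Step 4.2: randomize the order of the items.
    rng.shuffle(brands)
    return brands
```

Passing a seeded random.Random makes the order reproducible, which is one way a user agent could keep the list stable across requests.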

3.4. Getters

On getting, the brands attribute MUST return this's relevant global object's brands frozen array.

On getting, the mobile attribute MUST return the user agent's mobileness.

3.5. getHighEntropyValues method

The getHighEntropyValues(hints) method MUST run these steps:

  1. Let p be a new promise created in the current realm.

  2. Run the following steps in parallel:

    1. Let uaData be a new UADataValues.

    2. If hints contains "platform", set uaData["platform"] to the user agent's platform brand.

    3. If hints contains "platformVersion", set uaData["platformVersion"] to the user agent's platform version.

    4. If hints contains "architecture", set uaData["architecture"] to the user agent's platform architecture.

    5. If hints contains "model", set uaData["model"] to the user agent's model.

    6. If hints contains "uaFullVersion", set uaData["uaFullVersion"] to the user agent’s full version.

    7. Queue a task on the permission task source to resolve p with uaData.

  3. Return p.
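
The selection logic in these steps can be modeled outside a browser. The sketch below uses a Python coroutine in place of a promise, which is only an analogy; the function name, the HINT_SOURCES table and the keys of the user agent dictionary are hypothetical. Unrecognized hint names are ignored, mirroring the algorithm, which only acts on the five names it lists.

```python
import asyncio

# Maps each hint name to the (hypothetical) key holding that property in a
# dictionary that models the user agent.
HINT_SOURCES = {
    "platform": "platform_brand",
    "platformVersion": "platform_version",
    "architecture": "platform_architecture",
    "model": "model",
    "uaFullVersion": "full_version",
}


async def get_high_entropy_values(user_agent, hints):
    """Return only the requested high-entropy values."""
    ua_data = {}
    for hint, source in HINT_SOURCES.items():
        if hint in hints:
            ua_data[hint] = user_agent[source]
    # Yielding to the event loop stands in for "queue a task"; a real user
    # agent could wait for a permission decision at this point.
    await asyncio.sleep(0)
    return ua_data


# Illustrative values; the platform version is made up, and the empty
# strings reflect what a user agent may return for privacy reasons.
EXAMPLE_UA = {
    "platform_brand": "Windows",
    "platform_version": "10",
    "platform_architecture": "",
    "model": "",
    "full_version": "73.3R8.2H.1",
}
```

Calling asyncio.run(get_high_entropy_values(EXAMPLE_UA, ["platform", "uaFullVersion"])) resolves to a dictionary containing just those two members.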

4. Security and Privacy Considerations

4.1. Secure Transport

Client Hints will not be delivered to non-secure endpoints (see the secure transport requirements in Section 2.2.1 of [I-D.ietf-httpbis-client-hints]). This means that user agent information will not be leaked over plaintext channels, reducing the opportunity for network attackers to build a profile of a given agent’s behavior over time.

4.2. Delegation

Client Hints will be delegated from top-level pages via Feature Policy. This reduces the likelihood that user agent information will be delivered along with subresource requests, which reduces the potential for passive fingerprinting.

That delegation is defined as part of append client hints to request.

4.3. Access Restrictions

The information in the Client Hints defined above reveals quite a bit of information about the user agent and the platform/device upon which it runs. User agents ought to exercise judgement before granting access to this information, and MAY impose restrictions above and beyond the secure transport and delegation requirements noted above. For instance, user agents could choose to reveal platform architecture only on requests for resources they intend to download, giving the server the opportunity to serve the right binary. Likewise, they could offer users control over the values revealed to servers, or gate access on explicit user interaction via a permission prompt or via a settings interface.

5. Implementation Considerations

5.1. The 'User-Agent' Header

User agents SHOULD deprecate the User-Agent header in favor of the Client Hints model described in this document. The header, however, is likely to be impossible to remove entirely in the near-term, as existing sites' content negotiation code will continue to require its presence (see [Rossi2015] for a recent example of a new browser’s struggles in this area).

One approach which might be advisable could be for each user agent to lock the value of its User-Agent header, ensuring backwards compatibility by maintaining the crufty declarations of "like Gecko" and "AppleWebKit/537.36" on into eternity. This can ratchet over time, first freezing the version number, then shifting platform and model information to something reasonably generic in order to reduce the fingerprint the header provides.

5.2. GREASE-like UA Strings

History has shown us that there are real incentives for user agents to lie about their branding in order to thread the needle of sites' sniffing scripts, and prevent their users from being blocked by UA-based allow/block lists.

Resetting expectations may help to prevent abuse of the UA string’s brand in the short term, but probably won’t help in the long run. The world of network protocols introduced the notion of GREASE [I-D.ietf-tls-grease]. We could borrow from that concept to tackle this problem.

User agents' brands containing more than a single entry could encourage standardized processing of the UA string. By randomly including additional, intentionally incorrect, comma-separated entries with arbitrary ordering, they would reduce the chance that we ossify on a few required strings.

Let’s examine a few examples:

User agents MUST include more than a single value in brands, where at least one of these values is an arbitrary value.

When adding arbitrary values to brands, user agents MUST make sure that receivers of the header adhere to Structured Header parsing, by adding escaped double-quotes, commas and semi-colons to those values. The purpose of this is to make non-compliant server implementations immediately aware that their parsing code is inadequate.

The value order in brands MUST change over time, to prevent receivers of the header from relying on certain values being in certain locations in the string.

When choosing GREASE strategies, user agents SHOULD keep caching variance in mind and minimize variance among identical user agent versions.

Note: One approach to minimize caching variance could be to determine the GREASE parts of the UA set at build time, and keep them identical throughout the lifetime of the user agent's significant version.
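
One way to read the note above is to derive the GREASE entry and the ordering from the significant version, so that every installation of a given version sends the same list while different versions are likely to differ. The sketch below is an assumption-laden illustration: the pool of arbitrary brands, the seed format and the function name are all hypothetical.

```python
import random

# Hypothetical pool of intentionally incorrect brands. The quote, comma and
# semicolon characters are there to exercise receivers' parsers.
GREASE_POOL = ['Not;A"Brand', "Some,Other;Brand", 'Yet "Another" Brand']


def greased_brands(real_pairs, significant_version):
    """Return (brand, version) tuples with one GREASE entry mixed in.

    The random generator is seeded from the significant version, so the
    result is identical for every user agent instance of that version.
    """
    rng = random.Random("grease-" + significant_version)
    grease_pair = (rng.choice(GREASE_POOL), str(rng.randrange(1, 100)))
    brands = list(real_pairs) + [grease_pair]
    rng.shuffle(brands)
    return brands
```

Seeding from a string is deterministic in Python, so two calls with the same arguments return the same list. Whether the order actually changes between two particular versions is a matter of chance in this sketch, so a user agent that needs a guaranteed change would have to check for it explicitly.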

5.3. The 'Sec-CH-' prefix

Restricting user-land JavaScript code from influencing and modifying UA-CH headers has various security-related advantages. At the same time, there don’t seem to be any legitimate use-cases which require such user-land rewriting.

As such and based on discussions with the TAG, it seems reasonable to forbid write access to these headers from JavaScript (e.g. through fetch or Service Workers), and demarcate them as browser-controlled client hints so they can be documented and included in requests without triggering CORS preflights.

Therefore, request headers defined in this specification include a Sec-CH- prefix.

6. IANA Considerations

This document intends to define the Sec-CH-UA-Arch, Sec-CH-UA-Model, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version, Sec-CH-UA, Sec-CH-UA-Mobile and Sec-CH-UA-Full-Version HTTP request header fields, and register them in the permanent message header field registry ([RFC3864]).

It also intends to deprecate the User-Agent header field.

6.1. 'Sec-CH-UA-Arch' Header Field

Header field name:

Sec-CH-UA-Arch

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.1 The 'Sec-CH-UA-Arch' Header Field)

6.2. 'Sec-CH-UA-Model' Header Field

Header field name:

Sec-CH-UA-Model

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.2 The 'Sec-CH-UA-Model' Header Field)

6.3. 'Sec-CH-UA-Platform' Header Field

Header field name:

Sec-CH-UA-Platform

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.3 The 'Sec-CH-UA-Platform' Header Field)

6.4. 'Sec-CH-UA-Platform-Version' Header Field

Header field name:

Sec-CH-UA-Platform-Version

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.4 The 'Sec-CH-UA-Platform-Version' Header Field)

6.5. 'Sec-CH-UA' Header Field

Header field name:

Sec-CH-UA

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.5 The 'Sec-CH-UA' Header Field)

6.6. 'Sec-CH-UA-Mobile' Header Field

Header field name:

Sec-CH-UA-Mobile

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.7 The 'Sec-CH-UA-Mobile' Header Field)

6.7. 'Sec-CH-UA-Full-Version' Header Field

Header field name:

Sec-CH-UA-Full-Version

Applicable protocol:

http

Status:

standard

Author/Change controller:

IETF

Specification document:

this specification (§ 2.6 The 'Sec-CH-UA-Full-Version' Header Field)

6.8. 'User-Agent' Header Field

Header field name:

User-Agent

Applicable protocol:

http

Status:

deprecated

Author/Change controller:

IETF

Specification document:

this specification (§ 5.1 The 'User-Agent' Header), and Section 5.5.3 of [RFC7231]

References

Normative References

[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[I-D.ietf-httpbis-header-structure]
Mark Nottingham; Poul-Henning Kamp. Structured Headers for HTTP. ID. URL: https://tools.ietf.org/html/draft-ietf-httpbis-header-structure
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[PERMISSIONS]
Mounir Lamouri; Marcos Caceres; Jeffrey Yasskin. Permissions. URL: https://w3c.github.io/permissions/
[WebIDL]
Boris Zbarsky. Web IDL. URL: https://heycam.github.io/webidl/

Informative References

[FacebookYearClass]
Chris Marra; Daniel Weaver. Year class: A classification system for Android. URL: https://engineering.fb.com/android/year-class-a-classification-system-for-android/
[I-D.ietf-httpbis-client-hints]
Ilya Grigorik. HTTP Client Hints. ID. URL: https://tools.ietf.org/html/draft-ietf-httpbis-client-hints
[I-D.ietf-tls-grease]
David Benjamin. Applying GREASE to TLS Extensibility. ID. URL: https://tools.ietf.org/html/draft-ietf-tls-grease
[Janc2014]
Artur Janc; Michal Zalewski. Technical analysis of client identification mechanisms. URL: https://dev.chromium.org/Home/chromium-security/client-identification-mechanisms#TOC-Browser-level-fingerprints
[RFC3864]
G. Klyne; M. Nottingham; J. Mogul. Registration Procedures for Message Header Fields. September 2004. Best Current Practice. URL: https://tools.ietf.org/html/rfc3864
[RFC7231]
R. Fielding, Ed.; J. Reschke, Ed. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. June 2014. Proposed Standard. URL: https://httpwg.org/specs/rfc7231.html
[Rossi2015]
The Microsoft Edge Rendering Engine that makes the Web just work. URL: https://channel9.msdn.com/Events/WebPlatformSummit/2015/The-Microsoft-Edge-Rendering-Engine-that-makes-the-Web-just-work#time=9m45s
