This document defines a mechanism that enables developers to declare a network error reporting policy for a web application. A user agent can use this policy to report encountered network errors that prevented it from successfully fetching requested resources.

This is a proposal and may change without notice. Interested parties should bring discussions to the Web Platform Incubator Community Group.

Introduction

Accurately measuring performance characteristics of web applications is an important aspect in helping site developers understand how to improve their web applications. The worst case scenario is the failure to load the application, or a particular resource, due to a network error, and to address such failures the developer requires assistance from the user agent to identify when, where, and why such failures are occurring.

Today, application developers do not have real-time web application availability data from their end users. For example, if the user fails to load the page due to a network error, such as a failed DNS lookup, a connection timeout, a reset connection, or other reasons, the site developer is unable to detect and address this issue. Existing methods, such as synthetic monitoring, provide a partial solution by placing monitoring nodes in predetermined geographic locations, but they require additional infrastructure investments and cannot provide truly global and near real-time availability data for real end users.

Network Error Logging (NEL) addresses this need by defining a mechanism enabling web applications to declare a reporting policy that can be used by the user agent to report network errors for a given origin. To take advantage of NEL, a web application opts into using NEL by supplying a NEL HTTP response header field that describes the reporting policy. Then, if the NEL policy is available for a given origin, and an end user fails to successfully fetch a resource from that origin, the user agent logs the network error report and attempts to deliver it to a group of endpoints previously configured using the Reporting API [[!REPORTING]].

For example, if the user agent fails to fetch a resource from https://www.example.com due to an aborted TCP connection, the user agent would queue the following report via the Reporting API:

type
"network-error"
endpoint group
the endpoint group configured by the report-to field
settings
TODO
data
{
  "uri": "https://www.example.com/resource",
  "referrer": "https://referrer.com/",
  "server-ip": "123.122.121.120",
  "elapsed-time": 321,
  "type": "tcp.aborted"
}
      

See Reporting for an explanation of the communicated fields and the format of the report, and Examples for hands-on examples of the NEL registration and reporting process.

Conformance requirements

All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.

The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in [[!RFC2119]].

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Some conformance requirements are phrased as requirements on attributes, methods or objects. Such requirements are to be interpreted as requirements on the user agent.

Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

Dependencies

Fetch

The following terms are defined in the Fetch specification: [[!FETCH]]

  • client

HTML

The following terms are defined in the HTML specification: [[!HTML]]

  • navigator.onLine
  • resource origin

HTTP

The following terms are defined in the HTTP specification: [[!RFC7231]]

  • 5xx status code
  • resource representation
  • status code

HTTP JSON field values

The following terms are defined in the HTTP-JFV specification: [[!HTTP-JFV]]

  • json-field-value

JSON

The following terms are defined in the JSON specification: [[!RFC7159]]

  • JSON object

Referrer Policy

The following terms are defined in the Referrer Policy specification: [[!REFERRER-POLICY]]

  • referrer policy

Reporting API

The following terms are defined in the Reporting API specification: [[!REPORTING]]

  • endpoint group

Resource Timing

The following terms are defined in the Resource Timing specification: [[!RESOURCE-TIMING-2]]

  • network protocol

Secure Contexts

The following terms are defined in the Secure Contexts specification: [[!SECURE-CONTEXTS]]

  • the is-origin-trustworthy algorithm
  • potentially trustworthy origin

URL

The following terms are defined in the URL specification: [[!URL]]

  • fragment
  • host
  • URL

Network Error Logging

Policy Delivery and Processing

The server delivers the NEL policy to the user agent via an HTTP response header field (NEL header field). If the result of executing the is-origin-trustworthy algorithm on the origin that served the NEL policy is Potentially Trustworthy then the user agent MUST either:

Otherwise, if the result of the algorithm is not Potentially Trustworthy, then the user agent MUST ignore the provided NEL policy.
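The gating described above could be sketched as follows. This is a non-normative illustration only: `is_probably_trustworthy` is a hypothetical and deliberately simplified stand-in for the is-origin-trustworthy algorithm (it checks only the `https` and `wss` schemes and loopback IP addresses, and the real algorithm in [[!SECURE-CONTEXTS]] covers more cases), and `accept_nel_header` is likewise a hypothetical helper name.

```python
import ipaddress
from urllib.parse import urlsplit


def is_probably_trustworthy(origin):
    """Simplified stand-in for is-origin-trustworthy (assumption, not the real algorithm)."""
    parts = urlsplit(origin)
    # Origins served over an authenticated, encrypted scheme.
    if parts.scheme in ("https", "wss"):
        return True
    # Loopback IP addresses (127.0.0.0/8 and ::1).
    try:
        return ipaddress.ip_address(parts.hostname or "").is_loopback
    except ValueError:
        # The host is a name rather than an IP address.
        return False


def accept_nel_header(origin, header_value):
    """Return the header value for further processing, or None if it must be ignored."""
    return header_value if is_probably_trustworthy(origin) else None
```

With this sketch, a policy served from `http://example.com` is dropped before any parsing takes place, while the same policy served from `https://example.com` is passed on.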

`NEL` Header Field

The NEL header field is used to communicate the NEL policy to the user agent. The ABNF (Augmented Backus-Naur Form) syntax for the NEL header field is as follows:

NEL = json-field-value

The header's value is interpreted as an array of JSON objects, as defined by json-field-value. Each object in the array defines an NEL policy for the origin. The user agent MUST process the first valid policy in the array.

User agents MUST ignore any unknown or invalid field(s) or value(s) that do not conform to the syntax defined in this specification. A valid NEL header field MUST, at a minimum, contain one object with all of the "REQUIRED" fields defined in this specification.
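A non-normative sketch of this processing is shown below. It assumes that a json-field-value can be parsed by wrapping the combined field value in "[" and "]" and handing it to a JSON parser, and it assumes one reading of the validity rules: an invalid REQUIRED field invalidates the policy, while an invalid OPTIONAL field is treated as absent. The names `parse_nel_header` and `validate_policy` are hypothetical.

```python
import json


def validate_policy(item):
    """Return a normalized policy dict, or None if the object is not a valid policy."""
    if not isinstance(item, dict):
        return None
    max_age = item.get("max-age")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_age, bool) or not isinstance(max_age, int) or max_age < 0:
        return None
    report_to = item.get("report-to")
    if not isinstance(report_to, str):
        report_to = None
    # report-to is REQUIRED to register a policy, OPTIONAL when removing one.
    if max_age > 0 and report_to is None:
        return None
    include_subdomains = item.get("include-subdomains")
    if not isinstance(include_subdomains, bool):
        include_subdomains = False  # invalid or missing OPTIONAL field is ignored
    # Unknown fields are dropped by only copying the known ones.
    return {
        "report-to": report_to,
        "max-age": max_age,
        "include-subdomains": include_subdomains,
    }


def parse_nel_header(value):
    """Return the first valid NEL policy found in the header value, or None."""
    try:
        items = json.loads("[" + value + "]")
    except ValueError:
        return None
    for item in items:
        policy = validate_policy(item)
        if policy is not None:
            return policy
    return None
```

For example, `parse_nel_header('{"max-age": -1}, {"report-to": "g", "max-age": 10}')` skips the first object, because its max-age is negative, and returns the second.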

The user agent MUST ignore the NEL header specified via a meta element to mitigate hijacking of error reporting via scripting attacks. The NEL policy MUST be delivered via the NEL header field.

The restriction on the meta element is consistent with the [[CSP]] specification, which restricts reporting registration to HTTP header fields only for the same reasons.

The report-to Field

The report-to field specifies the endpoint group to which the user agent sends reports about network errors. The report-to field is a REQUIRED field to register an NEL policy, and OPTIONAL if the intent is to remove a previous registration - see max-age. The value of the field MUST be a string containing the endpoint group to which reports will be sent.

To improve delivery of NEL reports, the application should set report-to to an endpoint group containing at least one endpoint in an alternative origin whose infrastructure is not coupled with the origin from which the resource is being fetched; otherwise, network errors cannot be reported until the problem is solved, if ever. The endpoint group should also contain multiple endpoints, so that alternatives are available if some endpoints are unreachable.

The max-age Field

The REQUIRED max-age field specifies the number of seconds, after the reception of the NEL header field, during which the user agent regards the host (from whom the policy was received) as a known NEL origin. The value of the field MUST be a non-negative integer.

A max-age value of zero (i.e. '"max-age": 0') signals the user agent to cease regarding the host as a known NEL origin, including the include-subdomains field if provided.

To ensure delivery of NEL reports, the application should ensure that the Reporting API is also configured with a sufficiently high max-age. If the Reporting policy expires, NEL reports will not be delivered, even if the NEL policy has not expired.

The include-subdomains Field

The OPTIONAL include-subdomains field, if present and true, signals the user agent that the NEL policy applies not only to the origin that served the resource representation, but also to any origin whose host component is a subdomain of the host component of the resource representation’s origin. If present, the value of the field MUST be a boolean value.

To ensure delivery of NEL reports for subdomains, the application should ensure that the Reporting API is also configured with include-subdomains enabled. If it is not, and there is no separate Reporting policy for a given subdomain, NEL reports for that subdomain will not be delivered, even if the NEL policy includes the subdomain.
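The subdomain matching described in this section could look roughly like the following non-normative sketch. The function name `policy_covers_host` is hypothetical, and the comparison is simplified: it works on plain host strings, ignores scheme and port, and does no IDNA processing.

```python
def policy_covers_host(policy_host, include_subdomains, request_host):
    """Return True if a policy issued by policy_host applies to request_host."""
    # Host names are compared case-insensitively, ignoring a trailing dot.
    policy_host = policy_host.lower().rstrip(".")
    request_host = request_host.lower().rstrip(".")
    if request_host == policy_host:
        return True
    # A subdomain ends with "." followed by the policy host, so that
    # "badexample.com" is not mistaken for a subdomain of "example.com".
    return bool(include_subdomains) and request_host.endswith("." + policy_host)
```

Requiring the leading "." in the suffix check is the important detail; a bare suffix comparison would match unrelated hosts that merely end with the same characters.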

Policy Storage and Maintenance

An HTTP host declares itself an NEL origin by issuing an NEL policy, which is communicated via the NEL header field from a potentially trustworthy origin. Upon error-free receipt and processing of this header by a conformant user agent, the user agent regards the host as a known NEL origin.

The user agent MUST maintain the NEL policy of any given NEL origin separately from any NEL policies issued by any other NEL origins. Only the given NEL origin can update or cause deletion of its NEL policy. This is accomplished by sending a NEL header field to the user agent with new values for the policy endpoint group, time duration, and subdomain applicability. Thus, the user agent MUST store the "freshest" NEL policy information on behalf of an NEL origin, and specifying a zero time duration MUST cause the user agent to delete the NEL policy (including any asserted include-subdomains field) for that NEL origin.
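One possible shape for such a store is sketched below, non-normatively. The class name `NelPolicyStore`, the use of an origin string as the key, and the injectable `now` parameter (used to keep the example deterministic) are all assumptions made for illustration.

```python
import time


class NelPolicyStore:
    """Keeps at most one (the freshest) NEL policy per NEL origin."""

    def __init__(self):
        self._policies = {}  # origin -> stored policy entry

    def update(self, origin, policy, now=None):
        """Store a newly received policy for origin, or delete it if max-age is 0."""
        now = time.time() if now is None else now
        if policy["max-age"] == 0:
            # A zero duration deletes the policy, including include-subdomains.
            self._policies.pop(origin, None)
            return
        # A newer policy always replaces the previous one for the same origin.
        self._policies[origin] = {
            "report-to": policy["report-to"],
            "include-subdomains": policy.get("include-subdomains", False),
            "expires": now + policy["max-age"],
        }

    def get(self, origin, now=None):
        """Return the unexpired policy for origin, or None."""
        now = time.time() if now is None else now
        entry = self._policies.get(origin)
        if entry is None:
            return None
        if entry["expires"] <= now:
            del self._policies[origin]
            return None
        return entry
```

Because entries are keyed by the issuing origin, a header received from one origin cannot update or delete the policy stored for another.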

Reporting

A network error is any condition where a connection or a protocol error is encountered by the user agent, thus preventing it from successfully completing the request-response exchange. This may include, but is not limited to, DNS, TCP, TLS, and HTTP connection and protocol errors. For example, a network error is triggered when the user agent:

The user agent MAY classify and report server error responses (5xx status code) as network errors. For example, a network error report may be triggered when a fetch fails due to proxy or gateway errors, service downtime, and other types of server errors.

The failure to fetch a resource when the user agent is known to be offline (when navigator.onLine returns `false`) MUST NOT be considered to be a network error.

Note that the above definition of "network error" differs from the definition in [[Fetch]]. The definition in this specification is a subset of the [[Fetch]] definition: all of the above conditions would trigger a "network error" in [[Fetch]] processing, but other conditions that do so in [[Fetch]], such as requests blocked due to mixed content or CORS failures, are not network errors under this specification.

When a network error occurs for a URL that belongs to a known NEL origin, the user agent SHOULD log the error and queue it for delivery to the endpoint group defined by the NEL policy of the associated NEL origin.

Delivery of reports, including scheduling, retrying, or abandoning delivery, is handled by the Reporting API, not by Network Error Logging.

To generate a network error object, the user agent MUST use an algorithm equivalent to the following:

  1. Prepare a JSON object neterror with the following keys and values:
    uri
    The URL that encountered the network error, with any fragment component removed.
    referrer
    The referrer information of the request, as determined by the referrer policy associated with its client.
    server-ip
    The IP address of the host to which the user agent sent the request, if available. Otherwise, an empty string.
    • A host identified by an IPv4 address is represented in dotted-decimal notation (a sequence of four decimal numbers in the range 0 to 255, separated by "."). [[RFC1123]]
    • A host identified by an IPv6 address is represented as an ordered list of eight 16-bit pieces (a sequence of `x:x:x:x:x:x:x:x`, where the 'x's are one to four hexadecimal digits of the eight 16-bit pieces of the address). [[RFC4291]]
    protocol
    The network protocol used to fetch the resource as identified by the ALPN Protocol ID, if available. Otherwise, an empty string.
    status-code
    The status code of the HTTP response, if available. Otherwise, the number 0.
    elapsed-time
    The elapsed number of milliseconds between the start of the resource fetch and when it was aborted by the user agent.
    type
    The description of the error type, which may be one of the following strings:
    dns.unreachable
    DNS server is unreachable
    dns.name_not_resolved
    DNS server responded but is unable to resolve the address
    dns.failed
    Request to the DNS server failed due to reasons not covered by previous errors
    tcp.timed_out
    TCP connection to the server timed out
    tcp.closed
    The TCP connection was closed by the server
    tcp.reset
    The TCP connection was reset
    tcp.refused
    The TCP connection was refused by the server
    tcp.aborted
    The TCP connection was aborted
    tcp.address_invalid
    The IP address is invalid
    tcp.address_unreachable
    The IP address is unreachable
    tcp.failed
    The TCP connection failed due to reasons not covered by previous errors
    tls.version_or_cipher_mismatch
    The TLS connection was aborted due to version or cipher mismatch
    tls.bad_client_auth_cert
    The TLS connection was aborted due to invalid client certificate
    tls.cert.name_invalid
    The TLS connection was aborted due to invalid name
    tls.cert.date_invalid
    The TLS connection was aborted due to invalid certificate date
    tls.cert.authority_invalid
    The TLS connection was aborted due to invalid issuing authority
    tls.cert.invalid
    The TLS connection was aborted due to invalid certificate
    tls.cert.revoked
    The TLS connection was aborted due to revoked server certificate
    tls.cert.pinned_key_not_in_cert_chain
    The TLS connection was aborted due to a key pinning error
    tls.protocol.error
    The TLS connection was aborted due to a TLS protocol error
    tls.failed
    The TLS connection failed due to reasons not covered by previous errors
    http.protocol.error
    The connection was aborted due to an HTTP protocol error
    http.response.invalid
    Response is empty, has a content-length mismatch, has improper encoding, and/or other conditions that prevent user agent from processing the response
    http.response.redirect_loop
    The request was aborted due to a detected redirect loop
    http.failed
    The connection failed due to errors in HTTP protocol not covered by previous errors
    abandoned
    The user aborted the resource fetch before it completed
    unknown
    The error type is unknown

    The user agent MAY extend the above error type list with custom values - e.g. new error types to accommodate new protocols, or more detailed error descriptions of existing ones. When doing so, the user agent SHOULD follow the dot-delimited pattern (`[group].[optional-subgroup].[error-name]`) to facilitate simple and consistent processing of the error reports - e.g. the collector may provide aggregation by category and/or one or multiple subgroups.

  2. Return neterror.
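The steps above can be sketched, non-normatively, as the following function. The name `make_network_error` and its parameter names are hypothetical, and the sketch assumes that the caller has already determined the referrer according to the applicable referrer policy and has already classified the error type.

```python
from urllib.parse import urldefrag


def make_network_error(url, referrer, error_type, elapsed_ms,
                       server_ip=None, protocol=None, status_code=None):
    """Build the neterror object for a failed fetch."""
    return {
        # The URL with any fragment component removed.
        "uri": urldefrag(url).url,
        # Assumed to be precomputed from the referrer policy by the caller.
        "referrer": referrer,
        # Empty string when the server IP address is not available.
        "server-ip": server_ip or "",
        # Empty string when the ALPN Protocol ID is not available.
        "protocol": protocol or "",
        # The number 0 when no HTTP response status code is available.
        "status-code": status_code or 0,
        # Milliseconds between the start of the fetch and when it was aborted.
        "elapsed-time": elapsed_ms,
        "type": error_type,
    }
```

Calling `make_network_error("https://www.example.com/resource#section", "https://referrer.com/", "tcp.aborted", 321, server_ip="123.122.121.120")` yields an object whose "uri" is "https://www.example.com/resource" and whose "protocol" and "status-code" fall back to the empty string and 0 respectively.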

Examples

Sample Policy Definitions

> GET / HTTP/1.1
> Host: example.com

< HTTP/1.1 200 OK
< ...
< NEL: {"report-to": "network-errors", "max-age": 2592000}
        

The above NEL policy provided in the server response specifies that the user agent should register a new NEL policy, or update an existing one if one already exists, for the `example.com` NEL origin: the user agent should report network errors to the endpoint group "network-errors" and the policy applies for 2592000 seconds (30 days).

Note that the above registration will only succeed if the response is communicated from a potentially trustworthy origin.

> GET / HTTP/1.1
> Host: example.com

< HTTP/1.1 200 OK
< ...
< NEL: {"report-to": "network-errors", "max-age": 2592000, "include-subdomains": true}
        

The above NEL policy provided in the server response specifies that the user agent should report network errors to the endpoint group "network-errors". Further, the policy is extended to all of the subdomains of the issuing NEL origin — see include-subdomains.

> GET / HTTP/1.1
> Host: example.com

< HTTP/1.1 200 OK
< ...
< NEL: {"max-age": 0}
        

The above NEL policy provided in the server response contains max-age set to zero, which indicates that the user agent must delete the current registered NEL policy associated with the `example.com` NEL origin and all of its subdomains.
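On the server side, header values like the ones in this section could be produced with a small helper such as the following non-normative sketch. The function names `nel_header_value` and `nel_removal_header_value` are hypothetical; how the value is attached to a response depends on the server framework and is not shown.

```python
import json


def nel_header_value(report_to, max_age, include_subdomains=False):
    """Serialize a NEL policy that registers or updates a policy."""
    policy = {"report-to": report_to, "max-age": max_age}
    # include-subdomains is OPTIONAL, so it is only emitted when enabled.
    if include_subdomains:
        policy["include-subdomains"] = True
    return json.dumps(policy)


def nel_removal_header_value():
    """Serialize a NEL policy that removes a previous registration."""
    return json.dumps({"max-age": 0})
```

For instance, `nel_header_value("network-errors", 2592000)` produces the value used in the first sample policy of this section.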

Sample Network Error Reports

This section contains an example network error report the user agent might queue when a network error is encountered for a known NEL origin.

{
  "nel-report": [
    {
      "uri": "https://www.example.com/",
      "referrer": "http://example.com/",
      "server-ip": "123.122.121.120",
      "protocol": "h2",
      "status-code": 200,
      "elapsed-time": 823,
      "age": 0,
      "type": "http.protocol.error"
    }
  ]
}
        

The above report indicates that the user agent attempted to navigate from "example.com" to "www.example.com" (known NEL origin), which successfully resolved to the "123.122.121.120" IP address. However, while the user agent received a "200" response from the server via the "h2" protocol, it encountered a protocol error in the exchange and was forced to abandon the navigation. 823 milliseconds elapsed between the start of navigation and when the user agent aborted the navigation. Finally, the user agent sent this report immediately after the network error was encountered - i.e. the report age is 0.

{
  "nel-report": [
    {
      "uri": "https://widget.com/thing.js",
      "referrer": "https://www.example.com/",
      "server-ip": "234.233.232.231",
      "protocol": "",
      "status-code": 0,
      "elapsed-time": 143,
      "age": 0,
      "type": "dns.name_not_resolved"
    }
  ]
}
        

The above report indicates that the user agent attempted to fetch "https://widget.com/thing.js", which belongs to a previously registered NEL origin, from the "www.example.com" origin. However, the user agent was unable to resolve the DNS name and the request was aborted by the user agent after 143 milliseconds. Because "widget.com" is a known NEL origin, a network error report was logged and sent to the endpoint group specified by the NEL policy of that host immediately after the network error was encountered - i.e. the report age is 0.
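On the collector side, the dot-delimited error type pattern described in the Reporting section allows simple aggregation. The following non-normative sketch counts reports by their top-level group; the function name `count_by_group` is hypothetical, and the input is assumed to be a list of already-parsed network error objects.

```python
from collections import Counter


def count_by_group(reports):
    """Count network error objects by the first component of their type."""
    # "tcp.aborted" -> "tcp"; a type without a dot, such as "abandoned",
    # forms its own group.
    return Counter(report["type"].split(".", 1)[0] for report in reports)


sample_reports = [
    {"type": "tcp.aborted"},
    {"type": "tcp.timed_out"},
    {"type": "dns.name_not_resolved"},
    {"type": "abandoned"},
]
counts = count_by_group(sample_reports)
```

Splitting on further dots in the same way would give per-subgroup counts, for example separating "tls.cert.*" errors from other "tls.*" errors.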

Use cases

Reporting of Navigation Failures

A navigation request initiated by the user (e.g. via a click on a link, direct input via the location bar, script-initiated due to user interaction, etc.) may fail due to any number of connectivity reasons: DNS failure, TCP error, TLS protocol violation, and so on. These errors may be caused by network misconfiguration, transient routing issues, server downtime, malware or other attacks against the user, etc.

In such cases the destination host is often left unaware of the failed navigation since, by definition, it cannot see the request reach its infrastructure and it is unable to investigate the problem. To address this, the host can register an NEL policy with the user agent, which specifies where reports of such failures should be delivered such that they can be investigated.

Reporting of First-party Subresource Fetch Failures

A typical application requires dozens of resources, the fetching of which is typically initiated via HTML, CSS, or JavaScript. The application requesting such resources can observe failures of most such fetches (e.g. via `onerror` callbacks), but it does not have access to the detailed network error report of why the failure has occurred - e.g. DNS failure, TCP error, TLS protocol violation, etc.

To address this, the application can register relevant NEL policies with the user agent for the first-party hosts from which the subresources are being fetched. Then, if such a policy is present and a network error is encountered for a resource associated with a registered NEL origin, the user agent will report the detailed network error report and enable the application developers to investigate the error.

Reporting of Third-party Subresource Fetch Failures

In the case where a resource is embedded by a third party, the provider of the resource is often unable to instrument and observe the failure. For example, if `example.com` embeds a `widget.com/thing.js` resource on its site, and the user visiting `example.com` fails to fetch such resource due to a network error, the `widget.com` host is both unaware of the failure and unable to detect it.

To address this, `widget.com` can register an NEL policy for its host. Then, if such policy is present and a network error is encountered while fetching a resource — regardless of whether it is being requested from a first-party or third-party origin — from the registered NEL origin, the user agent will report the network error and enable the provider to investigate the error.

Privacy Considerations

NEL provides network error reports that could expose new information about the user's network configuration. For example, an attacker could abuse NEL reporting to probe users' network configuration. Also, similar to HSTS, HPKP, and pinned CSP policies, the stored NEL policy could be used as a "supercookie" by setting a distinct policy with a custom (per-user) reporting URI to act as an identifier in combination with (or instead of) HTTP cookies.

To mitigate some of the above risks, NEL registration is restricted to trustworthy origins, and delivery of network error reports is similarly restricted to trustworthy origins. This disallows a transient HTTP MITM from trivially abusing NEL as a persistent tracker.

In addition to the above restrictions, user agents MUST:

When deploying NEL the developer SHOULD consider privacy implications of NEL reports delivered to the specified collectors. For example, reports may contain URLs with sensitive data (e.g. "Capability URLs") that may need special precautions (see [[!CAPABILITY-URLS]]), and may require the developer to operate their own NEL collectors to prevent reporting of such URLs to third parties.

IANA Considerations

The permanent message header field registry should be updated with the following registrations ([[RFC3864]]):

NEL

Header field name
NEL
Applicable protocol
http
Status
standard
Author/Change controller
W3C
Specification document
This specification (see NEL Header Field)

Acknowledgments

This document reuses text from the [[CSP]] and [[RFC6797]] specifications, as permitted by the licenses of those specifications. Additionally, sincere thanks to Julia Tuttle, Chris Bentzel, Todd Reifsteck, Aaron Heady, and Mark Nottingham for their helpful comments and contributions to this work.