When playing audio through WebAudio, we want to be able to measure both the delay and the glitchiness of that audio. This document contains a proposal for an API that would allow WebAudio users to do this.
This is an unofficial proposal.
There is currently no way to detect whether WebAudio playout has glitches (gaps in the played audio, which typically happen due to underperformance in the audio pipeline). There is an existing way to measure the instantaneous playout latency using AudioContext.outputLatency, but no simple way to measure average/minimum/maximum latency over time.
Glitches and high latency are bad for the user experience, so when either occurs it is useful for the application to be able to detect it and possibly take some action to improve the playout.
The {{AudioContext/AudioPlayoutStats}} object under {{AudioContext}} is dedicated to statistics reporting. It is similar to RTCAudioPlayoutStats, but covers the playout path via {{AudioDestinationNode}} and the associated output device. This allows us to measure glitches caused by underperforming {{AudioWorklet}}s, as well as glitches and delay occurring in the playout path between the {{AudioContext}} and the output device.
Audio glitches are expressed in terms of fallback frames and fallback events.
partial interface AudioContext {
    [SameObject] readonly attribute AudioPlayoutStats playoutStats;
};
{{AudioContext}} has the following internal slot: [[playout stats]], an instance of {{AudioContext/AudioPlayoutStats}}, initially null.
When accessing this attribute, run the following steps:
Attributes
playoutStats attribute
When the {{AudioContext}} is in the "running" state, once per second, the [= update playout stats =] algorithm runs on {{AudioContext/[[playout stats]]}}.
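Because this is an unofficial proposal, a page cannot assume that the attribute exists. The following is a minimal sketch of feature detection; the helper name `getPlayoutStats` is hypothetical and not part of the proposed API.

```javascript
// Returns the playout stats object for a context, or null when the
// proposed playoutStats attribute is not available.
// `getPlayoutStats` is an illustrative helper, not part of the proposal.
function getPlayoutStats(audioContext) {
  if (audioContext === null || typeof audioContext !== "object") {
    return null;
  }
  if (!("playoutStats" in audioContext)) {
    return null;
  }
  return audioContext.playoutStats ?? null;
}

// Hypothetical usage in a page (not executed here):
//   const stats = getPlayoutStats(new AudioContext());
//   if (stats) { console.log(stats.fallbackFramesEvents); }
```

Since the values are refreshed at most once per second, reading the returned object more often than that is not expected to yield new information.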
[Exposed=Window, SecureContext]
interface AudioPlayoutStats {
    readonly attribute DOMHighResTimeStamp fallbackFramesDuration;
    readonly attribute unsigned long fallbackFramesEvents;
    readonly attribute DOMHighResTimeStamp totalFramesDuration;
    readonly attribute DOMHighResTimeStamp averageLatency;
    readonly attribute DOMHighResTimeStamp minimumLatency;
    readonly attribute DOMHighResTimeStamp maximumLatency;
    undefined resetLatency();
    [Default] object toJSON();
};

It has the following internal slots:
A timestamp representing the total duration of fallback frames that the {{AudioContext}} has played as of the last stat update. Initialized to 0.
These attributes update only once per second and under specific conditions. See the Privacy & Security mitigations section and the [= update playout stats =] algorithm for details.
Returns {{AudioPlayoutStats/[[fallback frames duration]]}}.
Measures the duration of [=fallback frame=]s played by the {{AudioContext}}, in milliseconds. This metric can be used together with {{AudioPlayoutStats/totalFramesDuration}} to calculate the percentage of played out media that was not provided by the {{AudioContext}}.
Returns {{AudioPlayoutStats/[[fallback frames events]]}}.
This measures the number of [=fallback event=]s that have occurred in the {{AudioContext}}.
Returns {{AudioPlayoutStats/[[total frames duration]]}}.
Measures the duration of all audio frames played by the {{AudioContext}}, in milliseconds. Includes both fallback and non-fallback frames.
Returns {{AudioPlayoutStats/[[average latency]]}}.
This is the average latency for the frames played since the last call to {{AudioPlayoutStats/resetLatency()}}, or since the creation of the {{AudioContext}} if {{AudioPlayoutStats/resetLatency()}} has not been called.
Returns {{AudioPlayoutStats/[[minimum latency]]}}.
This measures the minimum latency for the frames played since the last call to {{AudioPlayoutStats/resetLatency()}}, or since the creation of the {{AudioContext}} if {{AudioPlayoutStats/resetLatency()}} has not been called.
Returns {{AudioPlayoutStats/[[maximum latency]]}}.
This measures the maximum latency for the frames played since the last call to {{AudioPlayoutStats/resetLatency()}}, or since the creation of the {{AudioContext}} if {{AudioPlayoutStats/resetLatency()}} has not been called.
This method resets the latency counters. Note that it does not remove latency information that has accrued but not yet been exposed through the API.
When resetLatency() is called, run the [= reset latency stats =] steps.
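The latency attributes and resetLatency() can be combined to measure latency over consecutive windows. The sketch below assumes the latency values are in milliseconds, like the other DOMHighResTimeStamp attributes; the helper name `takeLatencyWindow` and the returned field names are hypothetical.

```javascript
// Reads the latency attributes accumulated since the last reset,
// converts them from milliseconds to seconds, and then starts a new
// measurement window by calling resetLatency().
// `takeLatencyWindow` and its result fields are illustrative names.
function takeLatencyWindow(playoutStats) {
  const windowStats = {
    averageSeconds: playoutStats.averageLatency / 1000,
    minimumSeconds: playoutStats.minimumLatency / 1000,
    maximumSeconds: playoutStats.maximumLatency / 1000,
  };
  playoutStats.resetLatency();
  return windowStats;
}

// Hypothetical usage in a page (not executed here):
//   const lastWindow = takeLatencyWindow(audioContext.playoutStats);
```

As noted above, the reset does not discard latency information that has accrued but not yet been exposed, so the first values after a reset may still reflect audio played shortly before it.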
Running the initialize playout stats algorithm means running these steps:
Running the reset latency stats algorithm on an {{AudioPlayoutStats}} means running these steps:
Running the update playout stats algorithm on an {{AudioPlayoutStats}} means running these steps:
const oldTotalFramesDuration = audioContext.playoutStats.totalFramesDuration;
const oldFallbackFramesDuration = audioContext.playoutStats.fallbackFramesDuration;
const oldFallbackFramesEvents = audioContext.playoutStats.fallbackFramesEvents;
audioContext.playoutStats.resetLatency();

// Wait while playing audio
...

// The number of seconds that were covered by the frames played by the
// output device between the two executions.
const deltaTotalFramesDuration =
    (audioContext.playoutStats.totalFramesDuration - oldTotalFramesDuration) / 1000;
const deltaFallbackFramesDuration =
    (audioContext.playoutStats.fallbackFramesDuration - oldFallbackFramesDuration) / 1000;
const deltaFallbackFramesEvents =
    audioContext.playoutStats.fallbackFramesEvents - oldFallbackFramesEvents;

// Fallback frames fraction stat over the last deltaTotalFramesDuration seconds
const fallbackFramesFraction = deltaFallbackFramesDuration / deltaTotalFramesDuration;

// Fallback event frequency stat over the last deltaTotalFramesDuration seconds
const fallbackEventFrequency = deltaFallbackFramesEvents / deltaTotalFramesDuration;

// Average playout delay stat during the last deltaTotalFramesDuration seconds
const playoutDelay = audioContext.playoutStats.averageLatency / 1000;
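The delta computation in the example above divides by the interval duration, which is zero if no audio was played between the two reads (for instance, if the values have not yet been updated). The following sketch wraps the same arithmetic in a reusable sampler that guards against that case. The class name `GlitchSampler` and its result fields are hypothetical and not part of the proposed API.

```javascript
// Tracks the glitch counters between successive samples.
// `GlitchSampler` and all field names below are illustrative only.
class GlitchSampler {
  constructor(playoutStats) {
    this.stats = playoutStats;
    this.previous = GlitchSampler.read(playoutStats);
  }

  // Copies the current counter values so later reads can be compared.
  static read(stats) {
    return {
      total: stats.totalFramesDuration,
      fallback: stats.fallbackFramesDuration,
      events: stats.fallbackFramesEvents,
    };
  }

  // Returns metrics for the interval since the previous sample, or null
  // when no frames were played in that interval (avoids dividing by zero).
  sample() {
    const current = GlitchSampler.read(this.stats);
    const seconds = (current.total - this.previous.total) / 1000;
    const fallbackSeconds = (current.fallback - this.previous.fallback) / 1000;
    const events = current.events - this.previous.events;
    this.previous = current;
    if (seconds <= 0) {
      return null;
    }
    return {
      intervalSeconds: seconds,
      fallbackFraction: fallbackSeconds / seconds,
      fallbackEventsPerSecond: events / seconds,
    };
  }
}

// Hypothetical usage in a page (not executed here):
//   const sampler = new GlitchSampler(audioContext.playoutStats);
//   setInterval(() => console.log(sampler.sample()), 5000);
```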
See discussion with Privacy WG.
The glitch information provided by the API could be used to form a cross-site covert channel between two cooperating websites. One site could transmit information by intentionally causing audio glitches (by causing very high CPU usage, for example) while the other site could use the API to detect these glitches.
To inhibit the usage of this covert channel, implementers should apply the following mitigations.
Implementers should not update the values returned by the API more often than once per second. This limits the bandwidth of the covert channel.
Implementers should only provide access to the API to sites that fulfill at least one of the following criteria:
The reasoning is that if a site has obtained getUserMedia permission, it can receive glitch information more efficiently through use of the microphone. These methods include:
Assuming that neither cooperating site has microphone permission, this criterion ensures that the site that receives the covert signal must be visible, restricting the conditions under which the covert channel can be used. It makes it impossible for sites to communicate with each other using the covert channel while not visible.