This explainer proposes changes to the Storage Standard.
The core of the proposal is granting sites the ability to create multiple storage buckets, where the user agent may choose to delete each bucket independently of other buckets. By contrast, today’s user agents have a binary choice of either persisting or deleting all the data stored by a site.
This proposal also entails designating buckets as the recommended unit for managing existing storage policy (quota) and new storage policies, such as expiration and persistence. Each storage bucket can store data associated with established storage APIs such as IndexedDB and CacheStorage, so defining storage policies at the bucket level avoids the need to introduce policy-specification mechanisms for each individual API.
Allow web applications to evict slices of data
Allow web developers to specify eviction prioritization
Allow web applications to easily evict service workers without clearing data for the entire domain
Allow web developers to express performance, durability and other trade-off decisions on slices of data
Integration with storage mechanisms that don’t follow the same origin policy, such as cookies
Behavior in third-party contexts is deferred to future work, to keep this explainer manageable
Standardizing user agent behavior around storage eviction
Allowing users to have control over which data to evict
See Key Scenarios for more detail on use cases.
Storage eviction: by giving applications more control over prioritization and organization, buckets let them decide on storage trade-offs themselves when the user agent evicts data under low disk space, instead of losing all data at once.
Storage division: allowing applications to divide and organize their data. Whether it’s by user account on a shared device or by feature, web applications can choose to divide their data into slices via buckets however they would like.
Quota management: applications can manage quota deliberately by tracking quota usage per bucket and reserving quota before a write, preventing errors when a user agent does not have enough disk space. Applications can also proactively reduce low-priority writes or evict low-priority buckets.
These use cases aim to help the class of applications that store large amounts of user data and need to retrieve certain data without delay for a smooth user experience.
The Storage Standard already introduces buckets, but does not have an API for explicitly managing buckets.
This explainer introduces the navigator.storageBuckets.open() method. Applications are expected to use this method to deliberately set up buckets before using storage APIs. In the simplest form, usage looks as follows.
// Create a bucket for emails that are synchronized with the server.
const inboxBucket = await navigator.storageBuckets.open("inbox");
Buckets can be assigned different storage policies at creation time. The example below demonstrates some policies appropriate for data that is not (yet) synchronized with a server. The policies introduced by this proposal will be described in future sections.
const draftsBucket = await navigator.storageBuckets.open("drafts", {
durability: "strict", persisted: true });
The storage policies passed to open() are advisory. User agents may create buckets whose policies don’t match the requests. In most cases, the deviations only result in different performance characteristics. Applications can check a bucket’s policies and take appropriate action when a vital policy does not match the desired value.
const draftsBucket = await navigator.storageBuckets.open("drafts", {
durability: "strict", persisted: true });
if (await draftsBucket.persisted() !== true) {
showWarningButterBar("Your drafts may be lost if you run out of disk space!");
}
Each open() option that indicates a storage policy has a corresponding property on the bucket object. Examples for all policy-related properties will be shown in future sections.
Each storage bucket exposes entry points to the following storage APIs, for user agents that choose to implement them.
Each storage bucket can have an entry point to the Cache Storage API. The entry point matches WindowOrWorkerGlobalScope.caches in the Service Worker spec.
const inboxCache = await inboxBucket.caches.open("attachments");
const draftsCache = await draftsBucket.caches.open("attachments");
Each storage bucket can also have an entry point to the IndexedDB API. The entry point matches WindowOrWorkerGlobalScope.indexedDB in the IndexedDB spec.
const inboxDb = await new Promise((resolve, reject) => {
const request = inboxBucket.indexedDB.open("messages");
request.onupgradeneeded = () => { /* migration code */ };
request.onsuccess = () => resolve(request.result);
request.onerror = () => reject(request.error);
});
const draftsDb = await new Promise((resolve, reject) => {
const request = draftsBucket.indexedDB.open("messages");
request.onupgradeneeded = () => { /* migration code */ };
request.onsuccess = () => resolve(request.result);
request.onerror = () => reject(request.error);
});
Each storage bucket can also have an entry point to the File API. The entry points are asynchronous versions of the Blob constructor and the File constructor. File API objects created in a bucket are charged against the bucket’s quota.
const draftBlob = await draftsBucket.createBlob(
["Message text."], { type: "text/plain" });
const draftFile = await draftsBucket.createFile(
["Attachment data"], "attachment.txt",
{ type: "text/plain", lastModified: Date.now() });
TODO: Update the text here with the resolution of https://github.com/w3c/FileAPI/issues/157.
Each storage bucket can also have an entry point to the origin-private file system in the File System Access API. The entry point matches StorageManager.getDirectory() in the File System Access spec.
const inboxTestDir = await inboxBucket.getDirectory();
const draftsTestDir = await draftsBucket.getDirectory();
Each storage bucket can also have an entry point to the Web Locks API. The entry point matches NavigatorLocks.locks in the Web Locks API spec.
inboxBucket.locks.request("cache", lock => {
return new Promise((resolve, reject) => {
const tx = inboxDb.transaction("attachments", "readonly");
tx.oncomplete = resolve;
tx.onabort = e => reject(tx.error);
// use tx...
});
});
Storage buckets can be deleted. For example, the code below could be used to delete all the data stored on the device when the user logs out.
await navigator.storageBuckets.delete("user-1234");
A bucket’s data becomes inaccessible by the time the deletion operation completes. For example, when deleting a bucket, all its IndexedDB databases will be force-closed.
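For illustration, the sketch below reuses inboxDb from the earlier IndexedDB example; IndexedDB’s close event fires when a connection is closed abnormally, which is how a forced closure would surface to a page that still holds the connection.
// Reacting to a forced closure: the "close" event on an IDBDatabase fires
// when the connection is closed abnormally, e.g. because another tab called
// navigator.storageBuckets.delete() on the bucket.
inboxDb.addEventListener("close", () => {
  showWarningButterBar("Inbox storage was removed. The page will reload.");
  location.reload();
});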
An origin can get a list of all its storage buckets.
const bucketNames = await navigator.storageBuckets.keys();
console.log(bucketNames); // [ "drafts", "inbox" ]
This function is provided for debugging / logging purposes, and may have significant performance implications.
In order to support eviction at bucket granularity, user agents are expected to track quota usage for the data associated with each bucket. Applications can get this information using an API similar to StorageManager.estimate().
const inboxEstimate = await inboxBucket.estimate();
if (inboxEstimate.usage >= inboxEstimate.quota * 0.95) {
displayWarningButterBar("Go to settings and sync fewer days of email");
}
The storage specification currently endows each bucket with a mode, which can be persistent or best-effort. A persistent bucket will not be evicted without user notice when the user agent experiences storage pressure. User agents may also distinguish persistent buckets in their storage management UIs. For example, Chrome presents an additional warning when a user chooses to delete persistent storage.
This explainer proposes replacing the internal mode concept with a persistence policy, in the interest of arriving at a uniform model for bucket behaviors. We also propose the following API for operating on a bucket’s persistence policy.
A bucket’s persistence policy is specified at bucket creation time.
const draftsBucket = await navigator.storageBuckets.open("drafts", {
persisted: true });
The persistence policy can be queried at any time. The user agent may decline a persisted: true policy requested by open().
if (await draftsBucket.persisted() !== true) {
showButterBar("Your email drafts may be lost if you run out of disk space");
}
The application can attempt to make a bucket persistent. The user agent may decline the request.
if (await draftsBucket.persisted() !== true) {
const butterBar = showWarningButterBar(
"Your email drafts may be lost if you run out of disk space. Fix?");
butterBar.onFixClicked = async () => {
if (await draftsBucket.persist() === true) {
butterBar.hide();
showButterBar("Your drafts are now safe! \o/");
}
}
}
A bucket’s durability policy is a hint that helps the user agent trade off write performance against a reduced risk of data loss in the event of power failures.
The policy has the following values.
"strict"
buckets attempt to minimize the risk of data loss on power failure.
This may come at the cost of reduced performance, meaning that writes may take
longer to complete, might impact overall system performance, may consume more
battery power, and may wear out the storage device faster.
"relaxed"
buckets may “forget” writes that were completed in the last few
seconds, when a power loss occurs. In return, writing data to these buckets
may have better performance characteristics, and may allow a battery charge
to last longer, and may result in longer storage device lifetime. Also,
power failures will not lead to data corruption at a higher rate than for
"strict"
buckets.
In general, "strict"
buckets are intended to store data created by the user
that has not been synchronized with the application’s server. In this case, the
application would not be able to recover from power loss. By contrast,
"relaxed"
buckets are most suitable for caches that can be repopulated easily.
A bucket’s durability policy is specified at bucket creation time.
const draftsBucket = await navigator.storageBuckets.open("drafts", {
durability: "strict" });
The durability policy can be queried at any time. The user agent may not honor the policy requested by the open() call.
if (await draftsBucket.durability() !== "strict") {
showWarningButterBar("Your email drafts may be lost if you run out of power");
}
A bucket’s durability policy cannot be changed once the bucket is created.
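One consequence, sketched below under the assumption that the user agent honored the original "strict" request: reopening a bucket with a different durability hint has no effect on the stored policy.
// The "drafts" bucket was created with durability: "strict" above.
// Reopening it with a different hint does not change the policy.
const reopenedDrafts = await navigator.storageBuckets.open("drafts", {
    durability: "relaxed" });
console.assert(await reopenedDrafts.durability() === "strict");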
A bucket’s quota policy allows setting a per-bucket quota which can be used to place an upper bound on storage usage for each application feature. This ensures that a bug in an application feature won’t impact another feature’s ability to store data by eating up the entire origin’s quota.
A quota argument passed to open() is a hint, and user agents may choose not to follow it.
const logsBucket = await navigator.storageBuckets.open("logs", {
  quota: 20 * 1024 * 1024  // 20 MB
});
A bucket’s quota can be read using (await logsBucket.estimate()).quota.
See Getting a bucket’s quota usage for more details on querying quota.
A bucket’s expiration policy ensures that the bucket’s data isn’t available to the site after a certain expiration time. This policy offers a similar capability to the expiration attribute of HTTP cookies.
const twoWeeks = 14 * 24 * 60 * 60 * 1000;
const newsBucket = await navigator.storageBuckets.open("news", {
expires: Date.now() + twoWeeks });
A bucket’s expiration can be queried at any time.
if ((await newsBucket.expires()) === null) {
// This should not happen. The browser must always honor the expires policy.
showWarningButterBar("");
}
A bucket’s expiration can be changed as long as the bucket has not yet expired.
const oneDay = 24 * 60 * 60 * 1000;
if (await newsBucket.expires() - Date.now() <= oneDay) {
await refreshNews(newsBucket);
await newsBucket.setExpires(Date.now() + twoWeeks);
}
User agents may continue storing the data associated with expired buckets that are not accessed by applications. While an origin’s expired buckets remain stored, they count towards the origin’s quota. The following operations are guaranteed to cause the deletion of an expired bucket.
Calling navigator.storageBuckets.delete() with the name of an expired bucket will delete the bucket’s data.
Calling navigator.storageBuckets.open() with the name of an expired bucket will delete the expired bucket’s data, and return a newly created bucket.
Calling navigator.storageBuckets.keys() will delete the data of all the origin’s expired buckets.
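For example, the sketch below (reusing the news bucket name and the twoWeeks constant from above) recreates an expired bucket; the open() call discards the expired data and returns a fresh bucket with a new expiration.
// Reopening the name of an expired bucket deletes the old data and returns a
// newly created, empty bucket.
const freshNewsBucket = await navigator.storageBuckets.open("news", {
    expires: Date.now() + twoWeeks });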
The following existing APIs, if a user agent chooses to support them, will operate on the default bucket. These APIs create the default bucket on demand.
WindowOrWorkerGlobalScope.indexedDB in IndexedDB
WindowOrWorkerGlobalScope.caches in Service Worker
NavigatorStorage.storage in the Storage Standard
StorageManager.getDirectory in File System Access
NavigatorLocks.locks in Web Locks
The default bucket is created with the following options.
await navigator.storageBuckets.open("default", {
durability: "strict", persist: false });
TODO: Flesh out this section or move it into a separate explainer.
Each storage bucket can store service worker registrations. When a storage bucket is deleted, all service worker registrations that it contains are also evicted, using the same purge method used by Clear-Site-Data.
Applications are expected to associate a service worker registration with a storage bucket if the service worker would not be able to do its job without the data in the bucket. This way the user agent won’t have to spend system resources on waking up a service worker, only to find out that the service worker cannot fulfill the request given to it.
const inboxRegistration = await inboxBucket.serviceWorker.register(
"/inbox-sw.js", { scope: "/inbox" });
const draftsRegistration = await draftsBucket.serviceWorker.register(
"/drafts-sw.js", { scope: "/drafts" });
Storage buckets expose access to their service workers via a subset of the ServiceWorkerContainer methods.
Storage policies do not affect the service worker registrations of a given bucket.
TODO: Flesh out this section or move it into a separate explainer.
Clear-Site-Data currently supports deleting all DOM-accessible storage via the storage type. We propose adding the ability to clear individual buckets using a family of types of the form storage:bucket-name.
For example, receiving an HTTP response with the following header would cause the inbox bucket to be deleted.
Clear-Site-Data: "storage:inbox"
The above Clear-Site-Data storage bucket directive is functionally equivalent to calling delete().
navigator.storageBuckets.delete("inbox");
This is a list of scenarios that we hope buckets will help applications address.
Currently, during storage eviction, the browser deletes the entire origin’s data. This can break the user experience, and applications have no way to mitigate it. Buckets aim to give applications more control, letting them prioritize storage and decide on these trade-offs themselves under storage pressure.
In this example, a document client will use buckets to set different priorities for different types of documents under storage pressure. The recent bucket stores recently accessed documents that have already been uploaded to the server but are cached locally for easy user access. The drafts bucket stores drafts of documents created while offline, which have not yet been uploaded to the server and are irrecoverable if lost.
In this scenario, the drafts bucket is created with persisted: true and durability: "strict" to specify that it should be evicted last under storage pressure, and that all its data should survive power failures at the cost of more battery consumption. The recent bucket will be evicted first because of its low priority, since it contains data that can be recovered from the server.
TODO: Add image
const recentBucket = await navigator.storageBuckets.open("recent",
{ durability: "relaxed", persisted: false });
const draftsBucket = await navigator.storageBuckets.open("drafts",
{ durability: "strict", persisted: true });
Currently there isn’t any way for applications to divide data, making it hard to organize data or do partial cleanup on divisions of it. Buckets allow applications to divide their data into slices by feature, by account, or however they see fit.
In this example, we explore a scenario where an email client uses buckets to divide storage by user account on a shared device.
TODO: Add image
// Bucket creation per user account.
const user1Bucket = await navigator.storageBuckets.open(
    "userid111111_inbox");
const user2Bucket = await navigator.storageBuckets.open(
    "userid222222_inbox");
Data can be stored for each user via Storage APIs per bucket without commingled user data.
// Attachments for User 1
const attachments = await user1Bucket.caches.open("attachments");
// Messages for User 2
const req = user2Bucket.indexedDB.open("inbox", 1);
… // Retrieving cached inbox messages.
When Bob (user 2) has not logged in on the device for 30 days, the application can easily delete all of Bob’s storage while keeping Alice’s (user 1) data undisrupted.
await navigator.storageBuckets.delete("userid222222_inbox");
The following example shows how a video streaming service can use Buckets and its quota APIs to assign storage limits and reserve quota for individual buckets to better control their quota usage and reserve space for certain data.
In this example, an offline_videos bucket will store videos users have explicitly downloaded onto their device for offline access, and a recommendations bucket will store data for user recommendations. The service can set a quota on the recommendations bucket to ensure that it won’t eat up the entire origin’s quota and impact the offline_videos bucket’s ability to store data.
TODO: Add image.
const offlineVideosBucket = await navigator.storageBuckets.open(
"offline_videos",
{ durability: "strict", persisted: false });
const recommendationBucket = await navigator.storageBuckets.open(
"recommendations", {
quota: 20 * 1024 * 1024, // 20 MB
});
Query quota usage to check if a user can download more videos.
const offlineVideosEstimate = await offlineVideosBucket.estimate();
if (offlineVideosEstimate.usage >= offlineVideosEstimate.quota * 0.95) {
displayWarningButterBar("Delete old downloaded videos to download more");
}
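Going a step further, the service could check whether a download is likely to fit before starting it, rejecting the operation up front instead of failing mid-write. A minimal sketch; hasRoomFor, downloadVideo, videoSizeBytes, and videoUrl are hypothetical application code.
// Returns true if the bucket appears to have room for byteLength more bytes.
// Quota estimates may be fuzzed, so treat the result as a hint, not a
// guarantee, and still handle write errors.
async function hasRoomFor(bucket, byteLength) {
  const { usage, quota } = await bucket.estimate();
  return usage + byteLength <= quota;
}

if (await hasRoomFor(offlineVideosBucket, videoSizeBytes)) {
  await downloadVideo(offlineVideosBucket, videoUrl);
} else {
  displayWarningButterBar("Delete old downloaded videos to download more");
}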
We propose restricting developer-facing bucket names to strings made up of a very small set of characters.
Here are the characters we propose allowing.
a - z
0 - 9
- and _ can be used in the middle of the name, but not at the beginning
The restrictions were chosen with two goals in mind.
Avoid gotchas when including a bucket name in a Clear-Site-Data header. Embedding unrestricted strings in HTTP headers requires escaping. Developers unaware of the limitations (or pressed by deadlines) may use simple string interpolation instead of proper escaping. Errors may not be caught early on if initial bucket names happen to be safe to include in headers.
Give browsers the option to integrate bucket names in file names on the computer’s file system. This may help user agents avoid a database lookup in their open() implementations. We expect that opening buckets will end up on the critical path for loading modern sites, so we consider that giving user agents maximum freedom in the name of efficiency serves users, and that this outweighs developer convenience.
The desire for direct integration into file system names significantly constrains the character set. The constraints we are aware of are listed below.
. separates file names from extensions, and files ending in .exe are executable on Windows.
Bucket names are limited to 64 characters. This supports implementations that would directly integrate bucket names into file names, and makes it easy to reason about bucket lookup performance.
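For illustration, here is one plausible reading of these rules expressed as a client-side check; the exact grammar is up to the specification, so treat this regular expression as an assumption rather than normative.
// Hypothetical validation mirroring the proposed rules: lowercase letters
// and digits, with - and _ allowed anywhere except the first character, and
// a total length of at most 64 characters.
function isValidBucketName(name) {
  return /^[a-z0-9][a-z0-9_-]{0,63}$/.test(name);
}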
Here are the considerations used for storage policy naming.
persisted was chosen for consistency with StorageManager.persisted() in the Storage specification. The true / false values are also consistent with the definitions in the Storage specification.
durability was chosen for consistency with IDBTransaction.durability in IndexedDB. The values "strict" and "relaxed" have the same significance as the corresponding IndexedDB transaction hints.
Data “written” by a storage API conceptually goes through the following four stages.
1) The data starts out cached in an application-level buffer. If the current tab or the entire user agent crashes (due to a bug or resource exhaustion), the data is lost.
2) The data is flushed to an OS-level (usually kernel) buffer. At this point, the data will survive a tab or user agent crash. However, the data is lost if the entire operating system crashes (blue screen of death on Windows, kernel panic on UNIX systems).
3) The data is flushed to a storage device (HDD / SSD) buffer. At this point, the data will survive an OS crash. However, if the computer experiences a power failure, the data may be lost. Some storage devices have batteries with sufficient capacity to write the data in the buffers in case of power failure, but this is not generally guaranteed. Furthermore, most modern portable computers (laptops, tablets, mobile phones) have firmware / OS logic that mitigates power failures by suspending the computer’s activity and writing all the data in volatile buffers.
4) The data is persisted to the storage medium (disk platters for HDDs, non-volatile memory cells for SSDs). At this point, the data will survive a computer power failure.
Storage systems may differ in how they handle a write operation, such as committing an IndexedDB transaction, along the following axes.
1) The stage at which the system reports that the write has completed. For example, many database systems offer the option of considering that a transaction has committed when the data is flushed to OS-level buffers.
2) The stage to which the data is pushed after the write has completed. In the example above, after a database system considers a transaction to have completed, it may ask the OS to flush the data to the storage device, or it can let the data stay in OS-level buffers until the OS decides to evict the data to the storage device.
Combining the stage at which a write is reported as completed with the stage to which the data is pushed results in 12 possible behaviors for storing data. The behaviors with a higher risk of data loss offer better performance along a few axes, such as write speed, battery usage, and storage medium wear.
The storage buckets API narrows down this complex space to only two options, which are packaged as possible values for the durability policy.
The strict durability policy requires that the data is persisted to the storage medium before writes are considered to have completed. This policy results in lower performance, but guarantees that data will survive power losses. Therefore, this policy is the right choice for user data that cannot be recovered from an alternative source in the event of a power failure.
The relaxed durability policy requires that the data is flushed to OS-level buffers before writes are considered to have completed, and allows the data to remain in OS-level buffers for an indefinite amount of time. This policy results in better performance, at the cost of risking data loss. For this reason, this policy is intended for data that can be easily obtained from an alternative source, such as cached versions of data stored on the application’s server.
We drew inspiration from SQLite and LevelDB, the two storage libraries used by the browsers that are popular at the time of this writing. The following facts informed our decision process.
SQLite allows choosing between flushing data to the operating system (similar to the relaxed policy) and flushing it to the storage device or media (similar to the strict policy) via the synchronous PRAGMA. This setting’s behavior is closely connected to whether a database uses Write-Ahead Logging (WAL), which must be decided when the database is opened via the journal_mode PRAGMA.
SQLite allows choosing between flushing to the storage device and flushing to the media via the fullfsync PRAGMA and the checkpoint_fullfsync PRAGMA.
LevelDB allows choosing between the relaxed policy and the strict policy at transaction level via the WriteOptions.sync option.
A default quota will be assigned to every storage bucket that is created with open() without a quota policy. The behavior of the default bucket quota will be user agent specific.
Chrome plans to have the default quota for a storage bucket match the origin quota. It may seem unintuitive for a storage bucket’s quota to be so large, disconnecting it further from available disk space. However, the Chrome team thinks that anything under 100% of the origin quota would become a constraint for developers, disincentivizing the use of buckets. The Chrome team thinks developers should be able to use all available quota for an origin in one storage bucket.
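Under that plan, an application would observe something like the sketch below; the exact numbers are user agent specific and may be rounded or fuzzed.
// A bucket created without a quota policy would report the origin's quota.
const cacheBucket = await navigator.storageBuckets.open("cache");
const bucketEstimate = await cacheBucket.estimate();
const originEstimate = await navigator.storage.estimate();
console.log(bucketEstimate.quota === originEstimate.quota); // true, per Chrome's plan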
Instead of exposing the storage buckets API off of navigator.storageBuckets, we could have exposed it off of navigator.storage.buckets. The examples below illustrate this alternative.
const inboxBucket = await navigator.storage.buckets.open("inbox");
const draftsBucket = await navigator.storage.buckets.open("drafts", {
durability: "strict", persisted: true });
await navigator.storage.buckets.delete("inbox");
await navigator.storage.buckets.delete("drafts");
navigator.storage.buckets was rejected to avoid developer confusion around nesting and the default bucket. Specifically, some navigator.storage methods (estimate(), persist() and persisted()) operate on the default bucket, so navigator.storage seems connected to the default bucket. navigator.storage.buckets would be a property on navigator.storage, and it may be confusing that it refers to all the origin’s buckets, not to the default bucket.
navigator.storageBuckets.open() always attempts to create a bucket with the given name if it does not exist. This is different from storage APIs on most systems, where developers can express the three separate intents below.
1) Open the named bucket if it exists, create it if it does not exist.
2) Open the named bucket if it exists, fail if it does not exist.
3) Fail if the named bucket exists, create it if it does not exist.
Allowing all three intents to be expressed could have been accomplished by having separate methods (openOrCreate(), open(), create()), as illustrated in the example below.
// Creates the "inbox" bucket if does not already exist.
const inboxBucket = await navigator.storageBuckets.openOrCreate("inbox");
// Fails if the "inbox" bucket does not already exist.
const inboxBucket = await navigator.storageBuckets.open("inbox");
// Fails if the "inbox" bucket already exists.
const inboxBucket = await navigator.storageBuckets.create("inbox");
Alternatively, we could have settled for one open() method with options, as shown below.
// By default, creates the "inbox" bucket if it does not already exist.
const inboxBucket = await navigator.storageBuckets.open("inbox");
// Fails if the "inbox" bucket does not already exist.
const inboxBucket = await navigator.storageBuckets.open("inbox", {
failIfNotExist: true });
// Fails if the "inbox" bucket already exists.
const inboxBucket = await navigator.storageBuckets.open("inbox", {
failIfExists: true });
We think that exposing an open() operation that combines the intent to open or create is the best way to support the model where each bucket can be evicted by the browser independently of other buckets. We want applications to be written assuming that each time they attempt to open a bucket, they may be creating the bucket from scratch.
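For illustration, here is a minimal sketch of code written with that assumption: the schema setup in onupgradeneeded runs both on first use and whenever the user agent has evicted and re-created the bucket.
// open() may return a brand-new bucket (first run, or after eviction), so
// the schema must be re-creatable at any time.
const inboxBucket = await navigator.storageBuckets.open("inbox");
const inboxDb = await new Promise((resolve, reject) => {
  const request = inboxBucket.indexedDB.open("messages");
  request.onupgradeneeded = () => {
    request.result.createObjectStore("messages", { keyPath: "id" });
  };
  request.onsuccess = () => resolve(request.result);
  request.onerror = () => reject(request.error);
});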
Instead of open, this method could have been called openOrCreate, to clearly describe that a storage bucket may be created if it does not exist yet. However, the open naming is consistent and more recognizable across the storage APIs, including CacheStorage, which matches this behavior of auto-creation if the Cache does not exist.
Instead of exposing entry points for each API on the bucket object, we could add integration points for buckets to each storage API. Examples below.
const inboxCache = await caches.open("attachments", { bucket: "inbox" });
const inboxDb = await new Promise((resolve, reject) => {
  const request = indexedDB.open("messages", { bucket: "inbox" });
request.onupgradeneeded = () => { /* migration code */ };
request.onsuccess = () => resolve(request.result);
request.onerror = () => reject(request.error);
});
const inboxBlob = new Blob(
["Message text."], { type: "text/plain", bucket: "inbox" });
const inboxFile = new File(
["Attachment data"], "attachment.txt",
{ type: "text/plain", lastModified: Date.now(), bucket: "inbox" });
const inboxTestDir = await self.getDirectory({ bucket: "inbox" });
const inboxRegistration = await navigator.serviceWorker.register(
"/inbox-sw.js", { scope: "/inbox", bucket: "inbox" });
The API for creating buckets and specifying / querying policies wouldn’t change.
This alternative was rejected because of worse ergonomics for web developers. Web developers would have to specify bucket names in the options of each storage API, creating more opportunities for errors, either by forgetting to specify a name or by having typos in the bucket name.
Having entry points to each storage API on the bucket also makes replacing the default bucket in JS easy.
const inboxBucket = await navigator.storageBuckets.open("inbox");
// Replace default bucket with inboxBucket for IndexedDB.
window.indexedDB = inboxBucket.indexedDB;
window.caches = inboxBucket.caches;
navigator.storage.estimate = inboxBucket.estimate.bind(inboxBucket);
// ...
Bucket names are currently restricted to characters that are safe for HTTP headers and file systems. An alternative would have been to allow all safe characters allowed for HTTP headers defined by RFC 7230. This would allow for the following characters.
a - z
A - Z
0 - 9
! # $ % & ' * + - . ^ _ ` ~
(The remaining visible ASCII characters, (),/:;<=>?@[\]{}, are reserved as delimiters in HTTP header syntax and would stay disallowed.)
Allowing these characters would mean that user agents would need to do expensive database lookups for bucket names before transactions. Although this restriction is less flexible for developers, we think the performance benefits serve users better.
Bucket names are currently limited to 64 characters. We could remove this limitation, and rely on the ecosystem to evolve its own rules. User agents would most likely provide some guidance that would evolve based on developer needs.
The alternative of not having any length restriction at all was rejected because our previous experience strongly suggests the need for guardrails.
IndexedDB does not (at the time of this writing) have limitations on key sizes, and some developers did try using very large (multi-megabyte) keys. IndexedDB implementations use these keys as primary keys in a database, and large primary keys have led to out-of-memory crashes and performance cliffs.
In summary, in the absence of immediate feedback at the prototyping stage, developers will use very large identifiers, even if that may result in a bad user experience.
Bucket names are currently limited to 64 characters. This restriction may inconvenience developers, especially if deep hierarchies are in use. For example, user123456789_inbox_label123456789 has 34 characters, which uses more than 50% of the length budget. Relaxing the name length limit to 1024 characters would remove developer concerns.
This alternative was rejected in the interest of avoiding unnecessary complexity. We are aware of implementation techniques for handling 1024-character bucket names efficiently, at the cost of some extra complexity in the user agent. This extra complexity still costs the user battery power (more code is being fetched and executed), as well as data transfer and storage (increased binary size). It will be far easier to increase the length limit (64 -> 1024) down the line than it would be to reduce it (1024 -> 64), so we are proposing the stricter limit until we see concrete developer needs.
When reasoning about the potential increase in complexity, we considered the following implementation alternatives.
Rely on the underlying storage to handle the 1024-character keys. SQLite, LevelDB, and some file systems can handle 1024-character names, but all the major underlying systems we know of (file systems, SQLite, LevelDB) work better with small names.
Use a fast string hash, such as xxHash to map potentially long bucket names to very short hashed names. The hashed names are guaranteed to be very short, for example xxHash produces 8/16-byte hashes. However, fast string hashes may produce collisions, and allowing collisions will result in more complex storage code.
Use a strong cryptographic hash, such as SHA-256. Compared to fast string hashes, strong cryptographic hashes are larger (SHA-256 produces 32-byte outputs) and significantly slower. In return, cryptographic hashes guarantee vanishingly small collision rates, so the implementation does not need to handle collisions. We expect this to be the preferred alternative for handling long bucket names, because the overheads are small compared to the reduced complexity. Modern computers (both desktop and mobile) have hardware accelerators for computing cryptographic hashes, and hash output sizes are within the range of key sizes that yield good database performance.
Relaxing the limit to 1024 characters would also make it less likely that bucket names could be integrated directly into file system names, as discussed above.
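For concreteness, here is a sketch of the cryptographic-hash approach; a user agent would do this in native code, but the shape of the computation is the same as this Web Crypto version.
// Maps an arbitrarily long bucket name to a fixed-length, file-system-safe
// directory name via SHA-256, so name collisions need not be handled.
async function bucketDirectoryName(bucketName) {
  const bytes = new TextEncoder().encode(bucketName);
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return [...new Uint8Array(digest)]
      .map(b => b.toString(16).padStart(2, "0"))
      .join("");
}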
Because buckets are expected to be named using programmer-friendly identifiers, similar to variable names, we could have a bucket title option, which in contrast would be a user-friendly description of the bucket. Titles could be useful for user agents that want to offer the ability to delete individual buckets in their storage management UI. These user agents could display bucket titles when showing buckets to their users.
title was chosen for consistency with the HTML title element.
Bucket titles could present some subtleties for applications that support multiple languages. Specifically, a bucket’s title would presumably reflect the user’s preferred language at the time the bucket is created. In this alternative, the API surface would not allow changing a bucket’s title, so already-created buckets would not reflect language preference changes.
This alternative was rejected because including author-provided strings in user agent UI may introduce a11y and i18n issues, as well as non-trivial security and privacy implications, which may deter some user agents from using the title as intended. For example, user agents that intend to incorporate the title need to mitigate against misleading values such as "You have a virus! Go to www.evil.com for a cleanup". Currently, any use of buckets is intended to be unnoticeable by end users.
The title property could instead be named description. This would be consistent with the Web App Manifest description. We preferred title to description because we think that title invites developers to be brief (1-3 words), whereas the latter appears to ask for a full sentence. We expect that short strings will be better suited for inclusion in user management UIs.
The bucket title property could allow a dictionary instead of a string, where the keys are valid values for the lang attribute in the HTML specification, and the values are localized user-facing strings.
const draftsBucket = await navigator.storageBuckets.open("drafts", {
durability: "strict", persisted: true,
title: { en: "Drafts", es: "Borradores", ja: "下書き" }});
The function used to enumerate buckets returns a sorted array of bucket names. It could have been specified to return an asynchronous iterator.
const bucketNames = [];
for await (let name of navigator.storageBuckets.keys())
bucketNames.push(name);
console.log(bucketNames); // [ "drafts", "inbox" ]
The main benefit of using an asynchronous iterator would be the ability to scale to a very large number of buckets.
This alternative was rejected because we expect that the number of buckets created by origins will not be large enough to run into memory limitations. In order to support per-bucket eviction, user agents will likely need to track all of an origin’s bucket names anyway, so returning them in a single array is not expected to be expensive.
The durability bucket property currently supports the storage policies "relaxed" and "strict". Instead of the "strict" policy, we could have provided separate policies for flushing the data to storage device buffers ("device") and for flushing the data to storage media ("media").
Under this alternative, advanced applications would take advantage of the extra flexibility for increased performance, while simpler applications would use "media", which is equivalent to "strict". The example below outlines the code needed to use the extra storage policy, and is suggestive of the additional complexity introduced by this alternative.
// Writes to this bucket are flushed to the storage device. The data here may be
// lost in case of power failure.
//
// Each change (keystroke) in a draft is saved here.
const immediateDraftsBucket = await navigator.storageBuckets.open(
"device-drafts",
{ durability: "device", persisted: true });
const immediateDraftsDb = await new Promise((resolve, reject) => {
  const request = immediateDraftsBucket.indexedDB.open("messages");
request.onupgradeneeded = () => { /* migration code */ };
request.onsuccess = () => resolve(request.result);
request.onerror = () => reject(request.error);
});
// Writes to this bucket are flushed to the storage media. The data here will
// not be lost on power failure.
//
// Draft changes are batched every minute and saved here. Writing to this bucket
// on every keystroke is too much of a battery drain.
const draftsBucket = await navigator.storageBuckets.open(
"media-drafts", { durability: "media", persisted: true });
const draftsDb = await new Promise((resolve, reject) => {
  const request = draftsBucket.indexedDB.open("messages");
request.onupgradeneeded = () => { /* migration code */ };
request.onsuccess = () => resolve(request.result);
request.onerror = () => reject(request.error);
});
// Accumulates changes that have been stored in immediateDraftsDb but not in
// draftsDb.
let batchedDrafts = [];
// Called on every draft change, which may happen on every user key stroke or
// mouse click.
async function saveDraft(draft) {
batchedDrafts.push(draft);
const transaction = immediateDraftsDb.transaction("messages", "readwrite");
const messageStore = transaction.objectStore("messages");
await new Promise((resolve, reject) => {
messageStore.put(draft);
transaction.commit();
transaction.oncomplete = resolve;
transaction.onerror = () => reject(transaction.error);
});
}
// Called every minute to write new draft changes to the persistent database.
async function flushDrafts() {
if (batchedDrafts.length === 0)
return;
// Swap the drafts queue to avoid double flushing.
const drafts = batchedDrafts;
batchedDrafts = [];
EliminateRedundantDraftChanges(drafts);
const transaction = draftsDb.transaction("messages", "readwrite");
const messageStore = transaction.objectStore("messages");
await new Promise((resolve, reject) => {
for (let draft of drafts)
messageStore.put(draft);
transaction.commit();
transaction.oncomplete = resolve;
transaction.onerror = () => reject(transaction.error);
});
}
setInterval(flushDrafts, 60 * 1000);
The alternative was rejected because we considered that the extra complexity is not worth the performance benefits. Specifically, buckets using the "device" policy would allow the storage device to batch writes in its internal buffers. Under our proposed design, buckets that would have used the "device" policy will be indistinguishable from buckets that would have used the "media" policy: writes to these buckets will be flushed directly to storage media.
We are foregoing the following advantages.
Writes that must go straight to the storage media reduce the flexibility of the on-device I/O scheduler. This may impact the performance of seemingly unrelated I/O requests.
Writing to the storage media more often will wear out the media, reducing the life span of the storage device. This effect may be more significant in the lower end of the market, where devices have lower endurance specifications.
Writing to the storage media more often will consume extra battery power on mobile computers.
We were also influenced by the fact that most operating systems that are currently popular do not support the distinction between the two storage policies. The points below outline the current state of OS support.
On Linux-based systems, which include Android and ChromeOS, fdatasync() flushes all the changes in a file from OS-level buffers to the storage device media, which is consistent with the "media" storage policy. fsync() also flushes file metadata that is not strictly needed to read the data, such as file modification and access times.
On Windows, the CreateFile() flag FILE_FLAG_WRITE_THROUGH is fairly similar to the "strict" durability policy, as it requires that all writes are flushed to the storage device.
FlushFileBuffers() flushes a file’s changes to the storage device. The function flushes to media in Windows 8+, which is consistent with the "media" policy. However, on Windows 7 and below, variations in device drivers may cause the function to only flush to storage device buffers, which is consistent with the "device" policy.
On Darwin-based systems, which include macOS and iOS, fsync() flushes a file’s changes to the storage device buffers, which is consistent with the "device" storage policy. This behavior is compliant with the POSIX 2018 specification for fsync(), which only demands that the data be “transferred to the storage device”. Flushing changes to the storage media, as required by the "media" policy, is accomplished by passing the F_FULLFSYNC flag to fcntl().
The durability bucket property currently supports the storage policies "relaxed" and "strict". Instead of the "relaxed" policy, we could have provided separate policies for completing the write while the data is in application-level buffers ("app") and for flushing the data to OS-level buffers ("kernel").
At a high level, this alternative is about a tradeoff between offering applications more flexibility, in the name of performance, and offering a simpler storage model. So, the analysis here should be similar to the discussion around durability options for flushing to the storage device vs media, covered in the section above. However, this section is significantly shorter, because the performance benefits are much less significant.
This section does not have sample code, because we could not find any use case that would benefit from the separation proposed here. Instead, we’ll discuss the performance benefits offered by such a separation, and their relevance.
Writes to buckets with the "app" storage policy would be considered complete as soon as the data has been serialized into an application-level buffer inside the user agent. Not having to wait for the data to be flushed to OS-level buffers has the following advantages.
Flushing the data to the OS involves system calls, which perform CPU context switches. The context switches have non-trivial costs.
Flushing the data to the OS may require copying the data buffers, which can be expensive when writing large amounts of data.
This alternative was rejected because the advantages above are considered mostly theoretical, for the following reasons.
All usage of storage APIs may require system calls. Modern user agents have multi-process architectures. Most storage API features, such as transactions, require IPC between a process running the site’s JavaScript and a coordinating process. IPC requires system calls.
Data buffer copies may be avoided. All modern operating systems have shared memory features for avoiding large data copies during IPC. Modern operating systems also have direct I/O features that either allow the user agent to serialize data directly into an OS-level buffer, or allow applications to surrender ownership of buffers to the OS.
This decision was influenced by the fact that application-level buffers are not used in the built-in SQLite VFS implementations, in the built-in LevelDB Env implementations, or in Chrome’s file abstraction. The main reason this option was considered is that fopen() in the POSIX standard is specified to use an application-level buffer, which may be flushed to an OS-level buffer using fflush().
Newly created buckets receive the "relaxed" storage policy, unless a different durability option is passed to open(). The default storage policy could have been "strict". This alternative was rejected for the two reasons outlined below.
First, we expect that web applications will mostly use client-side storage to cache data whose authoritative copy is stored on the application server. This use case is best served by the "relaxed" policy. The example below shows that the alternative leads to bulkier code for expressing the common case.
const inboxBucket = await navigator.storageBuckets.open("inbox", {
durability: "relaxed" });
const draftsBucket = await navigator.storageBuckets.open("drafts", {
persisted: true });
Second, we think that the current proposal will make it easier to reason about correctness in a code review. Reviewers can assume that code that explicitly mentions durability: "strict" is operating on data that must survive power outages, and can focus on recovery logic for this code. Under the alternative, reviewers encountering a bucket without a durability option would have to check whether the author forgot to specify durability: "relaxed", in case the bucket stores data that cannot be recovered from another source.
TODO: Explain that durability guarantees apply to individual writes, but applications need to handle inconsistencies across storage APIs (Cache Storage and IndexedDB). We don’t offer two-phase commit across storage APIs.
Once a bucket is created, the value of its durability policy is fixed. This is inconsistent with the persisted policy, which can be changed after the bucket is created via the persist() method.
If storage buckets allowed changing the durability policy, applications could switch policies based on dynamically changing conditions. The email client in our example might want to allow the user to switch between storing email drafts with "strict" durability and storing the drafts with "relaxed" durability.
const draftsBucket = await navigator.storageBuckets.open("drafts", {
durability: "strict", persisted: true });
// Called when the user switches a "drafts" durability toggle.
async function setDraftsDurability(durability) {
// If durability can change, it definitely needs to be exposed using an async
// function.
const currentDurability = await draftsBucket.durability();
if (currentDurability === durability)
return;
await draftsBucket.setDurability(durability);
}
In the world proposed by this explainer, the email client could offer the same flexibility to the user at the cost of extra complexity. The email client would create two buckets, write drafts to the bucket that reflects the user’s current preferences, and read drafts from both buckets.
const draftsBuckets = {};
draftsBuckets.strict = await navigator.storageBuckets.open(
"drafts-durable", { durability: "strict", persisted: true });
draftsBuckets.relaxed = await navigator.storageBuckets.open(
"drafts-fast", { durability: "relaxed", persisted: true });
const draftsDb = {};
for (let durability of ["relaxed", "strict"]) {
  draftsDb[durability] = await new Promise((resolve, reject) => {
    const request = draftsBuckets[durability].indexedDB.open("messages");
    request.onupgradeneeded = () => { /* migration code */ };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}
// Reflects the user's current "drafts" durability preference.
let draftsDurability = "strict";
// Called on every draft change, which may happen on every user key stroke or
// mouse click. Writes go to the database matching the user's preference.
async function saveDraft(draft) {
  const transaction =
      draftsDb[draftsDurability].transaction("messages", "readwrite");
  const messageStore = transaction.objectStore("messages");
  await new Promise((resolve, reject) => {
    messageStore.put(draft);
    transaction.commit();
    transaction.oncomplete = resolve;
    transaction.onerror = () => reject(transaction.error);
  });
}
// Note: Code that reads drafts needs to operate on both databases.
This alternative was rejected because of concerns that it would significantly reduce the degrees of freedom of user agent implementations, which could result in reduced performance for all applications.
For example, let’s consider a user agent that only targets Linux-based systems and relies on SQLite to implement storage APIs. This user agent could implement "strict" buckets using SQLite databases (or one consolidated database per origin) with PRAGMA synchronous set to FULL and PRAGMA journal_mode set to DELETE, in order to take advantage of SQLite’s F2FS fast path. "relaxed" buckets would use SQLite databases with PRAGMA synchronous set to NORMAL and PRAGMA journal_mode set to WAL, which would take advantage of WAL mode.
The example above illustrates that user agents may be able to obtain better performance if they can place data with different durability policies in entirely different underlying stores. The ability to change a bucket’s durability policy would significantly undermine this flexibility.
Bucket expiration times are currently represented as the number of milliseconds elapsed since January 1, 1970 UTC. This representation is consistent with the Date.now() JavaScript API.
Other options considered are:
The current design was preferred over alternatives because it met the TAG’s guidelines around time measurement, and it is consistent with the following APIs.
Bucket storage policies are currently accessible using methods that return Promises. This information could have been exposed using (synchronous) properties instead.
if (draftsBucket.persisted !== true) {
showButterBar("Your email drafts may be lost if you run out of disk space");
}
if (draftsBucket.durability !== "strict") {
showButterBar("Your email drafts may be lost if you run out of power");
}
Exposing synchronous access to the persistence policy was rejected due to complications stemming from the ability to change a bucket’s persistence policy after the bucket is created. Our options for handling changes would be to either say that a bucket’s persisted property reflects the policy at the time the bucket was opened, or say that the persisted property magically changes when the policy changes. Both options seem confusing for developers.
Exposing synchronous access to the durability policy was rejected in the interest of maximizing the consistency of the API across storage policies.
navigator.storageBuckets.delete() force-closes the deleted bucket and immediately starts deleting the data. The promise it returns is resolved when all the data is deleted.
We could have had delete() queue a deletion request and return immediately, like indexedDB.deleteDatabase().
const inboxCache = await inboxBucket.caches.open("attachments");
await navigator.storageBuckets.delete("inbox");
// Completes successfully.
const attachment = await inboxCache.match("/attachments/3");
We could also have had delete() wait until all the uses of a bucket are done before returning.
const inboxCache = await inboxBucket.caches.open("attachments");
// Completes after the tab is closed, because inboxCache is currently open.
await navigator.storageBuckets.delete("inbox");
This alternative was rejected because it would break the core use case of multiple user accounts. In this case, the data associated with each account is stored in one (or a few) bucket. Logging out of one account is implemented by deleting the buckets associated with that account. Under this alternative, the application code for logging out would need to coordinate across all tabs that may be using the same account, and close all connections to a bucket while logging out.
const user2Bucket = await navigator.storageBuckets.open(
"userid222222_inbox");
const user2LogoutChannel = new BroadcastChannel("userid222222_logout");
// Attachments for User 2
const user2Attachments = await user2Bucket.caches.open("attachments");
user2LogoutChannel.addEventListener("message", async () => {
user2Attachments.close();
}, {once: true});
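The logout flow itself would then need code along the lines of the sketch below; the fixed delay stands in for real acknowledgment tracking, which robust code would implement instead.
// The tab performing the logout asks other tabs to close their handles to
// the bucket, waits for them to do so, then deletes the bucket.
const logoutChannel = new BroadcastChannel("userid222222_logout");
logoutChannel.postMessage("logout");
await new Promise(resolve => setTimeout(resolve, 1000));
await navigator.storageBuckets.delete("userid222222_inbox");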
We consider this alternative to be more confusing for developers than the proposed behavior. The main benefit would be possibly simplifying user agents’ implementation of buckets, by avoiding the need to implement force-closing in storage APIs. We don’t expect to see much simplification in practice, because user agents need to implement force-closing to handle storage corruption.
The integration points listed in this explainer intentionally exclude the Web Storage API. We could have included localStorage on the list of APIs that buckets offer.
const settingsBucket = await navigator.storageBuckets.open("settings");
const emailsPerPage = settingsBucket.localStorage.getItem('emailsPerPage');
This alternative was rejected because localStorage does not follow the same durability policies, and developers would need to manage a separate quota system just for localStorage.
The DOM storage standard, and by extension localStorage, does not dictate any durability policy. However, because the standard exposes a synchronous API to the main thread, it forces some choices on all implementations that want to give a good user experience. The choices are (1) using a RAM-backed cache (so getItem doesn’t block on disk), and (2) committing writes asynchronously (so setItem doesn’t block on disk). Therefore localStorage does not have any stance on durability, and user agents would not be able to honor any durability policy set here.
Web developers would also need to keep track of a separate quota just for localStorage and its RAM consumption. Supporting multiple localStorage instances per origin could also have performance implications. Each bucket could have a smaller localStorage quota than the existing quota limit, but it may be undesirable for web developers to have less quota in this way.
At first pass, the synchronous read API of localStorage seems like it could be a problem. However, in this case user agents do not need to block the main thread to read the localStorage contents. Instead, implementations can read the localStorage data asynchronously while processing navigator.storageBuckets.open().
Many thanks for valuable feedback and advice from:
The Storage Buckets API has been discussed in the following places.