This spec enables directory uploading by allowing a developer to read directory contents (files and sub-directories) asynchronously and to identify the directory structure. This specification proposes changes to [[!HTML]] (in particular, additional API surface on HTMLInputElement, along with an additional attribute on the <input ...> element) as well as a new specification called Directory Upload which brings directories to the web.

This spec is currently being proposed within the WICG (Web Incubator Community Group) with the expectation that there may be changes, and that aspects of this proposal will eventually be part of the [[!HTML]] and [[!FileAPI]] standards.

Introduction

Background

Websites are currently able to provide functionality for uploading files by using <input type="file" multiple> and drag and drop. However, there currently is no standard solution to handle directories and cases that involve a mix of files and directories. This spec will provide the necessary mechanisms to enable directory uploading.

Scenarios

The following scenarios are in scope for this spec:

  1. Users are able to select one or more directories using a file dialog
  2. Users are able to select a combination of directories and files using a file dialog
  3. Users are able to select one or more directories via drag and drop
  4. Users are able to select a combination of directories and files via drag and drop

To address scenarios 1 and 2, the HTML5 input tag needs a new attribute that allows a user to select directories as well as files. In addition, the HTMLInputElement requires a method for retrieving all the files and directories chosen so they can be traversed. For scenarios 3 and 4, DataTransfer needs a method for retrieving all the files and directories dropped so they can be traversed. To enable these scenarios, a Directory interface needs to be defined that allows the following: retrieving a list of its contents, reporting its relative path, and reporting its name.

Model

Directory, File, Root, and Path

In this specification, a directory is a logical organizing storage unit with a distinct name which contains files and/or one or more subdirectory units (which are themselves directories). A directory corresponds to the same concept from the underlying OS filesystem, which is also called a folder on some underlying OS filesystems. A file is durable binary data retrieved from the underlying OS filesystem, and can be programmatically manipulated by web applications as defined in the [[!FileAPI]]. In this specification, files and directories which are selected by the user are represented by a temporary directory tree. Files and directories selected by the user have a position on the temporary directory tree that can be represented by a string, called a path. The path string uses the U+002F SOLIDUS character ("/") to denote directory hierarchy within the temporary directory tree. Files, like directories, have distinct names and paths which identify them on a temporary directory tree.

The top-most containing directory, which contains all directories and files, if any exist, is called the root directory, and has the following properties:

The Temporary Directory Tree

In this specification, a temporary directory tree consists of directories and files, along with the affiliated path from root, for each directory and file that the user has selected from the underlying OS filesystem; each such directory or file must correspond to a node in the temporary directory tree, with the top-most directory being the directory from which the user has made selections of files and directories and an immediate child of the root of the temporary directory tree.

The present working directory is the node on the temporary directory tree that represents the current directory from which an operation is taking place.

A path to a given directory or file from the root directory is said to be a path from root and starts with a leading "/" U+002F SOLIDUS character and ends with the name of the file or directory.

          /**
            path from root of the 'jazz' directory, with 'music' 
            being the directory from which the user has made 
            selections of files and directories.
          **/

          console.log(currentSelection.path);  // output is "/music/genres/jazz"; 

          console.log(currentSelection.name);  // output is "jazz"
        

A child is a node on the temporary directory tree with a position that is below a given position on the temporary directory tree for a given directory; a subdirectory of the present working directory is a child of the present working directory; a subdirectory of that subdirectory is also a child of the present working directory. An immediate child corresponds to a node on the temporary directory tree that is either a subdirectory that is the next position on the path, or a file that is next on the path. A parent is a directory that contains a file or another directory; in particular, the directory that contains the present working directory is its parent. The root directory is the parent of all other nodes in the temporary directory tree.
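
The relationships defined above can be modelled with a small in-memory tree. The sketch below is illustrative only: makeDir, makeFile, pathFromRoot, immediateChildren, and allChildren are hypothetical helper names that are not part of this specification, and the root directory is assumed to have an empty name so that every path from root starts with "/".

```javascript
// Hypothetical in-memory model of a temporary directory tree.
// None of these helpers are defined by this specification.
function makeDir(name, parent) {
  const dir = { kind: 'directory', name: name, parent: parent || null, children: [] };
  if (parent) parent.children.push(dir);
  return dir;
}

function makeFile(name, parent) {
  const file = { kind: 'file', name: name, parent: parent };
  parent.children.push(file);
  return file;
}

// Path from root: a leading "/" followed by each name down to the node itself.
function pathFromRoot(node) {
  const names = [];
  for (let n = node; n && n.parent; n = n.parent) names.unshift(n.name);
  return '/' + names.join('/');
}

// Immediate children are exactly one level below the given directory.
function immediateChildren(dir) {
  return dir.children.slice();
}

// Children include every node below the directory, at any depth.
function allChildren(dir) {
  return dir.children.reduce(function (acc, child) {
    acc.push(child);
    return child.kind === 'directory' ? acc.concat(allChildren(child)) : acc;
  }, []);
}

const root = makeDir('');              // assumption: the root has an empty name
const music = makeDir('music', root);  // directory the user made selections from
const genres = makeDir('genres', music);
const jazz = makeDir('jazz', genres);
const track = makeFile('milesDavis.mp3', jazz);
```

With this tree, pathFromRoot(jazz) evaluates to "/music/genres/jazz", immediateChildren(music) contains only the genres directory, and allChildren(music) contains genres, jazz, and the file below them.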

In this specification, stat info is said to be information that includes any of the following:

When this specification says to add a node n to a sequence s from the temporary directory tree this means to add a reference to n to s along with any stat info.

Returning Files and Directories

When this specification says to return a file or directory sequence promise on a given present working directory, the user agent must run the following steps:

  1. If the result of running the "is settings object a secure context" algorithm with the incumbent settings object is "Not Secure", return a new promise rejected with a "SecurityError" exception.

  2. Let p be a new promise and s a sequence, initially set to an empty sequence.

  3. Run the following steps asynchronously:

    1. If the present working directory has no files or directories, resolve p with s, which is still an empty sequence.

    2. Otherwise, for each immediate child of the present working directory, add a node to s as a File or Directory object, depending on the type of the immediate child.

    3. If there are any problems retrieving any immediate child, reject p with an InvalidStateError.

      Promise rejection occurs for any problem crawling the present working directory.

    4. Once all the immediate child nodes of the present working directory have been added to s as a File or Directory, resolve p with s.

  4. Return p.

A common method on both the HTMLInputElement and on the Directory interface called getFilesAndDirectories() returns files and subdirectories for a given directory using the steps above. Each method invoking the steps above must specify what to consider as the present working directory.
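
The steps above can be sketched over a mock tree as follows. This is a rough illustration rather than a conforming implementation: the node shape ({ kind, name, children }), the isSecure parameter, and the namedError helper are all assumptions made for the example, and plain Error objects stand in for the DOMException types named in the steps.

```javascript
// Hypothetical sketch of "return a file or directory sequence promise".
// The node shape and the isSecure flag are illustrative assumptions.
function namedError(name, message) {
  const err = new Error(message);
  err.name = name; // stands in for a DOMException with the same name
  return err;
}

function fileOrDirectorySequence(pwd, isSecure) {
  // Step 1: reject immediately outside a secure context.
  if (!isSecure) {
    return Promise.reject(namedError('SecurityError', 'Not a secure context'));
  }
  // Steps 2-4: create the promise, fill the sequence asynchronously, return it.
  return new Promise(function (resolve, reject) {
    setTimeout(function () {
      const s = [];
      try {
        for (const child of pwd.children) {
          if (child.kind !== 'file' && child.kind !== 'directory') {
            throw new Error('unreadable entry');
          }
          s.push(child); // only immediate children are added
        }
      } catch (e) {
        reject(namedError('InvalidStateError', 'Could not read ' + pwd.name));
        return;
      }
      resolve(s); // an empty directory resolves with an empty sequence
    }, 0);
  });
}

const docs = {
  kind: 'directory',
  name: 'docs',
  children: [
    { kind: 'file', name: '1.txt' },
    { kind: 'directory', name: 'path', children: [{ kind: 'file', name: '2.txt' }] }
  ]
};
```

Calling fileOrDirectorySequence(docs, true) resolves with two entries, the file 1.txt and the directory path; the file 2.txt is not included because it is not an immediate child of docs.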

Returning A Flattened File List

It is often convenient to simply return a flattened list of files from within the temporary directory tree. These can be from within the present working directory alone (that is, only every immediate child of the present working directory that is a File) or from the entirety of the tree, including subdirectories. This is configured via the recursive flag: if it is set to true, user agents must recurse through the entire temporary directory tree and return all the files.

When this specification says to return a flattened file sequence promise on a given present working directory, the user agent must run the following steps:

  1. If the result of running the "is settings object a secure context" algorithm with the incumbent settings object is "Not Secure", return a new promise rejected with a "SecurityError" exception.

  2. Let p be a new promise and s a sequence, initially set to an empty sequence.

  3. Run the following steps asynchronously:

    1. If the recursive flag is set to true, follow the substeps below:

      1. For each immediate child of the present working directory that is also a file, add a node to s as a File object representing that file.

      2. Recurse through all child nodes that are subdirectories of the present working directory. If any of these contain files, add a node to s as a File object representing that file.

      3. If there are problems retrieving any file or adding a node for that file, reject p with an InvalidStateError.

    2. Otherwise, the recursive flag is set to false. For each immediate child of the present working directory that is a file, add a node to s as a File object representing that file. If there are problems retrieving any file or adding a node for that file, reject p with an InvalidStateError.

    3. Resolve p with s.

      If no files are found, p is still resolved with s which is still an empty sequence.

  4. Return p.
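
The effect of the recursive flag can be illustrated with a short sketch. It assumes a mock node shape ({ kind, name, children }) and hypothetical function names (collectFiles, flattenedFileSequence); the secure context check from step 1 is left out, and the traversal itself runs synchronously for brevity even though the steps above run asynchronously.

```javascript
// Hypothetical sketch of "return a flattened file sequence promise".
// The node shape ({ kind, name, children }) is an illustrative assumption.
function collectFiles(dir, recursive, out) {
  for (const child of dir.children) {
    if (child.kind === 'file') {
      out.push(child);
    } else if (recursive) {
      collectFiles(child, true, out); // descend only when the flag is set
    }
  }
  return out;
}

function flattenedFileSequence(pwd, recursive) {
  return new Promise(function (resolve, reject) {
    try {
      resolve(collectFiles(pwd, recursive === true, []));
    } catch (e) {
      const err = new Error('Could not read ' + pwd.name);
      err.name = 'InvalidStateError'; // stands in for the DOMException
      reject(err);
    }
  });
}

const library = {
  kind: 'directory',
  name: 'music',
  children: [
    { kind: 'file', name: 'readme.txt' },
    {
      kind: 'directory',
      name: 'genres',
      children: [
        { kind: 'directory', name: 'jazz', children: [{ kind: 'file', name: 'milesDavis.mp3' }] },
        { kind: 'directory', name: 'empty', children: [] }
      ]
    }
  ]
};
```

Here flattenedFileSequence(library, false) resolves with readme.txt only, while flattenedFileSequence(library, true) resolves with readme.txt and milesDavis.mp3. The empty directory contributes nothing in either case.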

Triggering a Directory Picker

This section will eventually be a pull request to the HTML specification.

When an input element's type attribute is in the File Upload state, the rules in this section apply [[!HTML]].

User agents may surface a directory picker, which may also serve as a file picker; this is an unspecified piece of user interface that a user agent may deploy upon receiving a click event, and may expand upon any existing user interface already in place for selecting files when the input element's type attribute is in the File Upload state. This generally happens synchronously and is considered a blocking operation. When this specification says to trigger a directory picker a user agent must follow the steps below (and optionally follow the steps labeled "may" [[!RFC2119]]):

  1. If the algorithm is not allowed to show a popup [[!HTML]] then abort these steps without doing anything else.

  2. Return, but continue running these steps in parallel [[!HTML]].

  3. Optionally, wait until any prior execution of this algorithm has terminated.

  4. If the result of running the "is settings object a secure context" algorithm with the incumbent settings object is "Not Secure", the user agent may prevent a directory picker from being shown at all. In this case, a user agent may run the steps for the File Upload state as if the allowdirs attribute is not set, and exit from this algorithm.

    It might be simpler to specify a Directory Upload state, alongside a File Upload state.
  5. If the user agent is capable of surfacing both a directory picker and a file picker in a single user interface element, this user agent is said to have a Single File and Directory Picker. The HTMLInputElement.isFilesAndDirectoriesSupported attribute of the HTMLInputElement must be set to true.

    User agents in practice may reuse user interface elements from the host OS. If the host OS supports a Single File and Directory Picker a file input with a single button that enables the combined picker may be rendered and may resemble the suggested user interface element below:

    Fig. 1: A file input showing "1 file selected." and a single "Choose file(s)..." button. Clicking "Choose file(s)..." activates the combined file & directory picker.

    Currently, only OS X exposes a unified picker.

  6. If the user agent does not have a Single File and Directory Picker it is said to have Distinct File and Directory Pickers. This user agent must expose a value of false on the HTMLInputElement.isFilesAndDirectoriesSupported attribute of the HTMLInputElement.

    If the user agent has Distinct File and Directory Pickers, possibly reusing user interface elements from the host OS, a file input that enables two separate pickers (file and directory) via two different buttons may be rendered and may resemble the suggested user interface element below:

    Fig. 2: A file input with two buttons, "Choose files..." and "Choose directories...", shown in its default state (no selection) and after a selection has been made ("2 files selected." with a "×" clear control). Clearing the field may reset it to the default state.
  7. If the user agent has Distinct File and Directory Pickers and is set to trigger in Directory Picker Only mode, it may only surface a picker that allows for directory picking, or it may surface a picker featuring two separate buttons as in Fig. 2 above. The suggested user interface element below may be used to surface an option to only pick directories:

    Fig. 3: A file input with a single "Choose directories..." button. Clicking "Choose directories..." activates a directory picker only.
  8. Wait for the user to have made their selection.

  9. Queue a task [[!HTML]] to first update the element's selected files and directories so that it represents the user's selection as the temporary directory tree, then fire a simple event that bubbles named input at the input element, and finally fire a simple event that bubbles named change at the input element.

Attributes

This section will eventually be a pull request to the HTML specification.

When an input element's type attribute is in the File Upload state, the rules in this section apply [[!HTML]].

When the input element is in the File Upload state, it can represent two distinct types of user selections:

The allowdirs content attribute is a boolean attribute [[!HTML]] that indicates whether a user is allowed the selection of both files and directories; when set, a user agent that supports it must trigger a directory picker.

APIs

Directory Interface

The Directory interface represents a directory on the temporary directory tree. Methods and attributes on a Directory interface must act as if it is the present working directory.

          [Exposed=(Window,Worker)]
          interface Directory {
              readonly    attribute DOMString name;
              readonly    attribute DOMString path;
              Promise<sequence<(File or Directory)>> getFilesAndDirectories();
              Promise<sequence<File>> getFiles(optional boolean recursiveFlag=false);
          };
        
Directory.name

On getting, this attribute must return the name of the present working directory.

We need normative language around characters that aren't allowed in a name; we also need to determine if DOMString is sufficient.

Directory.path

On getting, this attribute must return the path from root of the present working directory, including its name.

          // path of directory jazz
          /music/genres/jazz

          // path of file milesDavis.mp3
          /music/genres/jazz/milesDavis.mp3
        
Directory.getFilesAndDirectories

When the getFilesAndDirectories() method is called, the user agent must return a file or directory sequence promise with this directory as the present working directory.

Directory.getFiles

When the getFiles() method is called, the user agent must act as follows:

  1. If the optional recursiveFlag parameter is set to true, then return a flattened file sequence promise with the recursive flag set to true, using this directory as the present working directory.

  2. Otherwise the recursiveFlag parameter is set to false or is undefined. Return a flattened file sequence promise with the recursive flag set to false, using this directory as the present working directory.

        // recursively collect every file, then upload them all
        directory.getFiles(true)
          .then(files => Promise.all(files.map(uploadImage)))
          .then(showDone)
          .catch(err => console.error("Uploading failed!", err));
        

File Input

More information about the HTMLInputElement interface can be found in the HTML5 spec.

          partial interface HTMLInputElement {
                          attribute boolean allowdirs;
              readonly    attribute boolean isFilesAndDirectoriesSupported;
              Promise<sequence<(File or Directory)>>getFilesAndDirectories();
              Promise<sequence<(File)>>getFiles(optional boolean recursiveFlag=false);
              void chooseDirectory();
          };
        
HTMLInputElement.allowdirs

On getting, the HTMLInputElement.allowdirs IDL attribute must reflect the content allowdirs attribute, and is true if the content allowdirs attribute is set, and false otherwise.

Adding the multiple attribute does not change the behavior. However, it allows older browsers, which do not recognize the HTMLInputElement.allowdirs attribute, to make the file input accept multiple files instead of just one.

If this attribute is set, input.files MUST return null.

Due to this, HTMLInputElement.getFilesAndDirectories must be called to retrieve the chosen files and directories.
          <input type="file" id="fileInput" allowdirs multiple />
            <!-- Supports being able to select multiple files and/or directories -->
            <!-- Adding the multiple attribute is useful because older browsers would still be able -->
            <!-- to accept multiple files even though they do not recognize the allowdirs attribute -->
        

The default click behavior of such a file input MUST trigger the file picker. Calling HTMLInputElement.chooseDirectory on the file input element MUST trigger the directory picker.

HTMLInputElement.isFilesAndDirectoriesSupported

On getting, this attribute must be true if the user agent will expose a Single File and Directory Picker, and false if the user agent will expose Distinct File and Directory Pickers.
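
A page could use this attribute to decide which controls to render. The following is a minimal sketch under that assumption: pickerControls is a hypothetical helper, and the string labels it returns are placeholders chosen for the example, not values defined by this specification.

```javascript
// Hypothetical helper: decide which controls a page should render for a
// file input, based on the attributes described in this proposal.
function pickerControls(input) {
  if (!('getFilesAndDirectories' in input)) {
    return ['files'];                  // proposal not supported at all
  }
  if (input.isFilesAndDirectoriesSupported) {
    return ['files-and-directories'];  // Single File and Directory Picker
  }
  return ['files', 'directories'];     // Distinct File and Directory Pickers
}
```

When two controls are returned, a page might wire the second one to call chooseDirectory() on the input, while the first keeps the default click behavior.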

HTMLInputElement.getFilesAndDirectories

When the getFilesAndDirectories() method is called, the user agent must run the steps below:

  1. Let d be the immediate child of the root of the temporary directory tree representing the directory from which the user has made selections of files and directories. Set d to be the present working directory.

  2. Return a file or directory sequence promise on d.

    In order to retrieve all selections reliably, developers should call this method after the change event on HTMLInputElement has fired. Guidance in developer documentation here would be helpful.
          <input id="fileInput" type="file" multiple allowdirs
            onchange="handleChangeEvent(event);">

          <script type="text/javascript">
            function handleChangeEvent(e) {
              // change event fires on input element
              var input = document.getElementById('fileInput');

              if ('getFilesAndDirectories' in input) {
                input.getFilesAndDirectories().then(function(filesAndDirs) {
                  // iterate through each item
                  // see drag and drop example below for more details
                });
              }
            }
          </script>
        
HTMLInputElement.getFiles

When the getFiles() method is called, the user agent must run the steps below:

  1. Let d be the immediate child of the root of the temporary directory tree representing the directory from which the user has made selections of files and directories. Set d to be the present working directory.

  2. If the optional recursiveFlag parameter is set to true, then return a flattened file sequence promise with the recursive flag set to true, using d as the present working directory.

  3. Otherwise the recursiveFlag parameter is set to false or is undefined. Return a flattened file sequence promise with the recursive flag set to false, using d as the present working directory.

HTMLInputElement.chooseDirectory

When the chooseDirectory() method is called, a user agent must trigger a directory picker in Directory Picker Only mode.

Form Submission

When a form with enctype="multipart/form-data" is submitted and it contains a file input with the HTMLInputElement.allowdirs attribute set, the request payload MUST adhere to the following pattern (shown here for a picked directory named "docs"):

TODO: generate form submission output from temporary directory tree.
            ------Boundary
            Content-Disposition: form-data; name="file"; filename="/docs/1.txt"
            Content-Type: text/plain

            [DATA]
            ------Boundary
            Content-Disposition: form-data; name="file"; filename="/docs/path/2.txt"
            Content-Type: text/plain

            [DATA]
            ------Boundary
            Content-Disposition: form-data; name="file"; filename="/docs/path/to/3.txt"
            Content-Type: text/plain

            [DATA]
            ------Boundary--
          

Directories that are empty MUST NOT be included in the request payload.

This will make it backwards compatible with server scripts that only expect files.
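
One way to picture how the filename values in the payload relate to the temporary directory tree is sketched below. It is not a normative algorithm: formEntries is a hypothetical helper, the node shape ({ kind, name, children }) is assumed for the example, and only the name and filename parameters of each part are produced.

```javascript
// Hypothetical sketch: derive the filename values for a multipart/form-data
// payload from a directory tree. The node shape is an illustrative assumption.
function formEntries(dir, fieldName, prefix) {
  const base = (prefix || '') + '/' + dir.name;
  let entries = [];
  for (const child of dir.children) {
    if (child.kind === 'file') {
      entries.push({ name: fieldName, filename: base + '/' + child.name });
    } else {
      // Empty directories contribute no entries, so they never reach the payload.
      entries = entries.concat(formEntries(child, fieldName, base));
    }
  }
  return entries;
}

const docsTree = {
  kind: 'directory',
  name: 'docs',
  children: [
    { kind: 'file', name: '1.txt' },
    {
      kind: 'directory',
      name: 'path',
      children: [
        { kind: 'file', name: '2.txt' },
        { kind: 'directory', name: 'to', children: [{ kind: 'file', name: '3.txt' }] }
      ]
    },
    { kind: 'directory', name: 'empty', children: [] }
  ]
};

const filenames = formEntries(docsTree, 'file').map(function (e) { return e.filename; });
```

For this tree, filenames holds "/docs/1.txt", "/docs/path/2.txt", and "/docs/path/to/3.txt", which are the three parts shown in the payload above; the directory named empty produces no part.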

Drag and Drop

More information about the DataTransfer interface can be found in the HTML5 spec.

DataTransfer's files attribute MUST continue to work with no changes to its behavior.

          partial interface DataTransfer {
              Promise<sequence<(File or Directory)>> getFilesAndDirectories();
              Promise<sequence<File>> getFiles(optional boolean recursiveFlag=false);
          };
        
DataTransfer.getFilesAndDirectories

When the getFilesAndDirectories() method is called, the user agent must run the steps below:

  1. Let d be the immediate child of the root of the temporary directory tree. Set d to be the present working directory.

  2. Return a file or directory sequence promise on d.

          document.getElementById('dropDiv').addEventListener('drop', function(e) {
              e.stopPropagation();
              e.preventDefault();

              var uploadFile = function(file, path) {
                  // handle file uploading
              };

              var iterateFilesAndDirs = function(filesAndDirs, path) {
                  filesAndDirs.forEach(function(item) {
                      if (typeof item.getFilesAndDirectories === 'function') {
                          // keep the directory's own path in a separate variable so the
                          // parent path stays intact for the remaining siblings
                          var dirPath = item.path;

                          // this recursion enables deep traversal of directories
                          item.getFilesAndDirectories().then(function(subFilesAndDirs) {
                              // iterate through files and directories in sub-directory
                              iterateFilesAndDirs(subFilesAndDirs, dirPath);
                          });
                      } else {
                          uploadFile(item, path);
                      }
                  });
              };

              // begin by traversing the chosen files and directories
              if ('getFilesAndDirectories' in e.dataTransfer) {
                  e.dataTransfer.getFilesAndDirectories().then(function(filesAndDirs) {
                      iterateFilesAndDirs(filesAndDirs, '/');
                  });
              }
          });
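
The traversal above handles files as it encounters them but gives the caller no signal once the whole tree has been visited. A possible variant, sketched here with a hypothetical collectAll helper and a mockDir stand-in for Directory objects (both are assumptions for the example), returns a single promise that resolves when every subdirectory has been read.

```javascript
// Hypothetical helper: walk a result of getFilesAndDirectories() and resolve
// with every file found, paired with the path of its containing directory.
function collectAll(filesAndDirs, path) {
  return Promise.all(filesAndDirs.map(function (item) {
    if (typeof item.getFilesAndDirectories === 'function') {
      return item.getFilesAndDirectories().then(function (sub) {
        return collectAll(sub, item.path);
      });
    }
    return [{ file: item, path: path }];
  })).then(function (nested) {
    // Flatten one level: each element is the list produced for one item.
    return nested.reduce(function (all, part) { return all.concat(part); }, []);
  });
}

// Mock Directory object for trying the helper outside a browser (assumption).
function mockDir(path, entries) {
  return {
    name: path.split('/').pop(),
    path: path,
    getFilesAndDirectories: function () { return Promise.resolve(entries); }
  };
}
```

In a drop handler this could be used as e.dataTransfer.getFilesAndDirectories().then(function (items) { return collectAll(items, '/'); }), after which the resulting list can be uploaded in one batch.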
        
DataTransfer.getFiles

When the getFiles() method is called, the user agent must run the steps below:

  1. Let d be the immediate child of the root of the temporary directory tree representing the directory from which the user has made selections of files and directories and which the user has dragged and dropped. Set d to be the present working directory.

  2. If the optional recursiveFlag parameter is set to true, then return a flattened file sequence promise with the recursive flag set to true, using d as the present working directory.

  3. Otherwise the recursiveFlag parameter is set to false or is undefined. Return a flattened file sequence promise with the recursive flag set to false, using d as the present working directory.
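
Because DataTransfer's files attribute keeps its existing behavior, a page can fall back to it where getFiles() is unavailable. The sketch below assumes a hypothetical droppedFiles helper; in the fallback case only the files exposed through the files attribute are returned.

```javascript
// Hypothetical helper: resolve with a flat array of files from a
// DataTransfer-like object, preferring the proposed getFiles(true) and
// falling back to the existing files attribute.
function droppedFiles(dataTransfer) {
  if (typeof dataTransfer.getFiles === 'function') {
    return dataTransfer.getFiles(true); // recursive flag set: include subdirectories
  }
  return Promise.resolve(Array.from(dataTransfer.files || []));
}
```

A drop handler might call droppedFiles(e.dataTransfer).then(uploadAll), where uploadAll is whatever upload routine the page already has.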

Appendix

FAQ

  1. Why did we use getFilesAndDirectories() instead of enumerate()?
    • We want to be able to save the enumerate() name for the future when we have Observables available so that we can have enumerate() return an Observable.
  2. Why are we returning Promise for getFilesAndDirectories() instead of an Observable?
    • The directory upload scenario does not require listening on changes to the file system, therefore Observables would not be warranted.
    • Observables, though undoubtedly advantageous in the enumeration scenario, are not well defined yet and it may take a long time before they are.
    • We want to build a solution using today's primitives (such as Promises) and then, if necessary, we can retrofit the API for future primitives (such as Observables) when they are available and well defined.
  3. Why are we returning an Array in the Promise?
    • Provides a snapshot of the directories and files at the time that getFilesAndDirectories() is called. Developers would expect this for directory upload.
    • An array allows iterating through the File and Directory objects.
  4. Why did we remove enumerateDeep() from the originally proposed Directory interface?
    • It is not needed for the time being as recursively calling the getFilesAndDirectories() function would fulfill the scenario that enumerateDeep() was meant to fulfill.
    • getFilesAndDirectories() would provide a smaller result set compared to enumerateDeep().
  5. Why is it ok to remove the other methods from the originally proposed Directory interface?
    • Since the focus is on read-only scenarios to enable directory uploading, only a subset of the APIs are needed.
    • The Directory interface will be extensible for the future when we need to address scenarios that involve writing.