New API for directory picking and drag-and-drop

September 14, 2015 4 comments

As part of Mozilla's effort to reduce the Web's dependency on Flash we have recently been working on a Microsoft proposal called Directory Upload to provide directory picking and directory drag-and-drop. (This proposal is a simplification of part of the FileSystem API draft, a much more comprehensive set of filesystem APIs.) After providing several rounds of feedback to the Microsoft guys, the Directory Upload proposal has been made available for wider feedback, and we have an implementation enabled in Nightly builds that developers can play with and file bugs on.

Why not copy Chrome's directory upload behavior?

For those that are aware of it, the obvious question will be why didn't we just standardize the behavior introduced with the webkitdirectory attribute for the <input> element (introduced in Chrome 11, but not available in Safari)? Standardizing that behavior would make it easier for content authors to update existing content that currently uses webkitdirectory to also support non-Chrome browsers in the future.

With webkitdirectory, when a user picks a directory, all the files under the entire expanded directory tree are added as File objects to one big, flat FileList, which is then set on the <input> element. So it's only after the entire directory tree has been traversed that the change event can be fired to notify the page that the user made a selection and files are available. The advantage of this flat-list approach is that older scripts/libraries that expect a flat list from <input> can more easily be made to work (for example, by adding some awareness of webkitRelativePath). The big downside is that if the number of files in the expanded directory tree is large, then even on machines where I/O is fast it will be a long time before the page is notified via the change event that the user picked something (and that's if it doesn't hang first due to running out of memory). Until the change event notifies it, the page can't even acknowledge that it knows a user selection is incoming, and as a result the user experience can be very poor if I/O is slow or a relatively large directory tree is picked.

The Directory Upload proposal's behavior

The main difference between the webkitdirectory behavior and the Directory Upload proposal is that, instead of populating HTMLInputElement.files, the latter provides a Promise that is fulfilled by an array that contains only the immediate Directory/File objects that the user actually selected. This allows a page to start processing a user's selection and provide feedback to the user much sooner. Each Directory's direct contents are then only accessed, on demand, via another Promise returning method, allowing the page to incrementally walk the directory tree processing and providing feedback to the user as it goes.

The text of the proposal is short and hopefully readable so I'd encourage anyone who is interested to take a look, but in summary the main API currently looks like this:

partial interface HTMLInputElement {
           attribute boolean directory;
  readonly attribute boolean isFilesAndDirectoriesSupported;
  Promise<sequence<(File or Directory)>> getFilesAndDirectories ();
  void                                   chooseDirectory ();
};

partial interface Directory {
  readonly attribute DOMString name;
  readonly attribute DOMString path;
  Promise<sequence<(File or Directory)>> getFilesAndDirectories ();
};

For those interested in playing with the implementation in Nightly builds, I have a hacked up demo/test page that uses both the directory picker and drag-and-drop API. For those that are interested in using the API in Chrome, you might find the polyfill that the MS guys wrote useful (Chrome won't actually benefit from the incremental tree traversal of course, so picking large directory trees in Chrome is still likely to hang the page).

(If you do experiment with the API, one thing to note is that the Promise returned by HTMLInputElement.getFilesAndDirectories changes every time the directory picker is used, so you need to call getFilesAndDirectories after the change event has fired rather than calling it speculatively in advance.)

Why not put Directory objects in HTMLInputElement.files?

Another question some people may ask is why we have the HTMLInputElement.getFilesAndDirectories method when we could put Directory objects in the HTMLInputElement.files FileList for any directories that are directly selected by the user. Again, this is mainly about allowing the change event to fire as soon as possible so a page can acknowledge awareness of a user's action promptly. Typically the OS native file/directory pickers that browsers use will notify the browser when a user has picked something by sending/making available a list of paths. However, if the application is going to provide the page with the picked files/directories via HTMLInputElement.files then it is best not to fire the change event at that point. This is because script may iterate over the FileList accessing properties that may require I/O, such as File.size. To avoid blocking on synchronous I/O when script accesses these properties implementations need to look up and cache that information before firing the change event.

By requiring HTMLInputElement.files to be null when a directory picker is being used, the Directory Upload proposal allows the change event to be fired as soon as the OS native picker provides the list of paths, rather than waiting until the list of File/Directory objects has been created. All the I/O required to create the File/Directory objects needed to resolve the Promise can happen asynchronously after the 'change' event has fired.

Uploading directories

While the current Directory Upload proposal requires implementations to submit the files under a picked directory if its <input> is in a form that is submitted, wrapping <input> with a <form> that the user may submit is likely to be bad practice in general. When a user picks a directory it is much more likely that the number of files to be uploaded will be large. As a result users will be more likely to find a submission taking too long and, if the user doesn't abort the submission, it's more likely that server limits such as Apache's max_file_uploads configuration option will be hit and files will fail to upload.

In most cases authors should incrementally walk the directory tree and use XMLHttpRequest (or a JS library that wraps it) to upload files individually or in small batches.

Future changes

One of the reasons that we didn't just implement the relevant parts of the FileSystem API draft for providing directory picking/drag-and-drop is that the API described there depends on Observable, which is sort of like a Promise but allows a collection of results to arrive bit by bit as they become available. (Actually the FileSystem API should probably change to use AsyncIterator, which is similar to Observable but allow script to pull results bit by bit at its own pace rather than having them pushed on it as soon as they're available.) Returning an Observable (or AsyncIterator) instead of a Promise would allow the immediate contents of a Directory to be accessed in small batches, which would further improve user experience when I/O is slow or a directory has a large number of direct contents. Unfortunately the discussion for standardizing Observable and/or AsyncIterator doesn't look like it will reach a conclusion any time soon. Providing a Promise returning API now allows the Web to progress, but at some point in the future we may end up adding API like the enumerate and enumerateDeep methods described in the FileSystem API draft.

When will this ship? Where should I send feedback?

When we ship will depend on the feedback that we get from users and other implementers; please experiment with our implementation and see how well the proposal works for you. Since the proposal is currently in the Web Platform Incubator Community Group feedback can either be sent to the WICG mailing list or filed as an issue against the proposal in the Directory Upload proposal's github repository. There's also a thread on the public-webapps list that you can respond to.

If you're testing Mozilla's implementation, the main bugzilla bug tracking when we will ship this feature is bug 1188880. Either file bugs that block that one, or else comment in that bug.

Tags: Mozilla