ArchiveAPI – read out archive file contents + Introducing Bleeding Edge

Working with files on the web has long been a challenge, so seeing the various File APIs surface has made me really happy! Now on to the latest addition: ArchiveAPI, which gives you the ability to work with archive files.

Introducing Bleeding Edge

Before I start talking about the ArchiveAPI, I wanted to introduce the new Bleeding Edge category here on Mozilla Hacks. What this means is that we cover technologies/features/APIs that haven’t been released in an official version of Firefox, or any other web browser, yet. It covers things that, in most cases, have their initial implementation in Firefox Nightly or Firefox Aurora.

The goal is to have these features shipped, but feedback from you, dear developers, or any other findings during this phase might result in changes to them before they are released.

So, feel more than welcome to try them out! Either you get a head start on knowing what’s coming, or you get the chance to affect the future of features for developers on the web. Win-win! :-)

What it is

As part of our WebAPI initiative at Mozilla to make the web an even more powerful platform, I was lucky to talk to Andrea Marchesini, the Lead Developer of the ArchiveAPI. It allows you to read the content of archive files, e.g. ZIP files, directly in web browsers.

In short, we have an ArchiveReader object, and when it successfully manages to read the contents of an archive file, we can iterate over them, read out file data, show previews of each file’s content etc. In the code below, archiveFileReference stands for a reference to the archive file, for example one the user has selected for upload.

var archiveFile = new ArchiveReader(archiveFileReference),
    fileNames = archiveFile.getFilenames();

When you call a method on this reader, like getFilenames, you get back a request object on which you can set two handlers: onsuccess and onerror. Like this:

fileNames.onerror = function () {
    console.log("Error reading filenames");
};

fileNames.onsuccess = function () {
    // Get the list of filenames in the archive
    var result = this.result;

    // Iterate over those filenames and request each file
    for (var i=0, l=result.length; i<l; i++) {
        // Declared with var to avoid leaking a global variable
        var file = archiveFile.getFile(result[i]);

        file.onerror = function () {
            console.log("Error accessing file");
        };

        file.onsuccess = function () {
            // Read out data for that file, such as name, type and size
            var currentFile = this.result;
            console.log(currentFile.name);
        };
    }
};
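
If the nested handlers get hard to follow, the same flow could be wrapped in Promises. This is only a sketch, not part of the ArchiveAPI: requestToPromise and readAllFiles are hypothetical helper names, and the code assumes an environment with Promise support and request objects that behave as shown above (an onsuccess/onerror pair and a result property).

```javascript
// Hypothetical helper, not part of the ArchiveAPI: wraps any request-like
// object that reports through onsuccess/onerror in a Promise.
function requestToPromise(request) {
    return new Promise(function (resolve, reject) {
        request.onsuccess = function () {
            resolve(request.result);
        };
        request.onerror = function () {
            reject(new Error("Archive request failed"));
        };
    });
}

// Hypothetical helper: lists the filenames in the archive, then resolves
// with an array holding the file for each of those names.
// archiveFile is assumed to be an ArchiveReader as created above.
function readAllFiles(archiveFile) {
    return requestToPromise(archiveFile.getFilenames()).then(function (names) {
        return Promise.all(names.map(function (name) {
            return requestToPromise(archiveFile.getFile(name));
        }));
    });
}
```

With these in place, readAllFiles(archiveFile).then(...) would hand you all the files in one go, and a single catch handler would cover both the filename request and the individual file requests.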

Demo

I’ve put together an ArchiveAPI demo where you can upload an archive file and, for any of the image or text files within that archive, it will directly generate a preview in the web page. The code is available on GitHub, as part of our mozhacks GitHub repository.

Note: currently this demo only works in Firefox Aurora and Firefox Nightly.
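
To give an idea of how such a preview step could decide what to show, here is a minimal sketch. It is not the demo’s actual code: previewKind and the two extension lists are hypothetical, and it assumes each file read from the archive exposes name and type properties as mentioned in the code comments above.

```javascript
// Hypothetical extension lists, used only when no MIME type is reported.
var IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "gif", "bmp", "svg", "webp"];
var TEXT_EXTENSIONS = ["txt", "md", "css", "js", "json", "html", "xml", "csv"];

// Hypothetical helper: returns "image", "text" or "none" for a file-like
// object, checking its MIME type first and its file extension second.
function previewKind(file) {
    var type = file.type || "";
    if (type.indexOf("image/") === 0) { return "image"; }
    if (type.indexOf("text/") === 0) { return "text"; }

    var name = file.name || "";
    var dot = name.lastIndexOf(".");
    var extension = dot === -1 ? "" : name.slice(dot + 1).toLowerCase();
    if (IMAGE_EXTENSIONS.indexOf(extension) !== -1) { return "image"; }
    if (TEXT_EXTENSIONS.indexOf(extension) !== -1) { return "text"; }
    return "none";
}
```

In a page, an "image" result could then be displayed through an img element pointing at URL.createObjectURL(file), and a "text" result could be read with a FileReader; anything else would simply be listed by name.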

I’ve also put together a screencast of this demo on YouTube:

(If you’ve opted in to HTML5 video on YouTube you will get that, otherwise it will fall back to Flash)

Feedback

I hope you find this interesting, and one more step forward for the web as a platform! Please let us know what you think in a comment here.

Additionally, there’s a poll/questionnaire you can take with regards to the asynchronous nature of ArchiveAPI. It’s available in Feedback on potential ArchiveReader APIs.

About Robert Nyman [Editor emeritus]

Technical Evangelist & Editor of Mozilla Hacks. Gives talks & blogs about HTML5, JavaScript & the Open Web. Robert is a strong believer in HTML5 and the Open Web and has been working since 1999 with Front End development for the web - in Sweden and in New York City. He regularly also blogs at http://robertnyman.com and loves to travel and meet people.



16 comments

  1. Andrei

    What kind of files it can handle right now? zip, tar, bz2 ?

    October 1st, 2012 at 12:33

    1. Robert Nyman

      The general types, like the one you mention, should be supported. Also, ZIP in ZIP. Please try it out, and if you encounter issues, let us know!

      October 1st, 2012 at 13:22

  2. Beben Koben

    hmmm…
    for aurora and nightly
    ic ic
    wait for firefox :D

    October 1st, 2012 at 13:34

    1. Robert Nyman

      That’s ok. :-) You have options to try it if you want to.

      October 2nd, 2012 at 02:26

  3. Developer

    I think that HTML 5 implementation is more important than introducing new features like ArchiveAPI or Gamepad API. Most of the interactive elements (command, menu, details) and form controls (range, number, date and time) are not implemented. Encapsulation methods (such as JavaScript modules, Web Components, scoped style) are not available. Building application layouts is complicated because the new CSS flexbox and CSS grid are not implemented. So in my opinion the foundations of web-based applications are weak. It means that writing a web application in HTML 5 still consumes much more time and is more difficult than plugin-based alternatives.

    October 1st, 2012 at 17:38

    1. Robert Nyman

      That assumes that one thing is being done at the cost of another, which isn’t necessarily true. Web browsers consist of many different parts that together complement each other in offering the best experience.

      For instance, when it comes to form controls and CSS layout options, the specifications are in a different state and a lot of those things are being discussed.

      This is parallel work, not competing.

      October 2nd, 2012 at 02:28

  4. Calvin Spealman (@ironfroggy)

    Not really happy with this API. It seems like it should be closer to or actually integrated with the Filesystem API, not separate as it seems to be.

    October 1st, 2012 at 19:09

    1. Robert Nyman

      I would argue that it is fairly similar to the FileSystem API, but we are always open for improvements. Please submit your thoughts in the feedback link provided in the blog post.

      October 2nd, 2012 at 02:29

  5. Forrest O.

    Will we see an ArchiveWriter API?

    October 2nd, 2012 at 06:36

    1. Robert Nyman

      Right now, I don’t know of any plans. It does seem plausible, though.

      October 2nd, 2012 at 06:39

  6. Ken Saunders

    Hey this is very cool!

    Y’all should make this an add-on to potentially increase the amount of users testing it out. I would love to see hacks create add-ons like mozilla labs.

    I tried it with several zips containing wallpapers, various images and graphics, and a few Firefox add-ons.

    Perhaps saying that it displays (certain) file contents would be more appropriate than calling them previews since it shows all images at full sizes that can be saved and the text files can be copied.

    I already see one great use for this.
    I and millions of other wallpaper and graphic designers, stock providers, etc, offer our art in multiple sizes to accommodate different monitor resolutions and for other reasons, and this would be a nice feature for users since they wouldn’t have to download an entire zip, open it, choose one piece that they want and delete the rest. In the case of wallpapers, they can use this to choose one image right away and set it as their desktop wallpaper (although the quality of the image gets trashed a bit for some reason). For other graphics, same idea. Choose one or two to save and forget about the rest.

    Maybe sites like deviantART.com could implement this.

    Will Windows icon (.ico) files eventually be supported?

    October 2nd, 2012 at 18:44

    1. Robert Nyman

      Glad you like it!

      In general, installing add-ons is an entirely different approach for APIs, and since we need feedback on how the feature actually works in Firefox, it’s much, much better (and hopefully easier) for people to test in Firefox Aurora or Firefox Nightly.

      Good to hear about the suggested use case, and I agree!

      Windows icons are supported in the ZIP file, but for previewing, it’s based on what you can natively display in a web page.

      October 3rd, 2012 at 03:56

  7. Jason

    Hold on. Why is this being incorporated in to a standard?
    This provides nothing that cannot be done in standard, well-written JavaScript (look at Mozilla’s own pdf.js and many other projects out there).

    If we start providing ‘libraries’ like this, every single file format out there will need its own ‘standardized’ native version. It will be a nightmare.

    October 4th, 2012 at 19:17

    1. Robert Nyman

      There is currently no other support for reading out file information from any kind of archive file. And it is about an additional API to offer that possibility, not one for every file format but rather for a certain type of format.

      October 5th, 2012 at 02:18

      1. Jason

        > And it is about an additional API to offer that possibility, not one for every file format but rather for a certain type of format.

        In my opinion, that’s the exact problem.
        Currently, there are no specs at all for reading certain types of files anywhere.
        If all of a sudden a new spec or API were added to read a particular type of file, the rest of the world would start wanting more and more file types, and then people, under that pressure, would be forced to add native support for those file types, and there would be a million different readers for a million different file types polluting the namespaces in the browser.

        > There is currently no other support for reading out file information from any kind of archive file.

        I would say there is no ‘native’ support. It’s quite possible to make parsers that perform quite decently in pure JavaScript.

        * http://mozilla.github.com/pdf.js/
        * https://github.com/imaya/zlib.js
        * Well, a quick search will find you lots more.

        October 6th, 2012 at 07:58

        1. Robert Nyman

          Well, through the multiple File API specifications, there are different ways of reading out file contents in:

          – FileList
          – Blob
          – File
          – FileReader
          – URI scheme

          So when it comes to the ArchiveAPI, it could either be a part of that FileAPI family, or perhaps incorporated into one of the existing ones. When this becomes suggested as a standard, the different options will be discussed.

          I think things like pdf.js are great, but not really the same thing. They are more full-fledged products for solving something and require maintenance and follow-up.

          I rather see these various standardized APIs (currently or soon-to-be) as complements to such products, and something web developers know they can rely on being there natively in web browsers.

          October 8th, 2012 at 02:56
