The State of HTML5 Local Data Storage
posted by joshua on May 10th, 2010
Of all the new features being implemented as HTML5, I think I’m most excited about offline storage. Despite increasingly ubiquitous Wi-Fi, and despite the ability to tether our laptops to our 3G mobile phones, web apps have become more and more sophisticated, and the need to be online has often felt like the last barrier separating the web and the desktop.
There are actually two kinds of offline storage on offer in HTML5. The first, client-side session and persistent storage (also sometimes referred to by the vaguely misleading name “DOM Storage”, or simply “web storage”), is a simple, cookie-like key-value store. The key and value are treated as strings, but it’s possible to store more complex objects as stringified JSON.
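Here’s a rough sketch of the stringified-JSON approach. The helper names (saveObject, loadObject) and the "prefs" key are made up for illustration, and the small in-memory object only stands in for window.localStorage so the snippet also runs outside a browser.

```javascript
// Tiny in-memory stand-in for window.localStorage (an assumption for this
// sketch). In a browser, pass window.localStorage or sessionStorage instead.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(String(key), String(value)); },
    removeItem: (key) => { data.delete(key); },
  };
}

// Values are stored as strings, so serialize the object to JSON first.
function saveObject(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Parse the stored JSON back into an object; a missing key yields null.
function loadObject(storage, key) {
  const raw = storage.getItem(key);
  return raw === null ? null : JSON.parse(raw);
}

const storage = createMemoryStorage();
saveObject(storage, "prefs", { theme: "dark", fontSize: 14 });
const prefs = loadObject(storage, "prefs");
console.log(prefs.theme, prefs.fontSize);
```

Anything JSON can’t represent (functions, dates as Date objects, circular references) won’t survive the round trip, so this only suits plain data.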
If you’ve worked with cookies before, you’ll probably find yourself getting familiar with this storage mechanism pretty quickly. However, there are some significant differences between web storage and cookies:
- The browser sends all relevant cookies in the headers of every request, but web storage is held entirely by the browser, until explicitly sent somewhere by client-side scripting.
- Cookies have a built-in expiration mechanism, but data in web storage has no expiration – it will remain through page refreshes, browser restarts, OS reboots, etc. until explicitly deleted.
- While all relevant cookies are exposed to JavaScript through document.cookie, allowing scripts to “walk” the entire collection of cookies, there’s no such mechanism with web storage – if you don’t know the key, you can’t get the data.
- Note that unlike cookies, which can be restricted by server domain as well as path, web storage can only be restricted by domain. This can be a security issue for multi-owner sites that don’t use subdomains. It’s also possible to store data with no domain restrictions, making it available on any domain (but you still have to know the key).
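Since web storage has no expiration of its own, an app that wants cookie-style expiry has to build it. Here’s a minimal sketch, assuming we wrap each value in a record with an expiry timestamp; setWithExpiry, getWithExpiry and the record shape are hypothetical, not part of any browser API.

```javascript
// Tiny in-memory stand-in for window.localStorage (an assumption for this
// sketch). In a browser, pass window.localStorage instead.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(String(key), String(value)); },
    removeItem: (key) => { data.delete(key); },
  };
}

// Store the value together with the time at which it should expire.
function setWithExpiry(storage, key, value, ttlMs, now = Date.now()) {
  const record = { value: value, expiresAt: now + ttlMs };
  storage.setItem(key, JSON.stringify(record));
}

// Return the value, or null if it is missing or expired.
// Expired entries are deleted explicitly, since the browser never will.
function getWithExpiry(storage, key, now = Date.now()) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  const record = JSON.parse(raw);
  if (now >= record.expiresAt) {
    storage.removeItem(key);
    return null;
  }
  return record.value;
}

const storage = createMemoryStorage();
setWithExpiry(storage, "token", "abc", 60 * 1000);
console.log(getWithExpiry(storage, "token"));
```

The `now` parameter is only there so the expiry logic can be exercised with fixed timestamps.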
The other kind of offline storage – variously called “JavaScript Database” (Webkit), “Storage” (Mozilla), Web SQL storage or “webdb” (various) – is far more robust. This is offline storage using real SQL – Webkit and Mozilla both use an embedded SQLite engine, and expose it through various client-side scripting interfaces.
Here’s an example taken from Apple’s developer documentation for Safari:
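The sample isn’t reproduced in this copy. As a stand-in, and not Apple’s code, here’s a rough sketch of the WebKit-style interface (openDatabase, transaction, executeSql). The database name, the notes table and the helper functions are invented for illustration, and the demo only runs where the browser actually provides openDatabase.

```javascript
// Create the table if needed, then insert one row, in a single transaction.
// onDone receives an error object on failure, or null on success.
function saveNote(db, text, onDone) {
  db.transaction(
    function (tx) {
      tx.executeSql(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
      );
      // The ? placeholder is bound from the array, which avoids SQL injection.
      tx.executeSql("INSERT INTO notes (body) VALUES (?)", [text]);
    },
    function (err) { onDone(err); },
    function () { onDone(null); }
  );
}

// Read every note body back; results arrive asynchronously as a result set.
function listNotes(db, onRows) {
  db.transaction(function (tx) {
    tx.executeSql("SELECT body FROM notes ORDER BY id", [], function (tx2, result) {
      const bodies = [];
      for (let i = 0; i < result.rows.length; i++) {
        bodies.push(result.rows.item(i).body);
      }
      onRows(bodies);
    });
  });
}

// Only attempt the demo where the browser exposes the API.
if (typeof openDatabase === "function") {
  const db = openDatabase("example_notes", "1.0", "Example notes", 1024 * 1024);
  saveNote(db, "hello offline world", function (err) {
    if (!err) listNotes(db, function (bodies) { console.log(bodies); });
  });
}
```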
Unfortunately, the browser vendors are less committed to standardizing “webdb” than they are “DOM Storage” – Opera’s support is still forthcoming, Webkit is fully on board and shipping, and Mozilla has been vocally skeptical of the whole idea and labels their API as ‘unfrozen’, meaning it’s likely to change over time. There’s excellent documentation from both Webkit and Mozilla, but the APIs are drastically different.
Even worse than two drastically different APIs for the two major supporting browsers, the larger developer community hasn’t quite bought into SQL storage, yet. There are concerns that SQLite isn’t the most standards-compatible SQL implementation. Some had hoped to see the browser vendors adopt one of the modern, flexible, “document model” database formats, made popular by CouchDB, MongoDB and SimpleDB. For now, the W3C Working Group has officially declared SQL storage as being “at an impasse”.
In the meantime, there are some interesting alternatives. If SQL storage sounds suspiciously familiar, it might be because it’s largely based on Google’s browser plugin Gears. Gears has some issues – like the need to handle cases when it’s not installed, or how to “encourage” users to install extra software (Gears is built into Chrome, but an additional install on other browsers) – but it at least provides a consistent storage API across multiple browsers, and the additional Gears functions are pure gravy (background threads, desktop integration). Unfortunately, it seems development of Gears has stalled as developer attention has shifted to providing native support for these features in Chrome.
Another stop-gap solution is Paul Duncan’s PersistJS, which layers an abstract API on top of a variety of browser storage backends. PersistJS uses HTML5 native storage by default, Gears when available, and can fall back to Flash or to userData behaviors for older versions of IE. (I should also mention that Dojo Storage has similar goals, but PersistJS seemed to cover more browsers, and has a smaller footprint.) Unfortunately, the cost of cross-browser compatibility is that PersistJS’s interface resembles the simple key-value storage you get with DOM Storage. While, as with DOM Storage, it’s possible to store serialized JSON in PersistJS (see examples here), some applications will suffer from poor support for more complicated data relationships.
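To make the layering idea concrete, here’s a minimal sketch of picking the first available backend behind one key-value interface. This is not PersistJS’s actual API; the backend objects, pickBackend, and the in-memory fallback (standing in for Gears, Flash or userData) are all assumptions for illustration.

```javascript
// Backend that wraps web storage, if the environment object provides it.
// `env` stands in for the browser's window object.
function webStorageBackend(env) {
  return {
    name: "webStorage",
    isAvailable: () => Boolean(env.localStorage),
    get: (key) => env.localStorage.getItem(key),
    set: (key, value) => { env.localStorage.setItem(key, String(value)); },
  };
}

// Last-resort backend that always works but does not persist anything.
function memoryBackend() {
  const data = new Map();
  return {
    name: "memory",
    isAvailable: () => true,
    get: (key) => (data.has(key) ? data.get(key) : null),
    set: (key, value) => { data.set(key, String(value)); },
  };
}

// Walk the candidates in priority order and return the first usable one.
function pickBackend(candidates) {
  for (const backend of candidates) {
    if (backend.isAvailable()) return backend;
  }
  throw new Error("no storage backend available");
}

const env = {}; // pretend window: no localStorage here, so we fall back
const store = pickBackend([webStorageBackend(env), memoryBackend()]);
store.set("greeting", "hello");
console.log(store.name, store.get("greeting"));
```

Because every backend has to fit the same get/set shape, the common interface can only be as rich as the weakest backend, which is the trade-off described above.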
I’ve been disappointed to discover that none of these implementation options provides any mechanism to support syncing data once the browser is back online! In today’s extremely social environment, the kinds of apps I imagine building with offline storage would need some mechanism to pull user data up to the server when the user is back online, and push any new data out to the browser. (Something like Caleb Crane’s Impel but without the baggage.)
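The upload half of that could be sketched as an “outbox” kept in web storage and flushed when the connection returns. Everything here is hypothetical: the "outbox" key, queueChange, flushOutbox, and the send callback (which stands in for whatever request the app would make to its server and returns true on success).

```javascript
// Tiny in-memory stand-in for window.localStorage (an assumption for this
// sketch). In a browser, pass window.localStorage instead.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(String(key), String(value)); },
  };
}

// Read the queued changes; an absent outbox is treated as empty.
function readOutbox(storage) {
  return JSON.parse(storage.getItem("outbox") || "[]");
}

// Record a change locally so it survives until we can reach the server.
function queueChange(storage, change) {
  const outbox = readOutbox(storage);
  outbox.push(change);
  storage.setItem("outbox", JSON.stringify(outbox));
}

// Try to send each queued change; keep the failures for the next attempt.
// Returns how many changes were sent successfully.
function flushOutbox(storage, send) {
  const outbox = readOutbox(storage);
  const remaining = outbox.filter((change) => !send(change));
  storage.setItem("outbox", JSON.stringify(remaining));
  return outbox.length - remaining.length;
}

const storage = createMemoryStorage();
queueChange(storage, { id: 1, title: "draft" });
queueChange(storage, { id: 2, title: "notes" });
// Pretend the server accepts change 1 and rejects change 2.
const sentCount = flushOutbox(storage, (change) => change.id === 1);
console.log(sentCount, readOutbox(storage));
```

In a browser, flushOutbox could be called from a window "online" event listener. Pulling new server data down, and resolving conflicts between the two sides, is the harder part and isn’t attempted here.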
I’m considering writing a JavaScript library that would layer jLinq on top of an implementation-obscuring storage API, but I suspect I’ll wait until Mozilla’s API reaches “frozen” status. I’d also want a library that would provide some kind of basic support for syncing, maybe something based on Thoughtbot’s Jester.


May 10th, 2010 at 02:16 PM
To make this picture complete: both Mozilla and Microsoft (think 80% market share) have said that IndexedDB is the technology they will pursue for the future, and the SQL-based approach has been declared to be at an impasse by the spec author. See http://www.w3.org/TR/IndexedDB/
May 10th, 2010 at 02:23 PM
Web SQL Storage is a terrible idea and the spec is unlikely to move forward any longer.
The problem is that it just standardizes a subset of SQL that is most likely impossible to implement without a SQLite backend. Requiring SQLite is a non-starter for some implementations and it’s very unclear how solid the spec is since everyone is leveraging the same underlying implementation. From the spec: “This specification has reached an impasse: all interested implementors have used the same SQL backend (Sqlite), but we need multiple independent implementations to proceed along a standardisation path. Until another implementor is interested in implementing this spec, the description of the SQL dialect has been left as simply a reference to Sqlite, which isn’t acceptable for a standard.”
There is another spec that looks much more promising called Indexed Database API which is simple index structures on top of the existing storage API. If this were widely accepted it wouldn’t be very difficult for someone to build a SQL engine on top of it, or a CouchDB API.
http://www.w3.org/TR/IndexedDB/
May 10th, 2010 at 02:27 PM
Lars, if the main point here is mobility, you’re looking at the wrong market when calculating share. WebKit and Opera are pretty much the only viable players right now.
May 10th, 2010 at 02:35 PM
LINQ style syntax and JSON style tree structures work extremely well and, to me, seem to be easy for other developers to learn. I certainly hope they don’t stick with a simple key value pair. A light, schemaless, mongo or redis like storage block is where my vote is.
May 10th, 2010 at 02:36 PM
Maybe we need (again) a wrapper around the various implementations.
I’m using such a wrapper called Lawnchair and have had success in the limited cases where I needed it:
May 10th, 2010 at 03:19 PM
Mozilla only offers sql storage to extensions, this is not available for normal web pages.
May 10th, 2010 at 09:45 PM
+1 for DaveStaunton: offline storage is really only interesting for mobile, since we can assume that almost all desktops are persistently connected. And by the end of the year, webkit mobile browsers will have 95% of data traffic.
May 10th, 2010 at 11:23 PM
Michael, I respectfully disagree! Who are you to say that offline storage is only interesting for mobile? If there was a consistent API that worked on the majority of desktop browsers, you can bet your ass that people would come up with inventive things to do with it.
Just look at the difference between Gmail with and without Gears. That’s just the beginning!
Remember: everyone hated the Facebook newsfeed the day it came out.
May 11th, 2010 at 02:26 AM
Opera actually shipped Web SQL Database (as it’s called) starting with Opera 10.50.
May 11th, 2010 at 03:44 AM
Joshua,
good article! Some notes on it, though:
First: “if you don’t know the key, you can’t get the data.” Well, one should know the key, I guess – but you can still iterate over all keys in localStorage, using its length property and the key() method, which accepts a numerical index.
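That iteration can be sketched as follows, using only length and key(). The listKeys helper and its optional prefix filter are made up for illustration, and the in-memory object just stands in for window.localStorage so the snippet also runs outside a browser.

```javascript
// Tiny in-memory stand-in for window.localStorage (an assumption for this
// sketch), including the length property and key() method used below.
function createMemoryStorage() {
  const data = new Map();
  return {
    get length() { return data.size; },
    key(index) {
      const keys = Array.from(data.keys());
      return index < keys.length ? keys[index] : null;
    },
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(String(key), String(value)); },
  };
}

// Collect every key in the store, optionally only those with a given prefix.
// Real browsers do not promise any particular key order.
function listKeys(storage, prefix = "") {
  const keys = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key.indexOf(prefix) === 0) keys.push(key);
  }
  return keys;
}

const storage = createMemoryStorage();
storage.setItem("app.theme", "dark");
storage.setItem("app.lang", "en");
storage.setItem("other.flag", "1");
const appKeys = listKeys(storage, "app.");
console.log(appKeys);
```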
I think the key/value store is a very robust backend for all kinds of data tasks (check out lawnchair – Brian LeRoux built a nice document store on top of it). And, it’s implemented in all major browsers today, even IE. Considering multi-owner sites: you could go and namespace your data, like the approach Dojo takes. Or, you could encrypt your data, which makes you also safe against DNS spoofing attacks. All this, plus the very easy to use API makes localStorage a better candidate (in my opinion) for most client-side persistent data storage demands than webKitSqlite, WebSimpleDB or the likes.
Oh, and I strongly agree with Pete – client-side persistent storage is of major importance for desktop browsers, too!
May 11th, 2010 at 04:30 PM
I think simple key/value storage is pretty much covered by localStorage. The app I’m building had cookie-itis. We replaced a few dozen cookies with one localStorage implementation (using PersistJS with userData fallback for IE6/7). Works really well. Moving forward, the feature ideas that I have involving client-side storage all require transactional logic and queries. I understand that it’s a pain to standardize on a SQL API, but sooner or later, somebody’s going to have to bite that bullet.
August 8th, 2010 at 11:10 PM
Hi all, So if I read between the lines correctly, I gather that PersistJS is one of the better ‘fallback to something’ implementations of client storage. I had messed around with the Dojo Storage stuff for a while, but sort of gave up on it. Every time I upgraded my browsers, things no longer seemed to work. And that’s not to say that they worked perfectly to begin with, for every browser.
I don’t blame the Dojo code, since it would seem to be the browsers changing their implementations, and not poorly coded dojo modules. If anyone knows of other PersistJS-like code frameworks for client side storage, please post. Seems like history repeats itself when browsers don’t play nice and a good idea can’t get the needed momentum to become a de facto standard.
Great article. Can’t thank you enough for the ‘on point’ discussion.
August 11th, 2010 at 06:10 AM
“Persevere” looks like the most fully-featured offline-accessible, locally-storedable, server-synchable, real-time-comet-updateable, RESTful, JSON database solution so far.
It’s even got JSONPath/JSONQuery support, so you don’t have to manually comb through stuff. Plus, JSONSchema, for an Object model of your data.
SO COOL.