Service Workers and Cache Storage in Thimble
tldr; During the spring and summer, I've been working on rewriting Thimble to use Service Workers and CacheStorage. Now that we've shipped it, I wanted to talk about how it works and what I learned.
On Twitter a few weeks back, I was reminiscing with Dave Herman and Brendan about web standards we worked on in Firefox a decade ago. Dave made the point that the 2017 web is still exciting and accelerating, especially with things like Service Workers. I couldn't agree more.
I've been waiting to be able to do what Service Workers provide forever. For years we heard about all the problems that they would solve, and I was a believer from day one. Imagine having a web server running in your browser, and able to serve any content type? Sign me up!
Now that they're here (Chrome and Firefox), most of what I see written relates to using Service Workers in order to build Progressive Web Apps (PWA). PWAs are great, but there are other things you can do with Service Workers. For example, you can use them to serve user generated content on the fly without needing a server.
Four years ago I started working on this problem. The goal was to be able to allow users to create arbitrary web sites/apps in a full editor, and then run them without needing a server. In a classroom setting, the network is often non-functional and I wanted to make it possible to just use a browser to do everything. I wanted a static web page to provide the following:
- filesystem (we wrote a browser version of node's fs module called Filer)
- full editor (we forked Adobe's Brackets editor and made it run in the browser)
- "web server" to host static content from the filesystem
- "web browser" to run the content (an iframe with a postMessage API)
My first attempt was done before Service Workers were readily available, and didn't use them. Also, because I wanted to support as many browsers as possible, I needed a "progressive" solution for providing these features: IE11 will never have Service Workers, for example, while Safari probably will (see how optimistic I am?).
This initial version made heavy use of Blob, URL.createObjectURL(), and DOMParser. The idea was to leverage the fact that all filesystem operations are asynchronous, which provides an opportunity to also create and cache Blob URLs for files, mapped to filenames. When a user tries to browse to index.html, we instead serve the cached Blob URL for that file, and the browser does the rest. For content within an HTML or CSS file, we simply rewrote all the relative links to use Blob URLs, such that image/background.png becomes blob:https%3A//mozillathimblelivepreview.net/264a3524-5316-47e5-a835-451e78247678.
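To make the idea concrete, here is a minimal sketch of that kind of Blob URL cache for CSS. It is not Thimble's actual code: the names blobUrlCache, cacheBlobUrl, and rewriteCssUrls are made up for illustration, and it uses a simple regular expression for url(...) references rather than the DOMParser-based rewriting used for HTML.

```javascript
// Hypothetical sketch of the Blob URL approach: keep one Blob URL per file,
// then rewrite relative url(...) references in CSS to point at those URLs.
// All names here are illustrative assumptions, not Thimble's API.
const blobUrlCache = new Map(); // filename -> Blob URL

function cacheBlobUrl(filename, data, type) {
  // Revoke any previous URL for this file so the old Blob can be released.
  const previous = blobUrlCache.get(filename);
  if (previous) {
    URL.revokeObjectURL(previous);
  }
  const url = URL.createObjectURL(new Blob([data], { type: type }));
  blobUrlCache.set(filename, url);
  return url;
}

function rewriteCssUrls(css) {
  // Swap url(path) for the cached Blob URL when we have one for that path;
  // references to files we have not cached are left untouched.
  return css.replace(/url\(\s*(['"]?)([^'")]+)\1\s*\)/g, function (match, quote, path) {
    const blobUrl = blobUrlCache.get(path.trim());
    return blobUrl ? "url(" + quote + blobUrl + quote + ")" : match;
  });
}
```

A rewrite like this only sees paths that appear literally in the file's text. A path that a script assembles at runtime never passes through it.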
It worked great for many years as part of Thimble, allowing students and teachers to learn web programming without having to do any installation or setup.
However, I was never satisfied with this approach, especially since I knew Service Workers could solve one of our largest problems: we couldn't reliably rewrite filenames to Blob URLs in dynamic JavaScript. Over and over again users would file bugs that would say, "I tried to use JS to load an image, and it didn't work!" Not being able to fully support JS at the network level was no good, and it frustrated me to not fully support all of the web.
So I slowly worked away at adding a secondary URL cache implementation based on Service Workers. My idea was to use the same pathways that we already had for writing Blob URLs for file content, but instead of rewriting links, we'd just let a Service Worker serve the content from our filesystem instead of the network.
There was one problem for which I couldn't see an elegant solution. I needed a way for the editor to share the content with the Service Worker, since we don't always serve pages from disk (e.g., IndexedDB): if a user is editing a file, we serve an instrumented version instead, which does things like live highlighting, hot reloads, etc. It's possible for a user to have many Thimble tabs open at once, but there will only be one Service Worker. I didn't want the Service Worker to have to poll every open Thimble window before it found the right content.
The solution turned out to be really easy. Instead of somehow posting data between the DOM windows and the Service Worker, I could instead use storage shared between both in the form of CacheStorage. Whoever had the brilliant insight to make this available to both window and the Service Worker, I salute your forward thinking!
Using CacheStorage I was now able to create and cache URLs in the editor's DOM every time content changed on disk, or whenever the live version of a resource was updated. Then, the browser could request the resource and the Service Worker could fulfill the request using the cached URL, without needing to know where it came from in the first place. It's beautifully simple and works flawlessly.
For example, here's the React version of the TodoMVC app running unmodified in Thimble, which uses lots of dynamic JS:
And here's a great HTML5 Game Workshop from Mozilla Hacks, which uses JS, audio, and dynamic images/sprites:
At a high level, the code on the editor/filesystem side looks like this:
var data; // Typed Array of bytes (PNG, CSS file, etc)
var type; // some content type, maybe 'application/octet-stream'
var blob = new Blob([data], {type: type});
var response = new Response(blob, {
  status: 200,
  statusText: "Served from Offline Cache"
});
var headers = new Headers();
headers.append("Content-Type", type);
var url; // How you want to address this, any valid URL.
var request = new Request(url, {
  method: "GET",
  headers: headers
});
var cacheName; // Some unique name for this cache, maybe a URL
window.caches
  .open(cacheName)
  .then(function(cache) {
    // Return the promise so a failed put() reaches the catch() below
    return cache.put(request, response);
  })
  .catch(function(err) {
    console.error("Unable to cache " + url, err);
  });
Inside the Service Worker, I do something like this, looking for the requested URL in the cache, and serving it if possible:
self.addEventListener("fetch", function(event) {
  event.respondWith(
    // Look up the incoming request in every cache for this origin
    caches.match(event.request)
      .then(function(response) {
        // Either we have this file's response cached, or we should go to the network
        return response || fetch(event.request);
      })
      .catch(function(err) {
        // The cache lookup failed, so fall back to the network
        return fetch(event.request);
      })
  );
});
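If you want to exercise that lookup logic outside a Service Worker, one option is to pull it into a small function that takes the cache and the network as arguments. This is only a sketch: cacheFirst is a hypothetical name, cacheLike is anything with a match(request) method that returns a Promise (as CacheStorage and Cache do), and fetchFn stands in for fetch().

```javascript
// Hypothetical helper: answer a request from a cache-like object, falling
// back to the network on a miss or when the cache lookup itself fails.
function cacheFirst(cacheLike, request, fetchFn) {
  return cacheLike.match(request).then(
    function (response) {
      // A cache miss resolves with undefined rather than rejecting.
      return response || fetchFn(request);
    },
    function () {
      // The lookup failed, so go to the network instead.
      return fetchFn(request);
    }
  );
}

// Inside a Service Worker this could be wired up as:
// self.addEventListener("fetch", function (event) {
//   event.respondWith(cacheFirst(caches, event.request, fetch));
// });
```

Passing the error handler as the second argument to then(), rather than chaining catch(), means a network failure after a cache miss is reported once instead of triggering a second fetch.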
HTML, CSS, JavaScript all work now, and so does audio, video, JSON, and any new thing that gets added to the web next year. Everything just works, thanks to Service Workers.
We quietly enabled this new cache layer on Thimble a month ago, and it's been working well. We do feature detection on Service Workers and Cache Storage, and browsers that support them get the new cache implementation, while browsers that don't continue to use our old Blob URL based layer.
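A feature check along these lines might look like the following sketch. The function name supportsServiceWorkerCache is hypothetical, and taking the window object as a parameter is just a convenience so the check can be tried with stand-in objects.

```javascript
// Hypothetical feature check: true only when both the Service Worker and
// Cache Storage entry points are present on the given window-like object.
function supportsServiceWorkerCache(win) {
  win = win || globalThis;
  const nav = win.navigator;
  return Boolean(nav && "serviceWorker" in nav && "caches" in win);
}

// In a page: choose the cache layer based on the result.
// var useServiceWorkerCache = supportsServiceWorkerCache(window);
```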
After so many years of workarounds and complex hacks, I was amazed at how little code it took to get this working with the Service Worker and Cache Storage APIs. This was my first major app to use the new Service Worker stuff, and it took me a long time to adjust both my mental model for web dev (the Service Worker lifecycle is complex and sometimes surprising) and my debugging workflow. The final code isn't hard, but the adjustment in how you build and test it is fairly significant. It has also been somewhat frustrating for my teammates who don't know as much about Service Workers and who run into various annoying cache-related bugs.
My experience is that Service Workers do indeed live up to the hype I heard a decade ago, and I'd highly recommend you consider using them in your own apps. Don't get hung up on thinking of them only as a way to cache resources coming from the server as you load your pages. Instead, you should also consider them in cases where your user or client app generates content on the fly. You can serve this dynamic content alongside your hosted content, blurring the line between client and server. There are a lot of interesting problems you can solve this way, and all of it can be done with or without a network.
Also, if you're teaching or learning web dev, please give Thimble a try. We've been adding a lot of cool features beyond just this architectural change I've described, and we think it's one of the best online coding tools available right now.