Workbox
I want to read about Workbox because I finally want to add a service worker / push notifications to this site.
Notes
Workbox is a powerful library originally developed by members of Chrome's developer relations team to facilitate the creation of Progressive Web Apps and to improve the offline experience of web applications. It offers a suite of tools and strategies for efficiently caching and serving web assets, managing service workers, and handling offline scenarios. Workbox simplifies the implementation of common caching patterns and provides developers with a comprehensive toolkit to build robust, resilient web applications.
Intro to Service Workers
Overview
To view a running list of service workers, enter chrome://serviceworker-internals/ into your address bar.
Service workers are specialized JavaScript assets that act as proxies between web browsers and web servers. They aim to improve reliability by providing offline access, as well as boost page performance.
On the first visit to a page that installs a new service worker, the page provides only its baseline functionality while the service worker downloads. After a service worker is installed and activated, it controls the web page to offer improved reliability and speed.
An indispensable aspect of service worker technology is the Cache interface, which is a caching mechanism wholly separate from the HTTP cache. It can be accessed within the service worker scope and within the scope of the main thread.
Whereas the HTTP cache is influenced through caching directives specified in HTTP headers, the Cache interface is programmable through JavaScript.
Caching strategies make offline experiences possible, and can deliver better performance by side-stepping high latency revalidation checks the HTTP cache kicks off.
The interaction between a service worker and a Cache instance involves two distinct caching concepts: precaching and runtime caching.
- Precaching is the process of caching assets ahead of time, typically during a service worker's installation. With precaching, key static assets and materials needed for offline access can be downloaded and stored in a Cache instance. This kind of caching also improves page speed for subsequent pages that require precached assets.
- Runtime caching is when a caching strategy is applied to assets as they are requested from the network during runtime. This kind of caching is useful because it guarantees offline access to pages and assets the user has already visited.
Service workers are like web workers in that all the work they do occurs on their own threads. This means service worker tasks won't compete for attention with other tasks on the main thread.
A Service Worker's Life
The idea of control is crucial to understanding how service workers operate. A page described as being controlled by a service worker is a page that allows a service worker to intercept network requests on its behalf. The service worker is present and able to do work for the page within a given scope.
A service worker's scope is determined by its location on a web server. If a service worker runs on a page located at /subdir/index.html, and is located at /subdir/sw.js, the service worker's scope is /subdir/. Scope limits what pages the service worker controls. The default scope behavior of service workers can be overridden by setting the Service-Worker-Allowed response header, as well as passing a scope option to the register method. You usually want to load a service worker from the root directory of the web server so that its scope is as broad as possible.
A client is any open page whose URL falls within the scope of that service worker. Specifically, these are instances of a WindowClient.
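The scope rule can be sketched with the standard URL API. This is only an illustration: the origin https://example.com is made up, and the helper names defaultScope and isInScope are hypothetical, not part of any browser API.

```javascript
// Hypothetical helpers illustrating how a scope is derived and matched.

// The default scope is the directory that contains the service worker script.
function defaultScope(scriptURL) {
  return new URL('./', scriptURL).href;
}

// A page can only be controlled if its URL starts with the scope URL.
function isInScope(pageURL, scopeURL) {
  return new URL(pageURL).href.startsWith(scopeURL);
}

const scope = defaultScope('https://example.com/subdir/sw.js');
console.log(scope); // https://example.com/subdir/
console.log(isInScope('https://example.com/subdir/index.html', scope)); // true
console.log(isInScope('https://example.com/other/index.html', scope)); // false
```

Passing a scope option such as {scope: '/subdir/app/'} to register would narrow the scope. Widening it beyond the script's own directory additionally needs the Service-Worker-Allowed response header.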
Lifecycle of a new service worker
- Registration is the initial step of the service worker lifecycle:
// In index.html, for example:
// Don't register the service worker
// until the page has fully loaded
window.addEventListener('load', () => {
  // Is service worker available?
  if ('serviceWorker' in navigator) {
    navigator.serviceWorker.register('/sw.js').then(() => {
      console.log('Service worker registered!');
    }).catch((error) => {
      console.warn('Error registering service worker:');
      console.warn(error);
    });
  }
});
- Because the user's first visit to a website occurs without a registered service worker, wait until the page is fully loaded before registering one.
- When the page is fully loaded and service workers are supported, register /sw.js.
- Service workers are only available over HTTPS and localhost.
- If a service worker's contents contain syntax errors, registration fails and the service worker is discarded.
- When registration begins, the service worker state is set to 'installing'.
- A service worker fires its install event after registration. install is only called once per service worker, and won't fire again until it's updated. A callback for the install event can be registered in the worker's scope with addEventListener:
// /sw.js
self.addEventListener('install', (event) => {
  const cacheKey = 'MyFancyCacheName_v1';
  event.waitUntil(caches.open(cacheKey).then((cache) => {
    // Add all the assets in the array to the 'MyFancyCacheName_v1'
    // `Cache` instance for later use.
    return cache.addAll([
      '/css/global.bc7b80b7.css',
      '/css/home.fe5d0b23.css',
      '/js/home.d3cc4ba4.js',
      '/js/jquery.43ca4933.js'
    ]);
  }));
});
- This creates a new Cache instance and precaches assets. event.waitUntil accepts a promise, and waits until that promise has been resolved.
- Create a new Cache instance named 'MyFancyCacheName_v1'.
- After the cache is created, an array of asset URLs is precached using its asynchronous addAll method.
Installation fails if the promise(s) passed to event.waitUntil are rejected. If this happens, the service worker is discarded. If the promises resolve, installation succeeds, the service worker's state changes to 'installed', and the worker then activates.
- If registration and installation succeed, the service worker activates, and its state becomes 'activating'. Work can be done during activation in the service worker's activate event. A typical task in this event is to prune old caches, but for a brand new service worker, this isn't relevant.
- Once activation finishes, the service worker's state becomes 'activated'.
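The state changes described in this lifecycle can be watched from page code through the worker's statechange event. A minimal sketch, assuming a hypothetical helper named trackLifecycle and an optional log callback that are not part of any API:

```javascript
// Records every lifecycle state a service worker passes through.
// `worker` is a ServiceWorker object (for example `registration.installing`).
// `log` is a hypothetical callback that defaults to console.log.
function trackLifecycle(worker, log = console.log) {
  // Typically 'installing' right after registration.
  log(worker.state);
  worker.addEventListener('statechange', () => log(worker.state));
}

// Usage sketch in page code:
// navigator.serviceWorker.register('/sw.js').then((registration) => {
//   if (registration.installing) {
//     trackLifecycle(registration.installing);
//   }
// });
```

For a brand new service worker this would be expected to log 'installing', 'installed', 'activating', and 'activated' in that order.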
Handling Service Worker Updates
Once the first service worker is deployed, it'll likely need to be updated later. Browsers will check for updates to a service worker when:
- The user navigates to a page within the service worker's scope.
- navigator.serviceWorker.register() is called with a URL different from the currently installed service worker (but don't change a service worker's URL!).
- navigator.serviceWorker.register() is called with the same URL as the installed service worker, but with a different scope. Again, avoid this by keeping the scope at the root of the origin if possible.
- Events such as 'push' or 'sync' have been triggered within the last 24 hours.
Assuming a service worker's URL or scope is unchanged, a currently installed service worker only updates to a new version if its contents have changed. Browsers detect changes in a couple of ways:
- Any byte-for-byte changes to scripts requested by importScripts, if applicable.
- Any changes in the service worker's top-level code, which affects the fingerprint the browser has generated of it.
The browser does a lot of heavy lifting here. To ensure the browser has all it needs to reliably detect changes to a service worker's contents, don't tell the HTTP cache to hold onto it, and don't change its file name. The browser automatically performs update checks when there's a navigation to a new page within a service worker's scope.
A manual update may be needed, especially for single page applications. In such situations, a manual update can be triggered on the main thread:
navigator.serviceWorker.ready.then((registration) => {
  registration.update();
});
One thing to note is that an updated service worker gets installed alongside the previous one. This means the old service worker is still in control of any open pages, and following installation, the new one enters a waiting state until it's activated. By default, a new service worker will activate when no clients are being controlled by the old one. This occurs when all tabs for the relevant website are closed.
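Page code can notice when an updated service worker is sitting in that waiting state, for example to show a "new version available" prompt. A rough sketch: watchForWaitingWorker and the onWaiting callback are hypothetical names, while updatefound, statechange, and the installing, waiting, and active properties are standard ServiceWorkerRegistration features.

```javascript
// Calls `onWaiting(worker)` when an updated service worker has installed
// and is waiting to take over from the active one.
// `registration` is a ServiceWorkerRegistration; `onWaiting` is a
// hypothetical callback supplied by the page.
function watchForWaitingWorker(registration, onWaiting) {
  // An update may already be waiting when the page loads.
  if (registration.waiting && registration.active) {
    onWaiting(registration.waiting);
  }
  registration.addEventListener('updatefound', () => {
    const worker = registration.installing;
    if (!worker) {
      return;
    }
    worker.addEventListener('statechange', () => {
      // An active worker already exists only when this is an update,
      // not a first install.
      if (worker.state === 'installed' && registration.active) {
        onWaiting(worker);
      }
    });
  });
}

// Usage sketch in page code:
// navigator.serviceWorker.ready.then((registration) => {
//   watchForWaitingWorker(registration, () => console.log('Update ready'));
// });
```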
- When a service worker is installed and the waiting phase ends, it activates, and the old service worker is discarded. A common task to perform in an updated service worker's activate event is to prune old caches. Remove old caches by getting the keys for all open Cache instances with caches.keys and deleting the caches that aren't in a defined allow list with caches.delete:
self.addEventListener('activate', (event) => {
  // Specify allowed cache keys
  const cacheAllowList = ['MyFancyCacheName_v2'];
  // Get all the currently active `Cache` instances.
  event.waitUntil(caches.keys().then((keys) => {
    // Delete all caches that aren't in the allow list:
    return Promise.all(keys.map((key) => {
      if (!cacheAllowList.includes(key)) {
        return caches.delete(key);
      }
    }));
  }));
});
Old caches don't tidy themselves. We need to do that ourselves or risk exceeding storage quotas.
Caching Strategies
To use service workers effectively, it's necessary to adopt one or more caching strategies, which requires a bit of familiarity with the Cache interface.
A caching strategy is an interaction between a service worker's fetch event and the Cache interface.
- The Cache interface is a caching mechanism entirely separate from the HTTP cache.
- Whatever Cache-Control configuration you use to influence the HTTP cache has no influence on what assets get stored in the Cache interface.
The Cache interface is a high-level cache driven by a JavaScript API. This offers more flexibility than when using relatively simplistic HTTP key-value pairs, and is one half of what makes caching strategies possible.
Some important API methods related to service worker caches are:
- CacheStorage.open to create a new Cache instance.
- Cache.add and Cache.put to store network responses in a service worker cache.
- Cache.match to locate a cached response in a Cache instance.
- Cache.delete to remove a cached response from a Cache instance.
// Establish a cache name
const cacheName = 'MyFancyCacheName_v1';
self.addEventListener('install', (event) => {
  event.waitUntil(caches.open(cacheName));
});
self.addEventListener('fetch', (event) => {
  // Is this a request for an image?
  if (event.request.destination === 'image') {
    // Open the cache
    event.respondWith(caches.open(cacheName).then((cache) => {
      // Respond with the image from the cache or from the network
      return cache.match(event.request).then((cachedResponse) => {
        return cachedResponse || fetch(event.request.url).then((fetchedResponse) => {
          // Add the network response to the cache for future visits.
          // Note: we need to make a copy of the response to save it in
          // the cache and use the original as the request response.
          cache.put(event.request, fetchedResponse.clone());
          // Return the network response
          return fetchedResponse;
        });
      });
    }));
  } else {
    return;
  }
});
The above code does the following:
- Inspect the request's destination property to see if this is an image request.
- If the image is in the service worker cache, serve it from there. If not, fetch the image from the network, store the response in the cache, and return the network response.
- All other requests are passed through the service worker with no interaction with the cache.
Cache Only
- When the service worker is in control of the page, matching requests will only ever go to the cache. This means that any cached assets will need to be precached in order for the pattern to work, and that those assets will never be updated in the cache until the service worker is updated.
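A Cache Only handler could look roughly like the sketch below. The cacheOnly helper is a hypothetical name, and the cache storage object is passed in as a parameter (rather than using the global caches directly) only so the function can be exercised outside a service worker.

```javascript
const cacheName = 'MyFancyCacheName_v1';

// Cache only: look the request up in the cache and never touch the network.
// `cacheStorage` is expected to behave like the global `caches` object.
// Resolves to undefined when the asset was never precached.
function cacheOnly(request, cacheStorage) {
  return cacheStorage.open(cacheName).then((cache) => cache.match(request));
}

// Usage sketch in sw.js:
// self.addEventListener('fetch', (event) => {
//   if (event.request.destination === 'image') {
//     event.respondWith(cacheOnly(event.request, caches));
//   }
// });
```

If nothing matches, event.respondWith receives undefined and the request fails with a network error, which is why this pattern depends on precaching.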
Network Only
The opposite of Cache Only is Network Only, where a request is passed through a service worker to the network without any interaction with the service worker cache. This is a good strategy for ensuring content freshness, but the tradeoff is that it will never work when the user is offline.
Cache first, falling back to network
For matching requests, the process goes like this:
- The request hits the cache. If the asset is in the cache, serve it from there.
- If the request is not in the cache, go to the network.
- Once the network request finishes, add the response to the cache, then return the response from the network.
This is a great strategy to apply to all static assets (such as CSS, JavaScript, images, and fonts), especially hash-versioned ones.
// Establish a cache name
const cacheName = 'MyFancyCacheName_v1';
self.addEventListener('fetch', (event) => {
  // Check if this is a request for an image
  if (event.request.destination === 'image') {
    event.respondWith(caches.open(cacheName).then((cache) => {
      // Go to the cache first
      return cache.match(event.request.url).then((cachedResponse) => {
        // Return a cached response if we have one
        if (cachedResponse) {
          return cachedResponse;
        }
        // Otherwise, hit the network
        return fetch(event.request).then((fetchedResponse) => {
          // Add the network response to the cache for later visits
          cache.put(event.request, fetchedResponse.clone());
          // Return the network response
          return fetchedResponse;
        });
      });
    }));
  } else {
    return;
  }
});
Network first, falling back to cache
- You go to the network first for a request, and place the response in a cache.
- If you're offline at a later point, you fall back to the latest version of that response in the cache.
This strategy is great for HTML or API requests when, while online, you want the most recent version of a resource, yet you want to give offline access to the most recent available version.
// Establish a cache name
const cacheName = 'MyFancyCacheName_v1';
self.addEventListener('fetch', (event) => {
  // Check if this is a navigation request
  if (event.request.mode === 'navigate') {
    // Open the cache
    event.respondWith(caches.open(cacheName).then((cache) => {
      // Go to the network first
      return fetch(event.request.url).then((fetchedResponse) => {
        cache.put(event.request, fetchedResponse.clone());
        return fetchedResponse;
      }).catch(() => {
        // If the network is unavailable, fall back to the cached response
        return cache.match(event.request.url);
      });
    }));
  } else {
    return;
  }
});
Stale-while Revalidate
- On the first request for an asset, fetch it from the network, place it in the cache, and return the network response.
- On subsequent requests, serve the asset from the cache first, then in the background, request it from the network and update the asset's cache entry.
- For requests after that, you'll receive the latest version fetched from the network that was placed in the cache in the prior step.
This is an excellent strategy for things that are sort of important to keep up to date, but are not crucial. Think of stuff like avatars for a social media site.
// Establish a cache name
const cacheName = 'MyFancyCacheName_v1';
self.addEventListener('fetch', (event) => {
  if (event.request.destination === 'image') {
    event.respondWith(caches.open(cacheName).then((cache) => {
      return cache.match(event.request).then((cachedResponse) => {
        const fetchedResponse = fetch(event.request).then((networkResponse) => {
          cache.put(event.request, networkResponse.clone());
          return networkResponse;
        });
        return cachedResponse || fetchedResponse;
      });
    }));
  } else {
    return;
  }
});
Workbox Overview
Good abstractions make APIs easier to use. That's where Workbox comes in. Workbox is a set of modules that simplify common service worker routing and caching. Each module available addresses a specific aspect of service worker development. Workbox aims to make using service workers as easy as possible, while allowing the flexibility to accommodate complex application requirements where needed.
In the simplest cases, workbox-build offers a couple of methods that can generate a service worker that precaches specified assets. The generateSW method does most of the work out of the box, while the injectManifest method offers more control when necessary.
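As a rough idea of what a generateSW setup might look like, here is an options object using a few of the documented workbox-build options. The dist directory, the glob pattern, and the 1 MiB size cap are assumptions made up for this sketch, not recommendations from the source.

```javascript
// Hypothetical generateSW options for a build that outputs to ./dist.
const generateSWOptions = {
  // Where to look for files to precache (assumed build output folder).
  globDirectory: 'dist',
  // Which files to put in the precache manifest.
  globPatterns: ['**/*.{html,css,js}'],
  // Skip anything larger than 1 MiB and leave it to runtime caching.
  maximumFileSizeToCacheInBytes: 1024 * 1024,
  // Where the generated service worker is written.
  swDest: 'dist/sw.js',
};

// Usage sketch in a Node build script (requires the workbox-build package):
// const {generateSW} = require('workbox-build');
// generateSW(generateSWOptions).then(({count, size}) => {
//   console.log(`Precaching ${count} files, ${size} bytes in total.`);
// });
```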
For more advanced use cases, other modules can help. A few such modules are:
- workbox-routing for request matching
- workbox-strategies for caching strategies
- workbox-precaching for precaching
- workbox-expiration for managing caches
- workbox-window for registering a service worker and handling updates in the window context
What You Need to Know
Precaching may cause problems if you apply it to too many assets, or if the service worker is registered before the page has a chance to finish loading critical assets. Since the default behavior of workbox-webpack-plugin is to instruct the service worker to automatically precache generated assets, this can be problematic in a way that's easy to miss. When a service worker precaches assets during installation, one or more network requests kick off simultaneously. This has the potential to be problematic for the user experience if not timed right.
If a service worker precaches anything, then the time at which it's registered matters. Service workers are often registered using inline <script> elements. This means HTML parsers may discover service worker registration code before the page's critical assets have loaded. This is a problem. A service worker should ideally be performance-neutral in the worst of cases, not make performance worse.
Do users a favor and register a service worker when the page's load event fires.
Precaching involves dispatching network requests. If a manifest of assets to precache isn't carefully curated, the result may be some amount of waste. When precaching, consider cutting out especially large assets and relying on runtime caching to capture them rather than making costly assumptions.
Caching strategies that consult the cache first, or only consult the cache, are great for both offline access and performance. However, they can serve stale assets in some cases.
This can be a problem when static assets have names that don't include a content-based hash. The solution is to use a strategy that consults the network for updates, like network-first or stale-while-revalidate. Regardless, strongly consider versioning static assets, whether by a hash in the asset name, or in the query string. This will avoid stale assets in service workers that use cache-first runtime strategies for static assets.
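One way to act on this is to pick the strategy based on whether the file name carries a hash. The sketch below assumes 8-character hex hashes like the ones in the earlier precache list (for example /css/global.bc7b80b7.css). The helper names and the hash length are assumptions for illustration only.

```javascript
// Matches a dot-delimited 8-character hex hash in a file name,
// such as /css/global.bc7b80b7.css. The hash length is an assumption.
const HASHED_NAME = /\.[0-9a-f]{8}\.[a-z0-9]+$/i;

function isHashVersioned(url) {
  // The base URL is a placeholder so relative paths can be parsed.
  return HASHED_NAME.test(new URL(url, 'https://example.com').pathname);
}

// Hash-versioned assets never change under the same name, so cache-first
// is safe. Anything else gets a strategy that still checks the network.
function pickStrategy(url) {
  return isHashVersioned(url) ? 'cache-first' : 'stale-while-revalidate';
}

console.log(pickStrategy('/css/global.bc7b80b7.css')); // cache-first
console.log(pickStrategy('/css/global.css')); // stale-while-revalidate
```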
You can achieve finer control of caches with the workbox-expiration module.
Sometimes a buggy service worker gets deployed, and then there are problems. All it takes to deal with a buggy service worker is to deploy a basic no-op service worker that installs and activates immediately without a fetch event handler.
// sw.js
self.addEventListener('install', () => {
  // Skip over the "waiting" lifecycle state, to ensure that our
  // new service worker is activated immediately, even if there's
  // another tab open controlled by our older service worker code.
  self.skipWaiting();
});
self.addEventListener('activate', () => {
  // Optional: Get a list of all the current open windows/tabs under
  // our service worker's control, and force them to reload.
  // This can "unbreak" any open windows/tabs as soon as the new
  // service worker activates, rather than users having to manually reload.
  self.clients.matchAll({
    type: 'window'
  }).then(windowClients => {
    windowClients.forEach((windowClient) => {
      windowClient.navigate(windowClient.url);
    });
  });
});
The service worker installs and activates immediately because it calls self.skipWaiting() in the install event.
It's very important that a no-op service worker contains no fetch event handler. Without one, every request passes straight through to the network as if no service worker were present.
By far the most effective way to test a service worker is to rely on private browsing windows, such as incognito windows in Chrome, or Firefox's Private Browsing feature. Every time you open a private browsing window, you start fresh. There are no active service workers, and no open Cache instances.
The best time to use navigation preload is when a website can't precache HTML. Think of websites where markup responses are dynamic and vary with stuff like authentication state. Navigation requests for these may use a network-first (or even a network-only) strategy, and that's where navigation preload can make a big difference.
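Navigation preload lets the browser start the navigation request while the service worker is still starting up. A minimal sketch of enabling it and consuming the preloaded response follows. The two helper names are hypothetical, and fetchFn is injectable only so the logic can be tried outside a service worker. navigationPreload.enable() and event.preloadResponse are the standard APIs.

```javascript
// Turns navigation preload on when the browser supports it.
// `registration` is a ServiceWorkerRegistration.
async function enableNavigationPreload(registration) {
  if (registration.navigationPreload) {
    await registration.navigationPreload.enable();
    return true;
  }
  return false;
}

// Uses the preloaded response when there is one, otherwise goes to the
// network. `fetchFn` defaults to the global fetch.
async function respondWithPreload(event, fetchFn = fetch) {
  const preloaded = await event.preloadResponse;
  return preloaded || fetchFn(event.request);
}

// Usage sketch in sw.js:
// self.addEventListener('activate', (event) => {
//   event.waitUntil(enableNavigationPreload(self.registration));
// });
// self.addEventListener('fetch', (event) => {
//   if (event.request.mode === 'navigate') {
//     event.respondWith(respondWithPreload(event));
//   }
// });
```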
Do: precache critical static assets
Think of critical assets as those utterly necessary to provide a user experience:
- Global stylesheets
- JavaScript files that provide global functionality
- Application shell HTML, if that applies to your architecture
Do: precache an offline fallback for multipage websites.
Don't: Precache responsive images or favicons.
Don't: Precache polyfills
All browsers impose an upper limit on the amount of storage that your web app's origin is allowed to use. You can configure Workbox to automatically clean up the data it caches at runtime in order to avoid running into storage quota limitations that may impact the caching efficiency and reliability of your website.
When setting up a route and runtime caching strategy, you can add in an instance of ExpirationPlugin from workbox-expiration configured with settings that make the most sense for the type of assets you're caching.
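As an illustration, settings for an image cache might look like the object below. The numbers and the 'images' cache name are assumptions picked for the sketch. maxEntries, maxAgeSeconds, and purgeOnQuotaError are ExpirationPlugin options.

```javascript
const ONE_DAY_IN_SECONDS = 24 * 60 * 60;

// Hypothetical expiration settings for runtime-cached images.
const imageExpirationOptions = {
  // Keep at most 60 images; the least recently used are removed first.
  maxEntries: 60,
  // Drop entries that are older than 30 days.
  maxAgeSeconds: 30 * ONE_DAY_IN_SECONDS,
  // Allow this cache to be deleted if the origin exceeds its quota.
  purgeOnQuotaError: true,
};

// Usage sketch in sw.js (requires the workbox-* packages and a bundler):
// import {registerRoute} from 'workbox-routing';
// import {CacheFirst} from 'workbox-strategies';
// import {ExpirationPlugin} from 'workbox-expiration';
//
// registerRoute(
//   ({request}) => request.destination === 'image',
//   new CacheFirst({
//     cacheName: 'images',
//     plugins: [new ExpirationPlugin(imageExpirationOptions)],
//   })
// );
```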
Use workbox-window
The goals of workbox-window are:
- To simplify service worker registration and updates by helping developers identify critical moments of the service worker lifecycle, making it easier to respond in those moments.
- To prevent developers from making common mistakes, such as registering a service worker in the wrong scope.
- To simplify messaging between the window and the service worker scope.
Caching Resources during Runtime
Force Network Timeout
There are times when you have a network connection, but that connection is either too slow or only nominally online. In these cases, falling back to your last cached response for an asset or page after a certain period of time is preferable, and this is another problem Workbox can help with.
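In Workbox this is what the networkTimeoutSeconds option of the NetworkFirst strategy is for. The idea itself can be sketched by hand with Promise.race. The function below is not Workbox's implementation, just an illustration: networkFirstWithTimeout is a hypothetical name, and the cache and fetch function are passed in so it can run outside a service worker.

```javascript
// Network first with a timeout: if the network takes longer than
// `timeoutMs`, fall back to the cached response when one exists.
// `cache` is a Cache-like object and `fetchFn` a fetch-like function.
async function networkFirstWithTimeout(request, cache, fetchFn, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(undefined), timeoutMs);
  });
  const network = fetchFn(request).then((response) => {
    // Keep the cache fresh for the next slow or offline visit.
    cache.put(request, response.clone());
    return response;
  });
  // Avoid an unhandled rejection if the network fails after the
  // request was already answered from the cache.
  network.catch(() => {});
  try {
    const winner = await Promise.race([network, timeout]);
    if (winner) {
      return winner;
    }
  } catch (error) {
    // The network failed before the timeout; fall through to the cache.
  } finally {
    clearTimeout(timer);
  }
  const cached = await cache.match(request);
  // With nothing cached, keep waiting on the network as a last resort.
  return cached || network;
}

// Usage sketch in sw.js:
// event.respondWith(caches.open('pages').then((cache) =>
//   networkFirstWithTimeout(event.request, cache, fetch, 3000)));
```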
Access Caches from the Window
You can access Cache instances in both the service worker scope and in your web app's traditional code, running in the window. This makes it possible for user interactions to work directly with a service worker cache, or for the user interface to update based on cache state.
One potential use case is to offer a "save for offline" feature for pages the user may want to read later, when they know they may be offline.
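A "save for offline" button could be backed by something like the following. The function names and the 'saved-for-offline' cache name are hypothetical, and the cacheStorage parameter exists only so the sketch can be tried with a stand-in for the window's caches object.

```javascript
const SAVED_CACHE = 'saved-for-offline';

// Fetches `url` and stores the response so it can be read offline later.
// `cacheStorage` defaults to the window's `caches` object.
async function saveForOffline(url, cacheStorage = caches) {
  const cache = await cacheStorage.open(SAVED_CACHE);
  // Cache.add fetches the URL and rejects if the response is not OK.
  await cache.add(url);
}

// Reports whether `url` has already been saved, for example to switch
// a button label from "Save for offline" to "Saved".
async function isSavedForOffline(url, cacheStorage = caches) {
  const cache = await cacheStorage.open(SAVED_CACHE);
  return (await cache.match(url)) !== undefined;
}

// Usage sketch in page code (assumes a button with id "save"):
// document.querySelector('#save').addEventListener('click', () => {
//   saveForOffline(location.href).then(() => console.log('Saved'));
// });
```

For the saved page to actually load offline, the service worker's fetch handler would still need to look in this cache.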