Browser Cache and Edge Cache Explained
08 Feb 2022
You updated it? Well, yes, but not really.
Does the following sound familiar?
"The new feature is live."
"I can't see it."
"Have you cleared your browser cache?"
"Yes. It's still not visible."
"Hmmm…"
This can happen when there are multiple layers of caching in play
and one of them has not expired. If you've used something like
Cloudflare CDN, you've likely run into this.
What is Browser Caching?
Let's say your website has a logo in the header and it appears on
all pages. When users land on your website and start browsing,
they'll be downloading the same logo on every page load. This
inefficiency increases bandwidth costs and load times.
When sending the logo to the user, your server can also provide
the Cache-Control response header with an expiration value like
max-age=604800. This will tell the browser to store the logo
locally, on the user's hard drive, for 604800 seconds, which is 7
days. Until that time passes, whenever your website references
the logo through the same URL, the user's browser will use its
cached version. After that, the browser will request the logo
from your server once again and cache it for another 7 days.
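As a rough illustration of the arithmetic and the freshness check
described above, here is a minimal Python sketch. The function
names are hypothetical, and real browsers follow more detailed
rules than this simplified comparison.

```python
from typing import Optional

SECONDS_PER_DAY = 24 * 60 * 60


def cache_control_for_days(days: int) -> str:
    """Build a Cache-Control value that expires after `days` days."""
    return f"max-age={days * SECONDS_PER_DAY}"


def parse_max_age(header_value: str) -> Optional[int]:
    """Extract the max-age directive (in seconds), if one is present."""
    for directive in header_value.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(stored_at: float, max_age: int, now: float) -> bool:
    """Simplified check: the cached copy is usable until max_age elapses."""
    return now - stored_at < max_age
```

Here `cache_control_for_days(7)` produces the `max-age=604800`
value from the example, and `is_fresh` is the kind of check a
cache could run before deciding to contact the server again.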
What is an Edge?
If your audience is mainly European, it would probably make sense
to put your server in London, for example. But if your website
starts receiving traffic from America, those requests would have
to cross an entire ocean, which can make them several times
slower, due to latency. With a CDN, you have a system of servers
spread throughout the world that proxy user traffic to your
server and solve this performance issue.
Among the hundreds of servers in a CDN, the one geographically
closest to a user is called an edge, and it is the one that the
user's browser communicates with directly.
What is Edge Caching?
While browser caching helps with many requests by the same user,
it doesn't solve the issue of many users making requests, because
each user has their own browser with their own cache. To fix
this, you need a shared cache.
With a CDN, all user requests are funneled through the closest
edge server. If that server caches the responses, it'll be able
to serve those users without having to bother your own server.
This means that it no longer matters if you have 1 user or 10000,
because once that first response for a resource is served and
cached, the load shifts from your server to the CDN, which can
serve the other 9999 by itself.
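The effect can be illustrated with a toy shared cache. This is
only a sketch: the `EdgeCache` class and `count_origin_hits`
helper are made up for illustration, TTLs are ignored entirely,
and a real CDN has many edges rather than one.

```python
from typing import Callable, Dict


class EdgeCache:
    """A toy shared cache sitting between users and the origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        # Only the first request for a URL reaches the origin.
        if url not in self._store:
            self._store[url] = self._fetch_from_origin(url)
        return self._store[url]


def count_origin_hits(user_count: int,
                      url: str = "https://example.com/logo.png") -> int:
    """Simulate `user_count` users requesting the same URL via one edge."""
    hits = 0

    def origin(requested_url: str) -> bytes:
        nonlocal hits
        hits += 1
        return b"logo bytes"

    edge = EdgeCache(origin)
    for _ in range(user_count):
        edge.get(url)
    return hits
```

In this simplified model, `count_origin_hits(10000)` returns 1,
the same as `count_origin_hits(1)`: the origin sees one request
regardless of how many users ask.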
Cloudflare caches only static assets by default, such as images,
fonts, scripts, etc. The HTML containing the page content is not
cached, because if you have an e-commerce store, for example,
each user will have their own shopping cart, and if you cache
that, the next user will see the previous one's items. On the
other hand, an image is expected to be the same for every user
and can safely be cached.
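A default policy like this could be sketched as a check on the
file extension in the URL. The extension set below is a small
illustrative assumption, not Cloudflare's actual list, and the
function name is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative subset only; a real CDN's default list is longer.
STATIC_EXTENSIONS = {".jpg", ".png", ".gif", ".svg", ".woff2", ".css", ".js"}


def cached_by_default(url: str) -> bool:
    """Return True if the URL path ends in a known static extension."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in STATIC_EXTENSIONS
```

Under these assumptions, an image URL such as
`https://example.com/about-us/image.jpg` would be cached, while a
page such as `https://example.com/cart` would always go to the
origin.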
Sites without shopping carts and other dynamic content, such as
blogs, can opt in to cache everything. This makes it possible for
an edge to serve entire pages all by itself, without having to
contact your server. This way, you can have a very modest server,
yet handle thousands of requests per second, because the load is
mainly on the CDN.
Cache Invalidation
The hard part of caching is picking a time to live (TTL), which
determines when an entry becomes stale and should be discarded.
Images and other assets can be cached for longer periods, because
even if you pick a TTL that is far too long, you can change the
URL of the resource and trigger a fresh request-response cycle:
https://example.com/about-us/image.jpg?version=2
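Bumping the version by hand gets tedious, so a small helper can
do it. This is a minimal sketch using Python's standard library;
the `with_version` name is hypothetical, and the `version`
parameter name simply follows the URL above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: int) -> str:
    """Return `url` with its `version` query parameter set to `version`."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous version marker.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "version"
    ]
    query.append(("version", str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because caches key entries by the full URL, the new query string
is treated as a different resource and gets fetched afresh.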
However, if you've chosen to cache the HTML content of
your pages as well, you'd have to get around that too,
since changing the image URL means changing the HTML. So
you'd have to change the page URL as well:
https://example.com/about-us?version=2
This is where things get tricky, because even if your web
framework can easily change all links to the updated page,
every other site on the internet that links to you would also
have to do it. That's impossible, so you have to pick a window of
time that you're comfortable with, during which users may see
outdated content.
If you have the following setup:
- Browser cache TTL: 30 minutes
- Edge cache TTL: 1 minute
…there are several possible outcomes:
- If a change happened over 30 minutes ago, both caches would
  have expired and it's guaranteed that the user will see the
  updated version.
- If a change happened over 1 minute ago, only the edge cache is
  guaranteed to have expired. If a user has seen the page within
  the last 30 minutes and it is cached by their browser, they can
  see the change by clearing their browser cache, opening an
  incognito window, or doing a hard reload.
- If a change happened less than 1 minute ago, both caches would
  still be considered fresh. Users will not be able to see the
  update, even if they clear their browser cache, because the
  closest edge server would still return the old version. The
  only solution here is for the website owner to prematurely
  force the cache to become stale by purging it.
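The three outcomes above can be written down as a small decision
function. This sketch only encodes the simplified model from the
list: the function name and return strings are made up, and it
assumes the browser TTL is at least as long as the edge TTL.

```python
def update_visibility(seconds_since_change: float,
                      browser_ttl: int = 30 * 60,
                      edge_ttl: int = 60) -> str:
    """Classify what a user may see, given the time since a change.

    Assumes browser_ttl >= edge_ttl, as in the setup above.
    """
    if seconds_since_change > browser_ttl:
        # Both caches have expired.
        return "visible to everyone"
    if seconds_since_change > edge_ttl:
        # The edge has expired, but a browser copy may still be fresh.
        return "visible after bypassing the browser cache"
    # Both caches may still be fresh; only a purge helps.
    return "requires purging the edge cache"
```

For example, a change made 10 minutes ago falls into the middle
case, while one made 30 seconds ago can only be surfaced by a
purge.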
Conclusion
Browser cache helps with subsequent requests by the same user and
makes your website feel snappy, because your assets don't have to
travel over the internet.
Edge cache helps with requests by a geographically close group of users and can greatly reduce server load while improving response times in distant countries.
Pick a lower TTL to reduce the chance of outdated content, or a higher TTL for better cache effectiveness, but be ready to change URLs, open incognito windows, and purge cache.