Building Firewatch Australia, Part 2 — Scaling on the Cheap

Mike Leonard
5 min read · Apr 26, 2020

Firewatch Australia is a free app released over the 2019–2020 Australian Bushfire season to help track bushfires. This is part 2 in a series of 2 posts on building the app. Check out the introduction or take a look at the previous post on Data Processing to learn more.

I wasn’t expecting many users when I launched the app, maybe just a few friends, but nonetheless I wanted to make sure it scaled well and didn’t break the bank in terms of infrastructure costs. Turns out this was a good play as the app had several thousand users and tens of thousands of requests just a few days after release.

In the previous post I documented the process for gathering the data using Google Cloud’s serverless Cloud Functions so it might not surprise you to know that I wanted to serve the data via a similar method. I used functions with HTTP triggers to return JSON which served as the API for the app. The functions themselves were pretty simple — they queried data from a Firestore collection and returned it.
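
As a rough illustration, here’s a minimal sketch of what one of these functions might look like (the collection name, function name and response shape are my assumptions, not the app’s actual code):

// Minimal sketch of an HTTP-triggered Cloud Function that serves fire data as JSON.
// The "incidents" collection name is an assumption for illustration.
import { Firestore } from "@google-cloud/firestore";
import type { Request, Response } from "express";

const db = new Firestore();

export async function incidents(req: Request, res: Response): Promise<void> {
  // Read every fire from Firestore and return the lot as JSON.
  const snapshot = await db.collection("incidents").get();
  const fires = snapshot.docs.map((doc) => ({ id: doc.id, ...doc.data() }));

  // A Cache-Control header tells downstream caches how long they can reuse the response.
  res.set("Cache-Control", "public, max-age=300");
  res.status(200).json(fires);
}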

The data in Firewatch Australia is fairly static. Any given fire only tends to be updated every few hours or so by NSW RFS, and all users are served the same data, so they see the same list of fires. As such it’s a great candidate for caching. The goal here is to serve as many requests as possible from a cache so that traffic isn’t constantly hitting the functions and the database, incurring both time and cost.

Cloudflare has a great caching service that is outrageously simple. There’s really very little that needs to be done to set it up:

  • First, make Cloudflare the name server for the domain.
  • Configure a CNAME record in Cloudflare’s DNS that points at the host traffic should be proxied to.
  • Requests now go to Cloudflare servers which act as a proxy.
  • If the request isn’t in the cache, Cloudflare will make a request to the API and cache the response.
  • If the request is in the cache, Cloudflare returns it without ever hitting the API (the sketch after this list shows one way to check this).
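
As an aside, Cloudflare reports whether a given response came from its cache in the cf-cache-status response header (HIT, MISS, EXPIRED and so on), so it’s easy to verify this behaviour. A quick sketch, purely for illustration:

// Check whether Cloudflare served a response from its edge cache.
async function checkCacheStatus(url: string): Promise<void> {
  const response = await fetch(url);
  console.log(url, "->", response.headers.get("cf-cache-status"));
}

checkCacheStatus("https://firewatchaus.com/incidents").catch(console.error);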

This works really well. The graph below is straight from the Cloudflare console and shows that the majority of requests are hitting the cache — roughly 92%.

Screenshot of the Cloudflare UI for a single day showing ~92% cache hit rate with a peak of 3100 requests per hour

Cloudflare has an extensive network with edge nodes located all over the world and, importantly for this app, across Australia. This means Cloudflare is able to serve the content quickly by serving it from the location nearest the user. This network, coupled with the fact the data is cached, results in a very significant performance increase. In the screenshot below you can see it’s 2–3 times faster when hitting the Cloudflare-backed API (firewatchaus.com) vs hitting the Cloud Function directly.

Speed test of curling the Cloud Function (first request) vs the cached Cloudflare content (second request). It saves about 3.6 seconds of a 5.2 second request.

Remember, all I’ve had to do here is sign up to Cloudflare and configure my nameservers. There’s absolutely no code required up to this point. It’s really quite impressive. If that wasn’t enough, it’s also totally free. Cloudflare has one of the most extensive free tiers of any product I’ve used — up to the point where I was sceptical — but their CEO outlines some really good reasons for this.

I know better than to suggest anything might actually scale infinitely, but certainly for my purposes the API is at a point where scale is unlikely to ever be an issue. The more users the app has the more cache hits there will be so it’s really at the mercy of Cloudflare’s scale which I’m sure is massive and easily handles far more traffic than my app.

If requests do make it to Google Cloud then it’s on a fully serverless stack. I would imagine the bottleneck would be Firestore, but interestingly the Firestore documentation doesn’t mention a limit on reads; the write limit is 10,000 per second, which I don’t think I’d ever exceed.

This isn’t a silver bullet either; the data in Firewatch Australia just happens to lend itself very nicely to this approach.

Cloud Functions with an HTTP trigger have a URL in the form:

https://<region>-<project>.cloudfunctions.net/<function-name>

Because this URL is unique and contains the details needed to locate the function, it’s kind of important. As such, Cloud Functions don’t support a custom CNAME resolving to them; if you try it, requests will fail with a 404.

Luckily, Google have a solution for this with Firebase functions. This is a little bit of extra work that I wish I didn’t have to do but it’s pretty quick and easy.

  1. Download the Firebase CLI
  2. Login to the CLI with firebase login
  3. Use firebase init to initialise a Firebase project
  4. Edit firebase.json to configure your functions. See the example below.
  5. Connect your custom domain to Firebase

Below is a sketch of the kind of firebase.json configuration this needs (the rewrite paths and function names here are illustrative, not my exact config):
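
{
  "hosting": {
    "public": "public",
    "rewrites": [
      { "source": "/incidents", "function": "incidents" },
      { "source": "/incidents/**", "function": "incidents" }
    ]
  }
}

Once the custom domain is connected to Firebase, requests to it are rewritten to the functions, and the domain can then be proxied through Cloudflare as before.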

A (fairly big) downside to this is that I appear to be paying egress charges twice — once from the Cloud Functions to Firebase and again from Firebase to the internet (or Cloudflare in this case). I’m not sure if this is something specifically related to how I’ve set it up or if it’s the norm but I will investigate this in the future.

A Google Cloud billing report showing two lots of egress charges with suspiciously similar usage amounts from the two services traffic flows through.

There are only two hard things in Computer Science: cache invalidation and naming things.

- Phil Karlton

I don’t remember how I came up with the name “Firewatch Australia” but I don’t think I spent a great deal of time on it. Fortunately, cache invalidation for this app is also pretty easy. There’s a clear trigger for invalidation — when a fire is updated and stored in the database (see the previous post for more on this).

Cloudflare provides a simple API for invalidating certain URLs. This is nice because each time a fire changes it just needs to invalidate the endpoints for that fire and the full list of fires. All other fires can remain in the cache. The request is as follows — you need to supply the zone-id as well as some authentication headers.

POST https://api.cloudflare.com/client/v4/zones/:zone-id/purge_cache

{
  "files": [
    "https://firewatchaus.com/incidents",
    "https://firewatchaus.com/incidents/1233456"
  ]
}
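
For completeness, here’s a rough sketch of how that purge could be wired up whenever a fire changes (the environment variables and function name are assumptions for illustration; Cloudflare also accepts X-Auth-Email/X-Auth-Key headers instead of a bearer token):

// Purge the cached list endpoint and the updated fire's endpoint from Cloudflare.
// CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN are assumed to be set in the environment.
async function purgeFireCache(fireId: string): Promise<void> {
  const zoneId = process.env.CLOUDFLARE_ZONE_ID;
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.CLOUDFLARE_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        files: [
          "https://firewatchaus.com/incidents",
          `https://firewatchaus.com/incidents/${fireId}`,
        ],
      }),
    },
  );

  if (!response.ok) {
    throw new Error(`Cloudflare cache purge failed with status ${response.status}`);
  }
}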

The data in Firewatch Australia lent itself really nicely to a straightforward caching solution. Using Cloudflare for this was incredibly simple and easily achieved my original goal of keeping as much traffic as possible away from my Google Cloud infrastructure, and keeping the bills low.

I’m pretty happy to say that all this work to handle scale is completely redundant right now. The number of users of the app has dropped to the low tens due to the excellent work our Firies put in bringing the bushfires under control — there’s simply nothing interesting in the app to see at the moment!

Originally published at https://mikeleonard.io on April 26, 2020.
