Upscaling server capacity – part 1

At Time to Change we generally get around 1,000 visitors to the site a day. On peak days, like World Mental Health Day, this might double, but there’s enough flex in our server capacity to manage that without issue. Once a year though, on the first Thursday in February, we run Time to Talk Day – a massive single-day campaign to get the whole country talking about mental health. We’re lucky: it’s very successful, and bigger and better every year we do it. Just last week I was in a meeting with someone from an agency wanting us to pay to trend that day, and we were all quick to point out that we organically trend all day anyway, without paying a penny. Very, very lucky.

The downside though, if we can call it that, is that we’ve now reached a point where our site’s infrastructure is not equipped to deal with the spike in traffic, which last year reached 32,000 visitors and tripled the time it took pages to load for supporters wanting to take part in the day. It never went down, which was a relief, but it was painfully slow, despite our most ambitious estimates and load tests in the preceding months.

So it’s my job this year to make sure we’re ready – not just ready for sensible growth on last year, but really ready for the kind of numbers we feel arrogant even talking about, just in case. In previous jobs I’ve just called the hosting company to increase the server capacity, maybe even kick a few smaller charities off to give us maximum breathing room. But the issue at Time to Change is that we’re already on the biggest physical server, at its highest capacity, so the standard option isn’t an option for us. Ruling the various alternatives in and out has been pretty stressful as they all carry risks and, as ever, it’s a ridiculous time of year with millions of other urgent things happening at the same time.

So what are the options?

  1. Caching – installing this (I won’t say which for site security reasons) actually made more of a difference than I thought it would, increasing performance by 20%, but we need more than that to get through Time to Talk Day
  2. Move to a virtual server – this does solve the immediate issue as the capacity is far greater, but then we’re left with a problem of poor performance due to underload the rest of the year, so it’s not a good long-term solution
  3. Temporary, reversible migration to a virtual server – this is a possibility but a very risky one, as you never really know how your site’s going to perform in a new environment until it’s had some time to bed in and be tested to its limits in a live setting – none of which we really have time for
  4. A microsite – if I was a web developer I’d probably go for this, move the entire problem into an isolated container that guarantees the stability of the main site? Sounds perfect. Unfortunately I work in comms and microsites are a brand and UX sacrifice I can’t live with, so we’re not doing that
  5. Varnish – the Iron Man of caching systems: it turns all your dynamic content static (except the bits we need on the day) and improves performance by about 50%
  6. Match our current server with another and load balance on the day – like an overflow unit for when the traffic hits its peak
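
To picture what option 6 means in practice, here’s a minimal sketch of the simplest kind of load balancing, round-robin, where requests are handed to each server in turn. The visitor figures come from this post; the server names `main` and `overflow`, the `distribute` function and the choice of round-robin itself are assumptions for illustration only, not a description of how our hosting company will actually set it up.

```python
from collections import Counter
from itertools import cycle

# Figures quoted in the post.
TYPICAL_DAILY_VISITORS = 1_000
PEAK_DAILY_VISITORS = 32_000  # last year's Time to Talk Day


def distribute(requests: int, servers: list[str]) -> Counter:
    """Hand out requests to servers in strict rotation (round-robin).

    Hypothetical helper: a real load balancer may also weigh server
    health, open connections or session stickiness.
    """
    rotation = cycle(servers)
    return Counter(next(rotation) for _ in range(requests))


if __name__ == "__main__":
    # How much bigger the spike is than a normal day.
    print(PEAK_DAILY_VISITORS / TYPICAL_DAILY_VISITORS)

    # Share of last year's peak that each of two matched servers would see.
    print(distribute(PEAK_DAILY_VISITORS, ["main", "overflow"]))
```

With two matched servers, last year’s 32,000 visitors would split evenly at 16,000 each. That’s still many times a normal day, which is why the overflow server is only half of the plan and the caching layer is the other half.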

I’m going for 5 and 6. We’ve got just over a month to get it right, and in the meantime I’m auditing our 10,000 pages to make sure we’re only working as hard as we need to. In part 2 I’ll let you know how it went!

PS – another way to find out is to take part in the day, 4 Feb 2016, let’s get the nation talking about mental health.
