We’ve all probably seen it – you try to visit a website, perhaps during a sale or trying to buy tickets to a popular event, only to find the site is down. But what is actually happening “behind the scenes” when a website is overloaded?
First of all, a crash often occurs when the volume of visitors exceeds the capacity of the website or the services it relies on.
Think of it like a physical store – every building has a limit to how many people can enter at once. At a certain point it gets uncomfortable for the occupants, and then dangerous – eventually there is simply no more room for further people to come in. The same happens when a website server is overloaded.
It’s not just the overall number of visitors (the volume) that causes issues. It’s also how quickly new visitors are entering (the velocity). Think of a stadium with turnstiles for entry – there is a maximum throughput of fans coming into those entry points, and to push too hard would cause discomfort or even injury. Websites use resources to respond to requests – too many at once can overwhelm them and deliver a poor experience to visitors.
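One common way engineers cap the velocity of incoming requests is a token-bucket rate limit: requests spend tokens, and tokens refill at a fixed rate, like turnstiles admitting a steady flow of fans. The sketch below is illustrative only; the rates and class are made-up, not a real product's API.

```python
import time

# A rough sketch of a token-bucket rate limit. The rate and burst
# values are illustrative assumptions, not real configuration.
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate            # tokens refilled per second
        self.tokens = burst         # current tokens (bucket starts full)
        self.burst = burst          # maximum bucket size
        self.last = time.monotonic()

    def allow(self):
        # Refill based on elapsed time, capped at the bucket size.
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # too many requests in too short a window
```

A burst of requests larger than the bucket gets turned away until tokens refill, which keeps the arrival rate at the server bounded even when demand spikes.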
There are more ways than you might think for a website to suddenly become overwhelmed. Some of them are expected or even self-inflicted; most commercial websites want more traffic to increase revenues either through product or service sales, or to generate ad revenue (especially journalism websites).
Marketing activity often drives large amounts of traffic to websites (if done properly), and certain events like sales or ticket releases are known about in advance.
This doesn’t mean that it’s necessarily easier to prepare websites to cope with the increased traffic. For example, when fashion retailer JD Williams debuted its first TV advert in a primetime slot, their website immediately experienced a spike in traffic 27 times greater than Cyber Monday.
Viral or influencer marketing can also be unpredictable and can cause big jumps in traffic at unexpected times.
Many websites provide public services or are important for non-commercial reasons. Government services like filing tax returns, or educational services like handing in assignments or checking exam results, are now often done online and deadlines for these create rushes in demand.
Bots are another common cause of website overload. Around 50% of all web traffic is now automated, with 30% of traffic being malicious in its intent. Malicious or not, this traffic still needs to be processed by servers and can add overhead, especially on busy days.
Bot attacks can be very aggressive depending on their intention. Scalper bots looking to snap up all the stock in a sale event can come from multiple adversaries, causing thousands of additional requests in a short span of time. Other threats that create strain on web servers include credential stuffing and scraping, whether of content or for reconnaissance ahead of a later attack.
As more users enter the website, each service that the site relies on (databases, caches, payment gateways, third-party tools etc.) uses more resources. Everything is fine until any one of those services hits its limit.
This might be an infrastructure limitation – CPU, memory, total connections, file handlers or bandwidth for example – or it might be a service limitation, such as a third-party API usage limit, or a data pipeline ingest capacity limit.
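A connection limit is one of the simplest of these ceilings to picture. The sketch below, a hypothetical server with a fixed pool of connection slots, shows how requests succeed normally until the pool is exhausted, at which point new ones are rejected; the limit and responses are illustrative, not taken from any real server's configuration.

```python
import threading

# Hypothetical sketch of a fixed connection limit. MAX_CONNECTIONS
# and the response strings are illustrative assumptions.
MAX_CONNECTIONS = 100
connections = threading.BoundedSemaphore(MAX_CONNECTIONS)

def handle_request(process):
    # Try to claim a connection slot without blocking; if every slot
    # is taken, the limit has been hit and the request is rejected.
    if not connections.acquire(blocking=False):
        return "503 Service Unavailable"
    try:
        return process()
    finally:
        connections.release()  # free the slot for the next visitor
```

Under normal load every request finds a free slot, so the limit is invisible; it only surfaces on the busiest days, which is why these ceilings so often go undiscovered until a traffic spike.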
Hopefully, these eventualities were uncovered during performance testing and mitigations planned for, but even with a load balancing solution in place, scalability is never instantaneous. The more complex the system, the more hidden bottlenecks there are.
Depending on the service and what limit was hit, the site will show some strange behavior. For instance, it’s quite common for a website struggling to cope with its traffic to lose all its styling or images, presenting new visitors with a strange text-only version.
Third-party services, such as the payment gateway, can also fail before other parts of the site, meaning everything can look fine until customers come to checkout and pay.
Ticketing sites could even find that syncing breaks between ticket availability reporting and what is displayed to customers – a real source of frustration that would need to be addressed immediately.
One common symptom of an overloaded website is timeouts, where a request waits so long to be processed that the client gives up on it. Requests are handled as soon as a service has the free capacity to deal with them. When a resource limit is exceeded, the rate of incoming requests outpaces the rate at which they can be processed, so a queue builds up. Exactly how long a request will wait before timing out varies, but it is typically up to a minute.
Eventually this queue of pending requests grows to the point that no new requests are being dealt with, because the service is busy processing requests that may already have been cancelled. Even if the site is put into maintenance mode so that no new traffic can hit it, the service may have such a backlog of requests that it takes a long time just to work through them.
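The arithmetic behind a backlog is simple: whenever requests arrive faster than they are processed, the queue grows by the difference every second. The toy simulation below uses made-up rates purely to illustrate the effect.

```python
# A minimal sketch of why a backlog builds. The rates are
# made-up numbers for illustration, not benchmarks.
ARRIVALS_PER_SEC = 500   # incoming requests
PROCESSED_PER_SEC = 300  # what the service can actually handle

backlog = 0
for second in range(60):  # one minute of overload
    backlog += ARRIVALS_PER_SEC                  # new requests join the queue
    backlog -= min(backlog, PROCESSED_PER_SEC)   # service drains what it can

print(backlog)  # 12000 requests still waiting after one minute
```

With these numbers the queue gains 200 requests every second, so after a minute there are 12,000 requests pending, many of them likely already abandoned by the visitors who sent them.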
In this case, it’s usually faster to just restart the service to clear the backlog, but even that might not be possible if the requests are important, e.g. for an order processing service, in which case the only course of action is to wait for it to work through them.
If the high traffic is unexpected (for example, if an influencer published a post outside of the agreed schedule), the brand may not even know why these errors are happening until they look at their traffic logs.
When a site slows down, users get frustrated and are less likely to browse and buy. Google crawler bots visiting the site at these times will also notice the drop in response times, which is bad news for SEO as site speed is a key ranking factor for search engines.
If a site isn’t behaving as expected customers could be receiving a poor or unfair experience. Some may be able to buy rare stock in a sale whilst others may be blocked for unexplained reasons. This can harm brand reputation, especially on social media or via word of mouth.
Of course, a completely crashed site can’t function at all. That means no sales, no bookings, and no assignment submissions; whatever the site is supposed to do simply can’t be done until the issue is fixed.
Any marketing spend allocated to getting visitors on the site is also wasted. Even worse for customer loyalty, visitors are likely to go to competitors to spend their money instead and may not come back.
In a worst-case scenario, important government or public services could be disrupted if the site meant to facilitate it goes down due to overwhelming traffic. The knock-on effects could be disastrous if essential services become unavailable.
Even after a site goes down, demand could still be flooding in as frustrated users keep hitting “refresh” to get access, creating a feedback loop that makes things worse. And if the site can be brought back online, demand will likely peak once again (at great velocity) as soon as customers realize this, and the website may quickly come back down.
As services will have broken individually on the way to a complete crash, there are likely to be various parts of the site to fix before functionality can be restored. This can all take significant time, sometimes out of hours from highly technical staff, which is very costly.
Website owners use a virtual waiting room as a safety net in front of their website to prevent all these things from happening.
TrafficDefender’s virtual waiting room sits in front of the target website, keeping track of how much traffic is coming and going at all times. Should the volume of traffic hit a threshold determined by the website owner, any new visitors will be redirected to an online queue and only admitted once other users have left the site and there is capacity to serve them.
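The core decision a waiting room makes can be sketched in a few lines. This is a hypothetical in-memory model, not TrafficDefender’s actual implementation, which would track visitors across distributed infrastructure; the class and threshold here are assumptions for illustration.

```python
from collections import deque

# Hypothetical sketch of a waiting room's admit-or-queue decision.
class WaitingRoom:
    def __init__(self, capacity):
        self.capacity = capacity   # threshold set by the site owner
        self.active = 0            # visitors currently on the site
        self.queue = deque()       # waiting visitors, in arrival order

    def arrive(self, visitor):
        # Admit directly while under the threshold; otherwise queue.
        if self.active < self.capacity:
            self.active += 1
            return "site"
        self.queue.append(visitor)
        return f"queue position {len(self.queue)}"

    def leave(self):
        # When a visitor leaves, admit the longest-waiting one.
        self.active -= 1
        if self.queue:
            self.queue.popleft()
            self.active += 1
```

Because admissions are capped at the threshold, the site itself never sees more concurrent visitors than it can handle, however large the crowd outside grows.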
TrafficDefender is also designed to deliver a seamless, fully branded UX to waiting visitors. Those waiting are kept informed of where they are in the queue and how long their wait will be.
Get a free demo of TrafficDefender and ensure your site never falls down under heavy traffic conditions.