This is neither a support request nor a fix, but rather a report on what happened earlier today to one of the forums that I maintain: a Distributed Denial-of-Service attack, originating in China, with no purpose I could figure out except to take the forums down.
So, how was this accomplished? In fact, they used a very simple strategy: they attempted to log in from a group of servers (I manually counted around two dozen, but there were possibly more), all coming from the 220.127.116.11/16 range, 'faking' a different User-Agent tag each time (so that it always looked like a different attempt, even if coming from the same IP address). They timed the requests carefully so that they didn't hit any request limits: from the perspective of the web server, it 'seemed' that different requests were coming from different machines (or different browsers under the same address) at a 'reasonable' rate, say, just one per second, which would get past most kinds of protective measures.
Now, my nginx/PHP setup can most definitely handle one request per second, or even two dozen requests per second; that is well within the threshold. The problem was that each of those requests required opening a new session (with its own cookie, etc.). Each session 'hits' the database at least once, to figure out whether the request belongs to an existing session (in this case, it never did) or whether a new session has to be created. Very quickly this grew to a million open sessions or more. However, this was not the real issue: MySQL/MariaDB can most certainly handle tables with millions of entries.
No, the problem was that each time a session is created, a check is made to see how many sessions already exist for this particular user (SELECT COUNT(session_id) [...] FROM [table-prefix]sessions) and these are updated accordingly. With a 'usual' number of open sessions (say, a dozen, a hundred, even some twenty thousand, as I've seen in some really large forums) this is handled by the database in nanoseconds (well... perhaps microseconds). But if you start having a million open sessions, and have to count them each time, and before the count is finished, a new request pops up requiring the count operation to be repeated... well, you get the idea. At some point, no matter how much optimisation is in your system and no matter how many results are being cached, the database is basically doing nothing else but counting sessions over and over and over. (Caching helps very little here, because each new session changes the result; I seriously suspect that the number of cacheable operations is 'close to none', except in those very few cases when 2 or 3 requests come in before any of them has created a new session, in which case the count will be the same.) As more and more requests for new sessions come in, the database basically 'stalls' because it cannot proceed further, and that means that the code cleaning up the unused sessions never gets a chance to run: the sessions table grows faster than the clean-up code can do its job, and, in any case, the database is way too busy counting sessions to worry about anything else.
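Here is a back-of-the-envelope sketch of why this gets out of hand. It is written in Python, not phpBB's PHP, and it assumes (pessimistically) that every count has to visit every existing session row and that nothing is cached. The function name is mine, not something from the phpBB code.

```python
def rows_scanned(new_sessions: int, existing: int = 0) -> int:
    """Total rows visited if each new session first counts all existing ones.

    Assumption: every count walks the whole set of sessions (no caching),
    so the k-th new session costs (existing + k - 1) row visits.
    """
    # Sum of existing + 0, existing + 1, ..., existing + (new_sessions - 1)
    return new_sessions * existing + new_sessions * (new_sessions - 1) // 2


if __name__ == "__main__":
    for n in (100, 20_000, 1_000_000):
        print(f"{n:>9} new sessions -> {rows_scanned(n):,} row visits")
```

Under those assumptions the work grows with the square of the number of sessions: a hundred sessions cost a few thousand row visits in total, while a million cost roughly five hundred billion.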
You can take a look at what's going on inside the file phpbb/session.php, around line 778 and further around line 838 (for phpBB 3.2.4). You might notice something already very cleverly done by the core programmers: the very same problem might have been triggered by 'legitimate' bots, namely those which are indexing content for Google, Bing, and similar services. Those bots use well-known User-Agents, so the code can be optimised for them: in their case, for example, phpBB considers them to be a 'special' anonymous user, and reuses sessions (instead of creating a new one every time those bots connect). This works quite well, and I presume that as more legitimate bots pop up, the core programmers will happily add them to the database table.
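To illustrate the session-reuse idea, here is a rough Python sketch. It is not phpBB's actual PHP code: the marker strings, the shared session names and the function name are all made up for the example.

```python
import uuid

# Hypothetical User-Agent markers and shared session names (illustration only).
KNOWN_BOT_MARKERS = {
    "googlebot": "shared-session-googlebot",
    "bingbot": "shared-session-bingbot",
}


def session_id_for(user_agent: str) -> str:
    """Return a shared session id for a recognised bot, else a fresh one."""
    agent = user_agent.lower()
    for marker, shared_id in KNOWN_BOT_MARKERS.items():
        if marker in agent:
            return shared_id      # every visit from this bot reuses one session
    return uuid.uuid4().hex       # any unrecognised agent gets a brand-new session
```

It also shows why the attack slipped through: a request with a random, never-seen-before User-Agent always falls into the last branch and gets a new session.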
To prevent the DDoS attack I got, however, something else needs to be done code-wise. Remember that the trick these cyberbullies used was to rotate requests among a few dozen servers, making sure that each IP would make only a 'reasonable' number of requests, but using a different User-Agent handle for each request. phpBB has no way of knowing that these are anything but 'legitimate' users of some sort, so all these requests were honoured. To make matters even worse, from the perspective of the nginx/PHP setup, the number of requests was also considered to be 'reasonable' for all purposes, since they seemed to be genuine requests coming from different IP addresses and different User-Agents (e.g. different browsers on different devices). Thus, filtering at that level simply let the attacks through; even limiting the rate of requests to a very low threshold did little to help (especially after the sessions table already had over a million entries), because, as a matter of fact, it would be far more likely to restrict legitimate users (take into account that each legitimate browser will require several requests, for downloading images, CSS, and JS...) than the illegitimate attacks.
In my setup, I have CloudFlare in front of everything, to deal with exactly these kinds of things. But, of course, even CloudFlare is not sufficiently smart to 'detect' that this was an 'attack' and not merely a lot of genuine traffic. It required a human (yours truly) to block the 18.104.22.168/16 range and escalate the level of defence to 'Under attack' mode, which triggers an irritating additional page where the browser is checked to see if there is a human behind it, and if so, lets it through to the website. This effectively blocked the attack. I left the settings like that for several hours, because, as you may imagine, besides this very specific DDoS attack, there are always (every day, every hour) some typical spambots trying to get access to our forums (with little success). These, however, are not that 'harmful': after a few tries without success, the spammer simply gives up on the forums which are too hard to break into and moves on to the next one on their list...
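For what it's worth, the thing I did by hand (noticing that all the addresses fell inside one /16) can be sketched in a few lines of Python using the standard ipaddress module. The function name and the sample addresses are invented for the example; in practice the input would be the client IPs pulled from the web server's access log.

```python
import ipaddress
from collections import Counter


def requests_per_network(client_ips, prefix_len=16):
    """Count requests per enclosing network (a /16 by default)."""
    counts = Counter()
    for ip in client_ips:
        # strict=False masks off the host bits, e.g. 10.1.7.9/16 -> 10.1.0.0/16
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


if __name__ == "__main__":
    sample = ["10.1.0.5", "10.1.7.9", "10.1.200.1", "192.0.2.10"]
    for network, hits in requests_per_network(sample).most_common():
        print(network, hits)
```

Each individual address may look perfectly 'reasonable', but grouping by range makes the concentration obvious.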
Now, how could this DDoS attack be prevented?
I'm not a professional programmer by any account, so I'm unfamiliar with the best approach to this issue, but I can imagine that something could be done to limit the impact of 'fake' requests on the database. And this would be to at least locally cache the count of open sessions: after all, if we're talking about humans, it's unlikely (except on the busiest forums!) that hundreds of new sessions would be created at the same time, so it would make sense to cache the result of the SELECT COUNT(session_id) [...] command for, say, one second. One second makes a huge difference: it's almost negligible for a human, but it means quite a lot for modern CPUs and databases which work at the microsecond level.
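A minimal sketch of what I mean by caching the count for one second, again in Python rather than PHP. The class name is mine, and the fetch function stands in for whatever actually runs the count query. The clock is passed in only so that the behaviour can be tested without waiting.

```python
import time
from typing import Callable, Optional


class CachedCount:
    """Remember the result of an expensive count for `ttl` seconds."""

    def __init__(self, fetch: Callable[[], int], ttl: float = 1.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch            # stand-in for the real COUNT query
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[int] = None
        self._expires_at = 0.0

    def get(self) -> int:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()        # hit the database only when stale
            self._expires_at = now + self._ttl
        return self._value
```

With this in place, two dozen requests arriving within the same second would trigger one count instead of two dozen. Note that in a PHP setup each request runs separately, so the cached value would have to live somewhere shared between requests rather than in a plain object like this one.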
It would also make sense to limit the number of open sessions to a reasonable default; this could be a configuration parameter. Remember that there are three different cases here: the first is a legitimate user, who opens a new session and really uses it. Then there are the legitimate content-indexing bots, which get the 'same' session using the already-existing trick implemented in the code. And finally, there is the case of someone opening a session but never entering their password and actually logging in, which could be much more restricted than it is. We have, after all, limits in place for the number of failed logins. However, in the scenario I've described, the problem was not with people logging in, but rather with 'anonymous usage' of the forums (people who just want to see the public areas of the forums without having any intention of becoming regular users), and these could theoretically be limited to a reasonable default. Hundreds or perhaps even thousands of simultaneous anonymous users would be fine in most setups; a million would certainly not be acceptable. So having a reasonable default, say, 100, would make a lot of sense.
In the case of a future DDoS attack like the one I suffered, the first 100 requests would be promptly answered, but the 101st would be refused and the connection shut down, no matter who tries to see the content (note that legitimate, logged-in users and content-indexing bots would not be affected, only anonymous users). This might be acceptable in many communities. Others, however, wish to have as many viewers (registered or not) as possible, in order to earn revenue from ads; such forums might want a much higher number of simultaneous users, each with a valid, open session. But I find it hard to believe that there is a scenario where a forum admin really wants a million, or ten million, simultaneous anonymous users. Nevertheless, that option might be possible, with a fair warning that allowing too many anonymous users leaves the forum exposed to exactly this kind of attack.
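Roughly, the limit I have in mind would behave like the Python sketch below. This is only an illustration of the idea, not a patch for phpBB: the class, its methods and the flags are all invented, and a real version would keep its state in the sessions table rather than in memory.

```python
class AnonymousSessionLimiter:
    """Refuse new anonymous sessions once a configurable cap is reached."""

    def __init__(self, max_anonymous: int = 100):
        self.max_anonymous = max_anonymous
        self._open = set()            # ids of currently open anonymous sessions

    def try_open(self, session_id: str, *, registered: bool = False,
                 known_bot: bool = False) -> bool:
        # Logged-in users and recognised indexing bots are never limited.
        if registered or known_bot:
            return True
        # An anonymous visitor who already holds a session keeps it.
        if session_id in self._open:
            return True
        # Cap reached: refuse the new anonymous session.
        if len(self._open) >= self.max_anonymous:
            return False
        self._open.add(session_id)
        return True

    def close(self, session_id: str) -> None:
        """Free a slot when an anonymous session expires or is cleaned up."""
        self._open.discard(session_id)
```

With the default of 100, the 101st anonymous visitor is turned away until an older session is closed, while registered users and known bots carry on as usual.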
Note that there is already a fair amount of configuration that can be done for the 'Anonymous' user. However, AFAIK, there is no way to configure a maximum number of anonymous users that are allowed to have open sessions at the same time. I guess that this ought to be a relatively simple thing to implement... not for me, though, I'm simply not familiar enough with the phpBB code to do that!
Also note that I'm not really requesting anything, just opening the discussion to see what you guys think: namely, how serious these kinds of DDoS attacks really are, in the sense of being worth preventing by adding some extra checks to the code. Maybe these are so rare (compared to the effort put into preventing spam!) that it's not worth changing the core code to deal with this very specific case; maybe it justifies an extension or something else but not an actual change to the code. But I think it's still worth reflecting on the issue simply because anything in the code that can be abused in order to bring a system down is, IMHO, worth preventing!