Bots eating up all the bandwidth

supanet
Registered User
Posts: 246
Joined: Sat Dec 15, 2012 4:20 pm
Location: UK

Bots eating up all the bandwidth

Post by supanet »

I look after a website that is getting a lot of bots; every time I visit the site, it has over 200 bots online.

The site is allocated 100 GB of bandwidth each month, and every month it is used up (sometimes taking the site down).

The site has been going for over 10 years and used to be very busy, but it now only gets a handful of members each day (fewer than 10), so I know it's not them.

The site runs through Cloudflare (free version) but with only basic settings, as I am not sure how to set things up on it.

It used to be a lot of Chinese bots, but lately it is more of these:
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

Lots of random IPs, so they are hard to ban.

Any suggestions on how I can reduce the bandwidth usage?
KevC
Support Team Member
Posts: 72566
Joined: Fri Jun 04, 2004 10:44 am
Location: Oxford, UK

Re: Bots eating up all the bandwidth

Post by KevC »

There are already a couple of topics on this, as others have seen the same with other bots:

viewtopic.php?t=2652265
viewtopic.php?t=2654353
supanet
Registered User
Posts: 246
Joined: Sat Dec 15, 2012 4:20 pm
Location: UK

Re: Bots eating up all the bandwidth

Post by supanet »

KevC wrote: Mon Jul 01, 2024 3:31 pm Already a couple of topics on this as others have seen the same with other bots

viewtopic.php?t=2652265
viewtopic.php?t=2654353
Thanks, will take a look. :mrgreen:
supanet
Registered User
Posts: 246
Joined: Sat Dec 15, 2012 4:20 pm
Location: UK

Re: Bots eating up all the bandwidth

Post by supanet »

OK, just in case anyone has a similar problem, this one worked for me:
viewtopic.php?p=16018150#p16018150

Code:

BrowserMatchNoCase "libwww-perl" bad_bot
BrowserMatchNoCase "wget" bad_bot
BrowserMatchNoCase "LieBaoFast" bad_bot
BrowserMatchNoCase "Mb2345Browser" bad_bot
BrowserMatchNoCase "zh-CN" bad_bot
BrowserMatchNoCase "MicroMessenger" bad_bot
BrowserMatchNoCase "zh_CN" bad_bot
BrowserMatchNoCase "Kinza" bad_bot
BrowserMatchNoCase "Bytespider" bad_bot
BrowserMatchNoCase "Baiduspider" bad_bot
BrowserMatchNoCase "Sogou" bad_bot
BrowserMatchNoCase "Datanyze" bad_bot
BrowserMatchNoCase "AspiegelBot" bad_bot
BrowserMatchNoCase "adscanner" bad_bot
BrowserMatchNoCase "serpstatbot" bad_bot
BrowserMatchNoCase "spaziodat" bad_bot
BrowserMatchNoCase "undefined" bad_bot
BrowserMatchNoCase "claudebot" bad_bot
BrowserMatchNoCase "facebook" bad_bot
BrowserMatchNoCase "Petalbot" bad_bot
BrowserMatchNoCase "YandexBot" bad_bot
BrowserMatchNoCase "Applebot" bad_bot
BrowserMatchNoCase "aiohttp" bad_bot
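# Apache 2.2-style access control (on Apache 2.4 this needs mod_access_compat)
Order Deny,Allow
# Refuse any request whose User-Agent matched one of the patterns above
Deny from env=bad_bot
# Refuse this IP range as well
Deny from 47.76.0.0/16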
Add this to your .htaccess file.
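
Before deploying a list like this, it can help to preview which visitors it would turn away. The following is a minimal Python sketch, not part of the original post: it assumes a shortened copy of the pattern list, and the helper name `is_bad_bot` and the second and third sample User-Agent strings are hypothetical.

```python
import re

# A few patterns copied from the .htaccess list above; extend as needed.
BAD_BOT_PATTERNS = [
    "libwww-perl", "wget", "zh-CN", "Bytespider",
    "claudebot", "facebook", "Applebot",
]


def is_bad_bot(user_agent: str) -> bool:
    """Return True if any pattern matches the User-Agent, ignoring case.

    BrowserMatchNoCase treats its argument as a case-insensitive regular
    expression, so re.search with IGNORECASE is a reasonable stand-in.
    """
    return any(re.search(p, user_agent, re.IGNORECASE) for p in BAD_BOT_PATTERNS)


# The first string is quoted earlier in this topic; the other two are
# hypothetical examples of ordinary browsers.
SAMPLES = [
    "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Linux; Android 10; zh-CN) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Mobile Safari/537.36",
]

for ua in SAMPLES:
    print("BLOCK" if is_bad_bot(ua) else "allow", "-", ua)
```

In this sketch the third sample is blocked purely because of the "zh-CN" pattern, so broad patterns like that one may also catch real visitors whose browser reports that locale. Blocking "facebook" is also likely to stop link previews when pages are shared on Facebook.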
thecoalman
Community Team Member
Posts: 6309
Joined: Wed Dec 22, 2004 3:52 am
Location: Pennsylvania, U.S.A.

Re: Bots eating up all the bandwidth

Post by thecoalman »

It is easier and more effective to block them in Cloudflare's firewall. Go to the Security section >> WAF and then click the link for custom rules.

You can add a rule for user agents, and you can add multiple user agents to a single rule using OR, e.g. if the user agent contains xyz OR abc, block the request.

As for countries: if you expect most of your traffic from a single country or a few countries, then instead of trying to block countries, whitelist the ones you expect legitimate traffic from, e.g. if the country does not equal US or CA, then use a JS Challenge. Of course, you would use whatever countries you expect legitimate traffic from. The JS challenge does not outright block traffic, but it's difficult for bots to get through. For legitimate traffic it should be fairly seamless, with little inconvenience.
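
As a rough illustration of the two rules described above, here is a small Python sketch that builds the expression text you could paste into the custom rule editor. The helper names are hypothetical, and the field names `http.user_agent` and `ip.geoip.country` are my assumption of how Cloudflare's rules language spells them; check them against the expression preview in your own dashboard.

```python
def user_agent_rule(fragments):
    """Build one expression that matches any of the given User-Agent fragments.

    Mirrors "if the user agent contains xyz OR abc": each fragment becomes a
    'contains' test and the tests are joined with 'or'.
    """
    return " or ".join(f'(http.user_agent contains "{f}")' for f in fragments)


def country_challenge_rule(allowed_countries):
    """Build an expression that matches every country NOT in the allowed list.

    Pair this expression with the JS Challenge action, so visitors from the
    listed countries pass through without a challenge.
    """
    codes = " ".join(f'"{c}"' for c in allowed_countries)
    return f"(not ip.geoip.country in {{{codes}}})"


print(user_agent_rule(["Bytespider", "facebookexternalhit"]))
print(country_challenge_rule(["US", "CA"]))
```

The first expression would go in a rule with the Block action and the second in a separate rule with the JS Challenge action. Keeping them separate makes it easier to see in the firewall events which rule caught a given request.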
“Results! Why, man, I have gotten a lot of results! I have found several thousand things that won’t work.”

Attributed - Thomas Edison
supanet
Registered User
Posts: 246
Joined: Sat Dec 15, 2012 4:20 pm
Location: UK

Re: Bots eating up all the bandwidth

Post by supanet »

Thanks, will take a look. +1
HB
Registered User
Posts: 230
Joined: Mon May 16, 2005 9:30 pm

Re: Bots eating up all the bandwidth

Post by HB »

thecoalman wrote: Thu Jul 04, 2024 3:08 am It is easier and more effective to block them in Cloudflare's firewall. Go to the Security section >> WAF and then click the link for custom rules.
If you want a really blunt axe, go to WAF > Rate limiting rules. Then create a rule like this:
(not cf.bot_management.verified_bot and http.request.uri.path contains ".php") or (cf.bot_management.verified_bot and http.request.uri.path contains ".php")
Because the two halves differ only in whether the bot is verified, the expression matches every request for a path containing ".php". The free tier only allows one rate limiting rule. If you set the limit to 10 requests per 10-second period with a 10-second timeout, you effectively limit any bot (or person) to less than one page request per second. You can tweak the maximum number of requests as your server capacity allows, down to a minimum of 1 page every 10 seconds.
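
To see why such a limit works out to less than one page per second, here is a simplified Python model. It is only a sketch: it reads the settings above as 10 requests per 10-second period with a 10-second timeout, counts requests in fixed windows, and is not a description of how Cloudflare actually implements rate limiting. The function name `simulate` and the two traffic patterns are hypothetical.

```python
def simulate(request_times, limit=10, period=10.0, timeout=10.0):
    """Return the request times that would be served for one client.

    Simplified model: requests are counted in fixed windows of `period`
    seconds. Once the count in a window exceeds `limit`, the client is
    blocked for `timeout` seconds.
    """
    served = []
    window_start = None
    count = 0
    blocked_until = float("-inf")
    for t in request_times:
        if t < blocked_until:
            continue  # still inside the timeout, request is rejected
        if window_start is None or t - window_start >= period:
            window_start, count = t, 0  # start a fresh counting window
        count += 1
        if count > limit:
            blocked_until = t + timeout  # limit exceeded, start the timeout
            continue
        served.append(t)
    return served


fast_bot = [i * 0.25 for i in range(240)]   # 4 requests per second for 60 seconds
slow_reader = [i * 2.0 for i in range(30)]  # 1 request every 2 seconds for 60 seconds

print(len(simulate(fast_bot)), "of", len(fast_bot), "served")        # 50 of 240 served
print(len(simulate(slow_reader)), "of", len(slow_reader), "served")  # 30 of 30 served
```

In this model the fast client gets 50 pages in 60 seconds, a little under one per second, while someone reading at a normal pace is never limited.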
Forex Station
Registered User
Posts: 185
Joined: Thu Apr 06, 2017 2:26 pm
Location: Australia

Re: Bots eating up all the bandwidth

Post by Forex Station »

supanet wrote: Mon Jul 01, 2024 3:16 pm I look after a website that is getting a lot of bots; every time I visit the site, it has over 200 bots online.
The .htaccess guide above will solve your issues, and if you really want to stop the bots, just go into Cloudflare or your .htaccess and temporarily block Facebook's facebookexternalhit/1.1 bot.

Is it only the Facebook bot that's eating up your bandwidth? The ones that smash our site are Ahrefs and Semrush, which at times take up almost as much as Googlebot.
supanet
Registered User
Posts: 246
Joined: Sat Dec 15, 2012 4:20 pm
Location: UK

Re: Bots eating up all the bandwidth

Post by supanet »

Thanks for all the advice, guys. I have now reduced them from over 200 to the odd 7 or 8, which I can handle. :D

Once again great community. :)
ssl
Registered User
Posts: 1979
Joined: Sat Feb 08, 2020 2:15 pm
Location: Le Lude, Pays de la Loire - France
Name: Fred Rimbert

Re: Bots eating up all the bandwidth

Post by ssl »

A little personal feedback on these invasive bots.
I do not use Cloudflare and my .htaccess file has not been modified; just this addition to the robots.txt file got rid of these invasions:

Code:

# Disallow Bad Bot
User-Agent: Claudebot
User-agent: anthropic-ai
User-Agent: ByteDance
User-agent: Bytespider
User-agent: ChatGPT-User
User-agent: GPTBot
User-Agent: FriendlyCrawler
User-agent: Google-Extended
User-agent: Omgili
Disallow: /
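
To check what a well-behaved crawler would conclude from rules like these, Python's standard `urllib.robotparser` can parse them locally. This is a minimal sketch using a shortened copy of the list above; the path `/viewtopic.php` and the agent strings tested are just examples. It only shows how a compliant crawler should read the file; a bot that ignores robots.txt is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt group above.
ROBOTS_TXT = """\
# Disallow Bad Bot
User-agent: Claudebot
User-agent: Bytespider
User-agent: GPTBot
User-agent: Google-Extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask the parser whether each agent may fetch an example forum page.
for agent in ["GPTBot", "ClaudeBot/1.0", "Bytespider", "Googlebot"]:
    verdict = "allowed" if parser.can_fetch(agent, "/viewtopic.php") else "disallowed"
    print(agent, "->", verdict)
```

Because several User-agent lines share one Disallow line, they form a single group, and agents not named in it (Googlebot here) remain allowed.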
thecoalman
Community Team Member
Posts: 6309
Joined: Wed Dec 22, 2004 3:52 am
Location: Pennsylvania, U.S.A.

Re: Bots eating up all the bandwidth

Post by thecoalman »

Usually, if they have a user agent, they will obey robots.txt, but not always. Side note: Cloudflare has a new tool that specifically targets AI bots soaking up pages for AI learning. One click.


https://blog.cloudflare.com/declaring-y ... ngle-click
Terceirense
Registered User
Posts: 56
Joined: Mon Oct 09, 2006 1:19 pm
Location: Açores Ilha Terceira

Re: Bots eating up all the bandwidth

Post by Terceirense »

I used to run a forum with similar issues. Here’s what worked for me:

Added CAPTCHA: I put CAPTCHA on sign-ups and comments. It helped cut down on bot activity.
Tweaked Cloudflare: I went into Cloudflare’s settings and set up rules to challenge or block suspicious traffic. You can do this under Firewall > Tools > IP Access Rules.
Updated .htaccess: I added some rules to block certain bots and IP ranges. If you’re on WordPress, there are plugins for this too.
SpIdErPiGgY
Registered User
Posts: 273
Joined: Sun May 02, 2021 2:11 pm
Location: Erpe-Mere, Aalst, BE
Name: Andy Dm

Re: Bots eating up all the bandwidth

Post by SpIdErPiGgY »

fuolo
Registered User
Posts: 4
Joined: Sun Aug 04, 2024 6:51 pm

Re: Bots eating up all the bandwidth

Post by fuolo »

supanet wrote: Wed Jul 03, 2024 2:19 pm OK, just in case anyone has a similar problem, this one worked for me:
viewtopic.php?p=16018150#p16018150

Won't this have a negative impact on SEO if the crawlers are blocked?
danieltj
Infrastructure Team Member
Posts: 520
Joined: Thu May 03, 2018 9:32 pm
Location: United Kingdom
Name: Daniel James

Re: Bots eating up all the bandwidth

Post by danieltj »

fuolo wrote: Tue Aug 06, 2024 9:21 am won't this have negative impact on SEO if the crawlers are blocked?
Yes and no. Obviously, if you block a bot and it honours the block, it will not index your forum, but the bots being blocked here aren't necessarily 'good bots' that you would want to allow access to your forum in the first place.

It's important to make sensible decisions when it comes to bots and their access. Blocking Google will prevent your forum from being indexed in the future, so if Google is visiting your site too much, you might need to look at other methods rather than basic banning, so to speak.