
Re: Robot.txt

Posted: Fri Mar 17, 2017 10:57 am
by AmigoJack
As for "non ssl htaccess.zip": no. Still 97 duplicates and still entries like

Code: Select all

SetEnvIfNoCase User-Agent "^Wget/1.5.2" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.7" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.8" bad_bot
instead of just

Code: Select all

SetEnvIfNoCase User-Agent "^Wget/1.5.[23]" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.[6-8]" bad_bot
as well as redundancies like

Code: Select all

SetEnvIfNoCase User-Agent "^Webster.*" bad_bot
SetEnvIfNoCase User-Agent "^Webster.Pro.*" bad_bot

(^Webster.* already matches everything ^Webster.Pro.* does, so the second line adds nothing.) Don't try it, learn it. Use an editor that can both sort lines and kill duplicates. Learn about regular expressions.
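
For what it's worth, here is a rough Python sketch of that sort-and-deduplicate step; the file names badbots.txt and badbots.deduped.txt are just placeholders for whatever list you keep:

Code: Select all

from pathlib import Path

source = Path("badbots.txt")          # placeholder input file name
target = Path("badbots.deduped.txt")  # placeholder output file name

lines = source.read_text(encoding="utf-8").splitlines()

# Keep one copy of each line and sort, so near-duplicates like all the
# ^Wget/1.x entries end up next to each other and are easy to merge by hand.
unique = sorted({line.strip() for line in lines if line.strip()})

target.write_text("\n".join(unique) + "\n", encoding="utf-8")
print(f"{len(lines)} lines in, {len(unique)} lines out")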

Re: Robot.txt

Posted: Sat Mar 18, 2017 11:05 am
by kaspir
HiFiKabin wrote:
Fri Mar 17, 2017 9:34 am
New and improved (I hope) list of bad/aggressive bots and scrapers now available in .htaccess form for ssl and non ssl sites as well as the raw list for your own use https://phpbb.hifikabin.me.uk/viewtopic.php?f=6&p=324
Knew you would! BOUSS