Spiders & bots to add to phpBB

AmigoJack
Registered User
Posts: 5757
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン
Contact:

Re: Spiders & bots to add to phpBB

Post by AmigoJack »

Sadly I don't keep track of the full agent or of when I encountered each one. All my contributions come from the simple fact that I log every user agent and then analyze (and truncate) that log from time to time. I omit some of them, as they seem to occur very rarely. If I refer to re-encountering a bot, please note that my most recent data is from 2012-05-20 only.
Pony99CA wrote:
AmigoJack wrote:URL: http://swebot.net
This URL doesn't seem to work (at least not today). When did you see that bot?
Around 2012-03-29. Other mutations use the URL http://swebot-crawler.net instead, which is by now also either parked or suspended. Haven't re-encountered that bot since then.
Pony99CA wrote:Why have spaces in the User Agents?
Primarily to avoid false positives. That is also why I always use a trailing slash as a word boundary (proper product tokens have the format name/version, as in RFC 2616). However, to reliably recognize malformed user agents as well, I used a blank (whitespace) as the boundary instead.
  • "Supybot " has a blank instead of a slash as the name/version separator.
    Full example: Mozilla/5.0 (Compatible; Supybot 0.83.4.1)
  • " WASALive" has neither a slash nor a version number, so my only word boundary is the leading blank. I didn't want to include Bot, as the owners might omit it in the future.
    Full example: Mozilla/5.0 (compatible; WASALive Bot ; http://blog.wasalive.com/wasalive-bots/)
  • "Download Ninja " has a blank instead of a slash as the name/version separator.
    Full example: Download Ninja 7.0
  • "Panopta " has a blank instead of a slash as the name/version separator, and the version is preceded by a v, as if it were meant for human reading. I didn't want to include the v, to keep the match future-proof.
    Full example: Panopta v1.1
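To illustrate the boundary idea, here is a minimal Python sketch (not phpBB's own code). It assumes the agent match is applied as a case-insensitive substring test; the helper name agent_matches and the "SupybotMirror/2.0" agent are hypothetical and only serve the illustration.

```python
def agent_matches(match: str, user_agent: str) -> bool:
    """Assumed matching rule: case-insensitive substring test."""
    return match.lower() in user_agent.lower()


# Full examples quoted in this thread.
supybot = "Mozilla/5.0 (Compatible; Supybot 0.83.4.1)"
wasalive = "Mozilla/5.0 (compatible; WASALive Bot ; http://blog.wasalive.com/wasalive-bots/)"
panopta = "Panopta v1.1"

# The blank acts as the word boundary for agents without name/version.
print(agent_matches("Supybot ", supybot))
print(agent_matches(" WASALive", wasalive))
print(agent_matches("Panopta ", panopta))

# Hypothetical agent whose name merely starts with "Supybot".
other = "SupybotMirror/2.0"
print(agent_matches("Supybot", other))   # bare name: matches (false positive)
print(agent_matches("Supybot ", other))  # with boundary blank: no match
```

The trade-off is that a boundary character makes the match stricter, so it has to be revisited if the bot's owners ever change the separator.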
Pony99CA wrote:Supybot seems to be an IRC bot. Did you actually see this on a phpBB page
Definitely. I also re-encountered that bot at least on 2012-05-14.
Pony99CA wrote:
AmigoJack wrote:Name: Aitellu
Match: Octoswarm
URL: http://www.aitellu.com
This bot was already in the script, but with a different agent (Aitellu). When did you see it?
I saw it on 2012-01-17 and re-encountered it at least on 2012-05-20. I didn't notice it was already in the list, sorry.
Pony99CA wrote:
AmigoJack wrote:Name: URLDBCleaner
Match: URLDBCleaner/
Do you have an IP address for this?
Sadly not anymore.
Pony99CA wrote:
AmigoJack wrote:Name: SitesLikeIt
Match: SitesLikeIt/
URL: http://www.siteslikeit.com
This domain seems to be parked. I've added it, but wonder if there's a better URL to list.
Encountered it 2011-10-08 and never again since then. We should both remove this entry.
Pony99CA wrote:
AmigoJack wrote:Name: Plukkie
I'm going to call this Botje
Sure, why not.
Last edited by AmigoJack on Tue Jun 26, 2012 2:32 am, edited 1 time in total.
Pony99CA
Registered User
Posts: 4783
Joined: Thu Sep 30, 2004 3:13 pm
Location: Hollister, CA
Name: Steve
Contact:

Re: Spiders & bots to add to phpBB

Post by Pony99CA »

Thanks for the answers, Jack. For now, I plan to use agents without leading or trailing spaces. If a conflict arises, I can update the list later (that's why I wrote conflicting_bots and update_bots ;)).

Steve
Silicon Valley Pocket PC (http://www.svpocketpc.com)
Creator of manage_bots and spoof_user (ask me)
Need hosting for a small forum with full cPanel & MySQL access? Contact me or PM me.
_Vinny_
Style Customisations
Posts: 8678
Joined: Tue Aug 11, 2009 12:45 am
Location: Brazil
Name: Marcus Vinicius
Contact:

Re: Spiders & bots to add to phpBB

Post by _Vinny_ »

I saw these today:
OppO 1.0 ( http://www.inboundscore.com ) <= I don't know if it is a bot :|

SolomonoBot/1.02 (http://www.solomono.ru)
Marcus Wendel
Registered User
Posts: 534
Joined: Sun Mar 10, 2002 5:58 pm
Location: Sweden
Contact:

Re: Spiders & bots to add to phpBB

Post by Marcus Wendel »

Seen today.

Bot name: Panopta [Bot]
Agent match: Panopta
User agent string: checks.panopta.com
Website: http://www.panopta.com/checks/

Bot name: NerdByNature [Bot]
Agent match: NerdByNature
User agent string: Mozilla/5.0 (compatible; NerdByNature.Bot; http://www.nerdbynature.net/bot)
Website: http://www.nerdbynature.net/bot
(Already exists in the script)

/Marcus
Schwpz
Registered User
Posts: 335
Joined: Wed May 07, 2003 1:33 pm
Location: Planet Zot
Contact:

Re: Spiders & bots to add to phpBB

Post by Schwpz »

I was just visited by Heritrix, a spider which is already in the script, but for some reason it turned up as a guest. If anyone knows whether this spider has changed its settings, please let me know!

Bot name: Heritrix [Spider]
Agent match: heritrix
Mozilla/5.0 (compatible; heritrix/1.14.3 +http://www.accelobot.com)
Website: http://www.accelobot.com
Guest IP: 174.37.0.139
..:: PlanetZot.com - Your ultimate source for animation! ^^
Pony99CA
Registered User
Posts: 4783
Joined: Thu Sep 30, 2004 3:13 pm
Location: Hollister, CA
Name: Steve
Contact:

Re: Spiders & bots to add to phpBB

Post by Pony99CA »

_Vinny_ wrote:I see today:
OppO 1.0 ( http://www.inboundscore.com ) <= I don't know if it is a bot :|
It appears to be a service that takes information somebody fills into a Web form and provides information about them. I would guess that you (or somebody else who works for your site) entered the site's URL or E-mail address in a contact form and that triggered the service to check the site out.

We have similar items in the bots list, so I'll add it.

Steve
Pony99CA
Registered User
Posts: 4783
Joined: Thu Sep 30, 2004 3:13 pm
Location: Hollister, CA
Name: Steve
Contact:

Re: Spiders & bots to add to phpBB

Post by Pony99CA »

Schwpz wrote: I was just visited by Heritrix, a spider which is already in the script, but for some reason it turned up as a guest. If anyone knows whether this spider has changed its settings, please let me know!

Bot name: Heritrix [Spider]
Agent match: heritrix
Mozilla/5.0 (compatible; heritrix/1.14.3 +http://www.accelobot.com)
Website: http://www.accelobot.com
Guest IP: 174.37.0.139
That's odd. Not only is Heritrix in the bot list (it's one of the pre-installed phpBB bots), but Accelobot is also in the script (User Agent match = accelobot), so one of them should have matched.

However, my script lists the name as "Heritrix [Crawler]", not "Heritrix [Spider]", which seems odd, too. I would check your board's bot list and ensure that Heritrix and/or Accelobot are shown there (and active).
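As a rough way to reason about this, here is a small Python sketch of the check I mean. It is only an illustration, not the phpBB or manage_bots code: it assumes agent matches are case-insensitive substring tests and that inactive entries are ignored (so the visitor shows up as a guest). The Bot class, the diagnose helper, the "Accelobot [Bot]" name and the inactive states are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bot:
    name: str
    agent: str    # agent match string
    active: bool  # assumed: inactive entries are not used for detection


def matching_bots(bots, user_agent):
    """Return every entry whose agent match occurs in the user agent."""
    ua = user_agent.lower()
    return [b for b in bots if b.agent.lower() in ua]


def diagnose(bots, user_agent):
    hits = matching_bots(bots, user_agent)
    if not hits:
        return "no bot entry matches; visitor is treated as a guest"
    active = [b.name for b in hits if b.active]
    if active:
        return "recognized as " + active[0]
    return "matched only inactive entries: " + ", ".join(b.name for b in hits)


ua = "Mozilla/5.0 (compatible; heritrix/1.14.3 +http://www.accelobot.com)"

# Hypothetical board state: both entries exist but are deactivated.
board = [
    Bot("Heritrix [Crawler]", "heritrix", False),
    Bot("Accelobot [Bot]", "accelobot", False),
]
print(diagnose(board, ua))
```

If the entries are present and active, the agent string above contains both matches, so the cause would have to lie elsewhere.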

Steve
_Vinny_
Style Customisations
Posts: 8678
Joined: Tue Aug 11, 2009 12:45 am
Location: Brazil
Name: Marcus Vinicius
Contact:

Re: Spiders & bots to add to phpBB

Post by _Vinny_ »

One more today:
Guest IP: 124.83.159.155
DoCoMo/2.0 SH902i (compatible; Y!J-SRD/1.0; http://help.yahoo.co.jp/help/jp/search/ ... ng-27.html)
Schwpz
Registered User
Posts: 335
Joined: Wed May 07, 2003 1:33 pm
Location: Planet Zot
Contact:

Re: Spiders & bots to add to phpBB

Post by Schwpz »

Bot name: CloudACL [Bot]
Agent match: CloudACL
User agent string: CloudACL/Nutch-1.4
IP: 50.16.46.4
Website: http://www.cloudacl.com/about/
Marcus Wendel
Registered User
Posts: 534
Joined: Sun Mar 10, 2002 5:58 pm
Location: Sweden
Contact:

Re: Spiders & bots to add to phpBB

Post by Marcus Wendel »

New today.

Bot name: Semiocast [Bot]
Agent match: Semiocast
User agent string: Mozilla/5.0 (compatible; Semiocast HTTP client; http://semiocast.com/)
Website: http://semiocast.com/

/Marcus
Marcus Wendel
Registered User
Posts: 534
Joined: Sun Mar 10, 2002 5:58 pm
Location: Sweden
Contact:

Re: Spiders & bots to add to phpBB

Post by Marcus Wendel »

New today.

Bot name: Twitmunin [Bot]
Agent match: Twitmunin
User agent string: Twitmunin Crawler http://www.twitmunin.com
Website: http://www.twitmunin.com

/Marcus
_Vinny_
Style Customisations
Posts: 8678
Joined: Tue Aug 11, 2009 12:45 am
Location: Brazil
Name: Marcus Vinicius
Contact:

Re: Spiders & bots to add to phpBB

Post by _Vinny_ »

New:
Wotbox/2.01 (+http://www.wotbox.com/bot/)
_Vinny_
Style Customisations
Posts: 8678
Joined: Tue Aug 11, 2009 12:45 am
Location: Brazil
Name: Marcus Vinicius
Contact:

Re: Spiders & bots to add to phpBB

Post by _Vinny_ »

One more:
OSS-bot/0.02 (see http://michaelnielsen.org/blog/oss-bot/ or contact Michael Nielsen, mn@michaelnielsen.org)
Last edited by _Vinny_ on Wed Aug 08, 2012 4:42 pm, edited 1 time in total.
Reason: Update bot
Pony99CA
Registered User
Posts: 4783
Joined: Thu Sep 30, 2004 3:13 pm
Location: Hollister, CA
Name: Steve
Contact:

Manage_Bots 6.0 Beta Test

Post by Pony99CA »

I have created manage_bots 6.0, a major upgrade, and would like some beta testing. Please do not download the attached file unless you want to help test this.

Besides additions and updates to the bot list, there are several major changes and updates.

The major updates are:
  • The level system has been completely redone, which means that the level numbers have all changed. They are now based on the "class" of bot (phpBB standard, major search engine, minor search engine, Web tools, etc.) instead of the bot's reporter.
  • A new flags option has been added to allow greater control over which bots are changed. Boolean logic is used instead of simple <= or >= testing. This is mutually exclusive with the level option.
  • A new reporter option has been added to allow the specified operation to only apply to bots reported by one person.
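
The difference between the three selection styles can be sketched in Python. This is a hedged illustration only, not the code in manage_bots.php: the BotClass values, the class assigned to each bot, the "phpBB" reporter label and the helper names are all hypothetical.

```python
from enum import IntFlag


class BotClass(IntFlag):
    # Hypothetical values; the real numbers are defined by the script.
    PHPBB_STANDARD = 1
    MAJOR_SEARCH = 2
    MINOR_SEARCH = 4
    WEB_TOOLS = 8


# (name, class, reporter); the class assignments are assumptions.
BOTS = [
    ("Heritrix", BotClass.PHPBB_STANDARD, "phpBB"),
    ("Wotbox", BotClass.MINOR_SEARCH, "_Vinny_"),
    ("Panopta", BotClass.WEB_TOOLS, "AmigoJack"),
    ("Semiocast", BotClass.WEB_TOOLS, "Marcus Wendel"),
]


def select_by_level(bots, max_level):
    """Level style: a simple <= comparison against a threshold."""
    return [name for name, cls, _ in bots if cls <= max_level]


def select_by_flags(bots, flags):
    """Flags style: Boolean AND against a set of wanted classes."""
    return [name for name, cls, _ in bots if cls & flags]


def select_by_reporter(bots, reporter):
    """Reporter style: only bots reported by one person."""
    return [name for name, _, who in bots if who == reporter]


print(select_by_level(BOTS, 4))
print(select_by_flags(BOTS, BotClass.PHPBB_STANDARD | BotClass.WEB_TOOLS))
print(select_by_reporter(BOTS, "AmigoJack"))
```

A threshold can only pick a contiguous range of classes, while a flag mask can pick any combination, such as the standard bots plus Web tools without the search engines in between.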
The full list of updates includes:
  • Added Flags option to control adding, deleting, activating and deactivating bots more precisely than using Level
  • Added Reporter option to control adding, deleting, activating and deactivating bots by the user who reported them
  • Changed Level parameter to be function-based; user-based bot actions can be done with the new Reporter option
  • Changed default Level parameter to 128 in conjunction with previous change
  • Added total counts to credits
  • Added information on problem reporting to help
  • Added Flipboard, Genieo, Semiocast & Twitmunin bots from Marcus Wendel
  • Added InboundScore, MySmutSearch, OSS, Solomono, Wotbox & Yahoo DoCoMo bots from _Vinny_
  • Added Aboundex, Bing Preview, Botje, CheckParams, Download Ninja, Panopta, Search Web Engine, SiteIntel, SitesLikeIt, Supybot, URLDBCleaner & WASALive bots from AmigoJack
  • Added Grapeshot bot from roBBx
  • Added CloudACL bot from Schwpz
  • Updated URL for Xaldon
  • Updated URL for NerdByNature (thanks, Marcus)
  • Updated formatting of Help command display to use standard HTML headers, added Command Format and Parameter headers and used dictionary lists for parameters
  • Simplified command line parameter processing with new check_option function
  • Moved Level parameter checking into command-specific option checking areas
  • Updated bot cache handling to do it once per command (it could be cleared twice if bots were updated)
  • Updated bot cache handling to reload cache, not just clear it
  • Replaced array index variables with true constants (thanks, AmigoJack)
  • Replaced script version number, visiting/non-visiting level number and other variables with true constants
  • Created common match_bot bot matching function
  • Fixed bots array level/credit mapping
  • Fixed bug in list_bots where reporter output had an incorrect single quote in middle of string
  • Fixed bug where using request_var in get_parameter caused errors in parameters to be missed; new check_option works around that
  • Updated debug statements to use "magic" constants (like __FUNCTION__)
  • Fixed debug formatting in list_bots
  • Fixed debug output in List Format and Number parameter processing
  • Removed unnecessary references to $config phpBB global variable
  • Removed unnecessary TRUE argument calling delete_bots
For help after uploading the script, type something like http://example.com/phpbb3/manage_bots.php?? (note the two question marks!).

I have tested this, but want additional testing before I consider the release final. Please do not post questions or bug reports here; PM me if you find any bugs or have any questions. If enough questions about something come up, I will post them here.

Steve
Attachments
manage_bots.php
Beta version of manage_bots 6.0.
(188.1 KiB) Downloaded 213 times
trigger_error
I've Been Banned!
Posts: 200
Joined: Mon Feb 21, 2011 1:27 pm

Re: Spiders & bots to add to phpBB

Post by trigger_error »

good work :geek: