[DEV] phpBB spam hammer

A place for MOD Authors to post and receive feedback on MODs still in development. No MODs within this forum should be used within a live environment!

Postby dangerousprototypes » Tue Feb 22, 2011 11:24 am

Modification Name: phpBB new user spam hammer (or something better...)

Author: DangerousPrototypes, based on the great work of Philthy and others. All automod packages made possible by Philthy. Thanks for making it happen!

Modification Description: This MOD places simple restrictions on new users to keep spammers from posting what they want: links. With this mod we no longer need annoying captchas, registration questions, or user moderation. We even enabled guest posting. The spam that does get through is lame and impotent without a link. See the full documentation and feature description here.

    * Disables external links and bad words in posts, messages, signatures, and profiles until a user reaches a specified age and/or number of posts
    * Own-site links are excluded from the filter, other sites can be added to a whitelist
    * Configurable list of forbidden filter words, optionally show the user the trigger term
    * Configurable unicode filters to prevent spam in languages not used in the forum
    * Prevents 'sleeper agents' by disabling posts, signatures, and profiles for old accounts with 0 posts (optional)
    * Zombie registration cleanup deletes old users with no posts (optional)
    * Log of all filter actions (optional)
    * Protections are 'automatically' disabled after the criteria are met (no moderation required)
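
As a rough illustration of the first and last bullets, the age/post-count test can be sketched as a standalone function. This is a simplified sketch modelled on the criteria described above, not the MOD's actual code: the function name `needs_link_filter` and its parameters are made up for the example, and the special handling of admins and guests is left out.

```php
<?php
// Hypothetical helper, not part of the MOD: decides whether a registered
// user is still "new" and should have links and bad words filtered.
function needs_link_filter($user_posts, $user_regdate, $minimum_posts, $minimum_days, $now)
{
    // Too few posts so far?
    $too_few_posts = $user_posts < $minimum_posts;

    // Registered less than $minimum_days days ago? (86400 seconds per day)
    $too_young = $user_regdate > ($now - 86400 * $minimum_days);

    // Either condition keeps the filter on, so the protection ends on its own
    // once the user has enough posts AND the account is old enough.
    return $too_few_posts || $too_young;
}

$now = 1298373840; // fixed timestamp so the example is repeatable

// Registered 2 hours ago with 0 posts: still filtered
var_dump(needs_link_filter(0, $now - 7200, 1, 1, $now));
// Registered 3 days ago with 5 posts: no longer filtered
var_dump(needs_link_filter(5, $now - 3 * 86400, 1, 1, $now));
```

No moderator action is involved: the same test simply starts returning false once both thresholds are met.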

NEW EXTREME mode:
You can hack the code to enable EXTREME mode with honeypot goodness... (not recommended for mortal SYSOPS!!!)
    * Auto-delete 0 post users who try to post a profile
    * Auto-delete 0 post users who try to post a spammy signature
    * Auto-delete 0 post users who try to post a VERY spammy post (configurable link limit)
    * All EXTREME mode actions are automatically logged

Modification Version: package 0.0.3 (includes class from SVN r753)

Screenshots: [three screenshots, not preserved]

Demo URL: http://www.dangerousprototypes.com/forum

Modification Download:

This is a continuation of the work on the Disable links for new users mod.

Thanks to Philthy for providing the packaged version!

Here are some possible future features:
    * Sliding window based on first post instead of registration date; this gives admins more time to spot spammers before they strike
    * Actually disable fields and submit button in user profile; scripts that submit anyway are auto-deleted :)

Suggestions and complaints accepted below.

AUTOMOD ISSUES
There are two minor automod issues:

SQL ERRORS ON UNINSTALL
*Several empty SQL errors on uninstall.
I googled but didn't see how to fix it.
You can:
1. Force the uninstall anyway; it will just leave the settings in the database, where they will be ignored.
2. Force the uninstall, then use phpMyAdmin or a similar tool to delete the entries.

SQL ERRORS ON RE-INSTALL
*Duplicate entry errors on re-install
Related to above error: the entries are still in the settings database.
You can:
1. Force the install, your previous settings will be restored.
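
For option 2 under the uninstall errors, here is a small sketch that builds the cleanup query. It assumes the default `phpbb_` table prefix and only lists the config keys read by the r737 class posted later in this thread; the package may store others, so check the config table before deleting anything.

```php
<?php
// Config keys read by functions_link_filter.php r737 (other versions may differ).
$mod_keys = array(
    'links_after_num_days',
    'links_after_num_posts',
    'links_disable_sleepers',
    'links_allow_always',
    'links_link_strings',
    'links_word_strings',
    'links_unicode_filter',
    'links_nonunicode_percent',
    'links_help_url',
);

// Assumption: default table prefix "phpbb_"; change it to match your board.
$table = 'phpbb_config';

// Quote each key and build one DELETE statement to paste into your SQL tool.
$quoted = array();
foreach ($mod_keys as $key) {
    $quoted[] = "'" . $key . "'";
}
$sql = 'DELETE FROM ' . $table . ' WHERE config_name IN (' . implode(', ', $quoted) . ');';

echo $sql, "\n";
```

phpBB caches its config values, so purging the cache from the ACP afterwards is probably a good idea.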
Last edited by dangerousprototypes on Thu Apr 14, 2011 9:15 am, edited 9 times in total.
Please do not PM or mail with questions. Ask in the forum where everyone can share the answer.
dangerousprototypes
Registered User
 
Posts: 91
Joined: Fri Feb 11, 2011 5:53 am

Re: [DEV] phpBB spam hammer

Postby darkonia » Tue Feb 22, 2011 11:45 am

Looks very good, thanks for continuing the work and taking it to a new level :ugeek:
MMOG-Heaven - The Gaming Portal
Community means finding like-minded people - MMOG-Heaven is your community! By players, for players, this portal offers exactly what a player needs. Stay constantly informed, find the newest MMORPGs, or look up the freshest news from the world of online gaming - this and much more awaits you at MMOG-Heaven.
darkonia
Registered User
 
Posts: 1901
Joined: Tue May 13, 2008 1:10 pm
Location: Munich, Germany

Re: [DEV] phpBB spam hammer

Postby Philthy » Tue Feb 22, 2011 12:26 pm

Nice one DP ;)

My observations after testing this mod:
1: The sql insertion entered the fields in the database, but not the values. I simply entered them manually from the admin panel.
2: I found the 73 character minimum first post to be a little too restrictive, and edited it manually in the class. Maybe something that could be set in the admin panel?

Like you, I have guest posting enabled, and don't have any problems.

Here is a test forum, with almost all restrictions removed to attract spammers:
Spam the hell out of this boys !

Hopefully, this test forum will give us a few pointers ? It will get deleted once we've finished testing.

Edit to add:
I have only set three spam words to filter, the popular blue pill being one of the words. I'm hoping to harvest others from any posts that may make it through.

DP, do you want an admin login to test with?
Last edited by Philthy on Wed Apr 13, 2011 4:55 pm, edited 1 time in total.
Go on ! it's not as steep as it looks.....
Philthy
Registered User
 
Posts: 210
Joined: Tue Dec 27, 2005 10:05 am
Location: Dawlish, Devon

Re: [DEV] phpBB spam hammer

Postby heredia21 » Tue Feb 22, 2011 2:27 pm

I tried this last time. When I hit reply it took me to your install instructions? And when using quick reply it lets users with 0 posts post links.
Best BlackBerry website for all users! BlackBerry News - http://blackberryempire.com
heredia21
Registered User
 
Posts: 942
Joined: Sun Apr 18, 2010 6:14 pm

Re: [DEV] phpBB spam hammer

Postby Philthy » Tue Feb 22, 2011 2:32 pm

heredia21 wrote:I tried this last time. When I hit reply it took me to your install instructions? And when using quick reply it lets users with 0 posts post links.


can you post a link to your forum with the rogue post in it please.

Re: [DEV] phpBB spam hammer

Postby heredia21 » Tue Feb 22, 2011 2:37 pm

Yeah sure link to forum is www.blackberryempire.com/forum... Test account with 0 posts just posted a link here - http://www.blackberryempire.com/forum/v ... ead#unread

Re: [DEV] phpBB spam hammer

Postby John T. Folden » Wed Feb 23, 2011 2:25 am

Philthy wrote:2: I found the 73 character minimum first post to be a little too restrictive, and edited it manually in the class. Maybe something that could be set in the admin panel?


Ahh, glad you noticed that. I probably would have had a few guest posters be none too pleased with that minimum (Twitter and Facebook seem to be teaching people to keep their replies short :roll: ).

Which brings up a good question: what options are set in the functions_link_filter.php file but not yet accessible via the ACP? I had a look through the file and *think* I understand everything but can't be sure, and quite a few people installing this mod may be even less sure than me. ;)

I'll be installing this shortly and disabling reCaptcha to see what happens. :mrgreen:

Philthy wrote:Here is a test forum, with almost all restrictions removed to attract spammers:
Spam the hell out of this boys !

Hopefully, this test forum will give us a few pointers ? It will get deleted once we've finished testing.

I added a few posts to it earlier, pulled from my own spam cemetery...plus that long list of domain extensions. One set of extensions I didn't think about is those used by URL shortening services.

Also, might there be a way to 'record' an entry to the user log whenever a post gets denied and the reason (or perhaps another, unique "Spam log")?
The Blue Whale Pub - SPN/SF/F TV Discussion Forum
ZOMBIE ALERT: The Walking Dead are coming to AMC!
John T. Folden
Registered User
 
Posts: 188
Joined: Tue Sep 04, 2007 12:16 am

Re: [DEV] phpBB spam hammer

Postby John T. Folden » Wed Feb 23, 2011 5:25 am

Okay, just installed this.... getting ready to give it a test... I have a few questions/comments.

*In the installation instructions for editing /includes/acp/acp_board.php part of the new code includes adding 'legend3' but there is already a 'legend3' just below the area to add the new text in question.... should the pre-existing legend3 be renamed to legend4? If so, this should be in the instructions.

*Disable sleeper agents looks to be enabled by default. Is this setting/function still active without the editing to cron.php? If true, and since it's unclear how to 're-enable' a disabled account, this should probably be disabled on a default install.

*Are all the language strings stored in functions_link_filter.php? I'd like to edit some of these but it could be tiresome if they need to be re-edited every time this file is updated.

Re: [DEV] phpBB spam hammer

Postby John T. Folden » Wed Feb 23, 2011 5:44 am

Another issue... I left the Filter Help Link empty in ACP but the antispam message in UCP/Profile is still pointing to http://dangerousprototypes.com/forum/vi ... 886#p17886

Re: [DEV] phpBB spam hammer

Postby dangerousprototypes » Wed Feb 23, 2011 8:08 am

Thanks for the great feedback! Found some bugs, an updated class is at the end of this post, or you can get the latest version from SVN.
---
Philthy wrote:1: The sql insertion entered the fields in the database, but not the values. I simply entered them manually from the admin panel.

In my experience they do get set (in the DB), but not picked up by phpBB. Then phpBB gives a SQL error on first update, but a refresh clears it up. I copied the SQL statements from the previous mod exactly, even though I wanted to combine them into a single statement :) I don't understand this at all; maybe there is an additional field that needs to be set, or maybe some of the text in the query is causing issues.

Philthy wrote:2: I found the 73 character minimum first post to be a little too restrictive, and edited it manually in the class. Maybe something that could be set in the admin panel?

That is a goal, but adding new ACP options is a huge pain.

Philthy wrote:I have only set three spam words to filter, the popular blue pill being one of the words. I'm hoping to harvest others from any posts that may make it through.

There is a harvested list of spam words in the documentation too.

Philthy wrote:DP, do you want an admin login to test with?

Not unless you have problems with the mod.
---
heredia21 wrote:I tried this last time. When I hit reply it took me to your install instructions? And when using quick reply it lets users with 0 posts post links.


I can only think that the install went horribly wrong, or there is a conflicting plugin. The only place the install instructions link is located is in a comment at the top of the class. There is no code in the class that would do that, so non-code must be executing. I'll assume the quickreply issue is related until we find out what causes the other bug, quick reply is filtered without problem on my site (quick reply and regular reply are processed by the same phpBB code).
---
John T. Folden wrote:I added a few posts to it earlier, pulled from my own spam cemetery...plus that long list of domain extensions. One set of extensions I didn't think about is those used by URL shortening services.


Interesting list. I noticed a few things in the honeypot:
*There are no actual links. Maybe viagrra dot com is unwanted, or spam.ly, but it's not a list of 100s of linked URLs ready for clicking and search engine indexing. I consider that a success :) We'll never get around a determined spammer, but we can make it hard, make them obvious, and keep them from getting what they want (a link). The http:// filter kills any auto-linking as far as I know. I actually don't disallow dot com, etc, in the forum because I want new users to still be able to post a link if they need to, but it doesn't do spammers much good with the search engines.
*Similar with broken words (men like viagr a for fore x marke t trading). If a spammer goes to so much effort for such a small gain, there's not much that can be done. I actually see this as a benefit because normal users can avoid an overly aggressive filter, but spammers still don't get any SEO or click benefit.
*Some words in the subject are not filtered (may not be enabled on the test site, such as f***ing).

John T. Folden wrote:Also, might there be a way to 'record' an entry to the user log whenever a post gets denied and the reason (or perhaps another, unique "Spam log")?

Adding to the log is pretty easy, but every entry is a round-trip to the database which eats up cycles and space. No idea about a new log.

John T. Folden wrote:*In the installation instructions for editing /includes/acp/acp_board.php part of the new code includes adding 'legend3' but there is already a 'legend3' just below the area to add the new text in question.... should the pre-existing legend3 be renamed to legend4? If so, this should be in the instructions.

The legend is just a break, I don't think the name is important, but if it is duplicated I suppose everything should be updated.

John T. Folden wrote:*Disable sleeper agents looks to be enabled by default. Is this setting/function still active without the editing to cron.php? If true, and since it's unclear how to 're-enable' a disabled account, this should probably be disabled on a default install.

Disable sleepers only disables accounts that have not posted at all during the minimum time window. It can be disabled from the ACP, and currently there is no default option because the settings we enter don't 'take' for some reason :)

Edits to cron.php are only for the zombie registration purge.
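
To make the sleeper rule concrete, here is a minimal standalone sketch of the test. The function name `is_sleeper` and its parameters are made up for illustration; in the class the same values come from phpBB's user data and the MOD settings.

```php
<?php
// Hypothetical standalone version of the sleeper test; names are illustrative.
function is_sleeper($user_posts, $user_regdate, $minimum_days, $now)
{
    // A sleeper has never posted and registered at least $minimum_days ago.
    return $user_posts == 0 && $user_regdate <= ($now - 86400 * $minimum_days);
}

$now = 1298448480; // fixed timestamp so the example is repeatable

var_dump(is_sleeper(0, $now - 10 * 86400, 1, $now)); // old account, no posts
var_dump(is_sleeper(0, $now - 3600, 1, $now));       // brand-new account
var_dump(is_sleeper(4, $now - 10 * 86400, 1, $now)); // old account with posts
```

Only the first case is treated as a sleeper. An account that posts at least once inside the window never trips the check later.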

John T. Folden wrote:*Are all the language strings stored in functions_link_filter.php? I'd like to edit some of these but it could be tiresome if they need to be re-edited every time this file is updated.

For the user yes, but the ACP strings are in the ACP file. I think the correct way to customize the strings is to add them to your own language file. All the text does something like this:
Code:
      //If there isn't a phpbb3 sleeper agent message add one
      if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
         $user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
         //could delete the user automatically here
      }


So it only adds the default if one doesn't exist already. I don't know where to do this though, I don't really know anything about phpBB.
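
One possible place, assuming the usual phpBB 3.0 MOD convention, is a language file under language/en/mods/ that gets loaded with $user->add_lang() before the filter runs. The sketch below is only an illustration of how an override and the class defaults would interact: the file name link_filter.php, the replacement text, and the helper `lang_or_default` are all made up, and `$lang` is initialised by hand so the snippet runs on its own.

```php
<?php
// Contents of a hypothetical language/en/mods/link_filter.php.
// In phpBB this file would normally start with the IN_PHPBB check.
$lang = array(); // phpBB provides $lang; initialised here so the sketch runs alone
$lang = array_merge($lang, array(
    'NO_SLEEPER_SPAM_FOR_YOU' => 'Your account is on hold, please contact the board admin.',
));

// Stand-in for the class logic: only fill in a default when no string is set.
function lang_or_default(array &$lang, $key, $default)
{
    if (empty($lang[$key])) {
        $lang[$key] = $default;
    }
    return $lang[$key];
}

// The custom string wins because it was defined first.
echo lang_or_default($lang, 'NO_SLEEPER_SPAM_FOR_YOU', 'Antispam: account disabled, please contact an admin.'), "\n";
// No custom string here, so the built-in default is used.
echo lang_or_default($lang, 'HELP_LINK', 'Click for help'), "\n";
```

Done this way, custom strings would survive updates to the class file, since the class only fills in keys that are still empty.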

John T. Folden wrote:Another issue... I left the Filter Help Link empty in ACP but the antispam message in UCP/Profile is still pointing to

Thanks! I forgot to initialize the database variable before doing the sleeper check and attaching some of the error messages. It is fixed in the attached class; just replace the existing class file in /includes/.

---
It sounds like I should go ahead and add these options to the ACP:
*Add minimum first post size to ACP
*Enable/disable, configure zombie purge from ACP
---
Latest class version r737:
Code:
<?php
/**
*
* functions_link_filter.php version r737
* @license http://opensource.org/licenses/gpl-license.php GNU Public License
* Modified by Ian Lesnet (http://dangerousprototypes.com)
* Documentation and install info here:
*  http://dangerousprototypes.com/docs/PhpBB3_MOD:_Disable_links_for_new_users
*
*/

/**
* @ignore
*/
if (!defined('IN_PHPBB'))
{
   exit;
}

class link_filter{

   //--settings variables--//
   // An array of no-nos. Add whatever you need...
   private $no_link_strings=array();//('http://', 'www.', '.com', '.net', '.org', '.uk', '.ly', '.me', '.ru', '.biz', '.info', 'dot com', 'dot net', 'dot org', 'dotcom', 'dotnet', 'dotorg', '_com', '_org', '_co_uk', '_ru', 'dot ru');

   //a secondary spam words filter
   private $no_word_strings=array();
   
   private $show_trigger_word=true; //show the user the word that triggered the error
   
   //URLs to ALLOW always, in addition to own-site URLs
   private $whitelist_urls=array('http://code.google.com', 'http://sourceforge.net',);
   
   //a regex to filter a unicode character range.
   //Will throw an error if any of these characters are included in a post.
   //leave blank to disable this feature
   //for unicode blocks see: http://www.fileformat.info/info/unicode/block/index.htm
   //0400-04FF - Cyrillic (Russian)
   //0600-06FF - Arabic
   //3100-312F - Mandarin Bopomofo block
   //example of chaining together: '/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   private $unicode_filter='/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   
   //what percentage of the post must be non-unicode characters (mostly useful for English-language sites)
   //leave blank to disable this feature
   private $minimum_nonunicode_text=0.95; //fraction of the post that must be non-unicode (0.95 = 95%)
   
   //where can the user get more help about the filter? Could be a forum post or web page
   //if blank the link will NOT be added to the error messages
   private $help_url = 'http://dangerousprototypes.com/forum/viewtopic.php?f=2&t=1846&p=17886#p17886';
   
   private $load_values_from_db=true; //enable to use with DB, or use the values below
   private $minimum_days=1; //minimum days as member to post links
   private $minimum_posts=1; //minimum posts for member to post links
   
   private $sleeper_check=true; //users with 0 posts who try to post after minimum_days will be prohibited
   
   private $first_post_length=73;//a minimum characters for the first post. enter 0 to disable
   
   //-- Reporting variables--//
   public $filter_user=false; //we decided to filter this user (they met our criteria)
   
   public $found_stuff=false; //did we find anything? (any item below)
   public $found_sleeper=false;//did we determine this user to be a sleeper agent?
   public $found_links=false; //we found links
   public $found_words=false;//we found bad words
   public $found_unicode=false;
   
   public $error=array(); //holds text error array

   
/**
*  Test if user can have a profile yet
*  returns false if they do NOT need to be filtered
*/
function link_filter_test_profile(){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if($this->link_filter_sleeper_check()) return true; //if it is a sleeper agent just return error, don't do the check
      
   //If there isn't a phpbb3 no_profile message add one
   if (empty($user->lang['NO_PROFILE_FOR_YOU'])){
      $user->lang['NO_PROFILE_FOR_YOU']='Antispam: You can\'t have a profile yet. You need to post a few times first.';
   }
   
   $this->error[]=$user->lang['NO_PROFILE_FOR_YOU'].' '.$this->link_filter_add_help_link();
   
   return true;
}   

/**
*  Test a submitted signature for links and words
*  Returns true if bad things detected
*/
function link_filter_test_signature($signature){
   global $user;

   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if($this->link_filter_sleeper_check()) return true; //if it is a sleeper agent just return error, don't do the check
   
   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Antispam: You can\'t have off-site URLs in your sig until you post a few times. ';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Do you kiss your mom with that mouth? We don\'t want to read that! ';
   }
   
   //make a version of the post and subject
   //need the trailing space or it can hang forever in the while loop if only using a local URL
   return $this->link_filter_test(' '.trim($signature).' ');
   
}

/**
*  Test a submitted PM for links and words
*  Returns true if bad things detected
*/
function link_filter_test_pm($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //if($this->found_sleeper) return true; //if it is a sleeper agent we still allow PMs to contact an admin, but we still filter them

   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your message looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your message looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   return $this->link_filter_test(' '.trim($message.' '.$subject).' ');
}

/**
*  Test a submitted post for links and words
*  Returns true if bad things detected
*/
function link_filter_test_post($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //need to do this here or we don't get the site's unique help URL
   //better to do it here (and again below) than to do it for every user before the check
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if( (($user->data['user_posts']==0)||($user->data['user_type']==USER_IGNORE)||($user->data['user_id']==ANONYMOUS))&& (strlen($message)<$this->first_post_length)){//first post, check length
      
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['NO_LINK_TOO_SHORT'])){
         $user->lang['NO_LINK_TOO_SHORT']='Antispam: Sorry, your first post needs to be just a little longer.';
      }
      $this->error[]=$user->lang['NO_LINK_TOO_SHORT'].' '.$this->link_filter_add_help_link();
      return true;
   }
   
   if($this->link_filter_sleeper_check()) return true; //if it is a sleeper agent just return error, don't do the check

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your post looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your post looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   return $this->link_filter_test(' '.trim($message.' '.$subject).' ');
}

/**
*    Do we need to check this user?
*
*/
function link_filter_check()
{
   global $user, $config;

   if($this->load_values_from_db){ //use MOD setting from database
      $this->minimum_days=$config['links_after_num_days'];
      $this->minimum_posts=$config['links_after_num_posts'];
      if($config['links_disable_sleepers']=='1'){
         $this->sleeper_check=true;
      }else{
         $this->sleeper_check=false;
      }
   }

   //check if the user meets filter criteria
   $this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))));

   //this MIGHT be used in a bigger filter to only apply to new users.
   //$this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_new']==1 &&(($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))))));
   
   //If you're not special, we filter you
   return $this->filter_user;
}

/**
*    check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
*
*/
function link_filter_sleeper_check(){
   global $user;
   
   //check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
   if(($this->sleeper_check) && ($user->data['user_id']!=ANONYMOUS) && ($user->data['user_posts']==0) && ($user->data['user_regdate']<=((time())-(86400*$this->minimum_days)))){
      $this->found_sleeper=$this->found_stuff=true;
      
      //If there isn't a phpbb3 sleeper agent message add one
      if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
         $user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
         //could delete the user automatically here
      }
      $this->error[]=$user->lang['NO_SLEEPER_SPAM_FOR_YOU'].' '.$this->link_filter_add_help_link();
      return true;
   }
   return false;
}

/**
*    Add a help link if it exists

*/
function link_filter_add_help_link(){
   global $user; //needed for the language strings below

   if(!empty($this->help_url)){
   
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['HELP_LINK'])){
         $user->lang['HELP_LINK']='Click for help';
      }
      
      return '<a href="'.$this->help_url.'">'.$user->lang['HELP_LINK'].'</a>.';
   }
   return ''; //no help URL configured, add nothing
}

/**
*    if database is enabled this will explode the filter lists into the arrays
*  We do it here so this only happens when we need to filter the user
*/
function link_filter_load_list_from_db(){
   global $config;
   //use MOD setting from database
   $this->whitelist_urls=explode(",", $config['links_allow_always']);
   $this->no_link_strings=explode(",", $config['links_link_strings']);
   $this->no_word_strings=explode(",", $config['links_word_strings']);
   $this->unicode_filter=$config['links_unicode_filter'];
   $this->minimum_nonunicode_text=(float)$config['links_nonunicode_percent'];
   $this->help_url = $config['links_help_url'];
}

/**
*    Search the text for forbidden URLs and text.
*   Add an error to the local error array if found
*   returns true for bad stuff, false for no flags found
*
*/
function link_filter_test($no_link_message){
   global $user, $config;

   //filter the looozers
   //remove line feeds and stuff
   //(escape sequences need double quotes; '\n' in single quotes is a literal backslash and n)
   $no_link_message=str_replace(array("\r", "\n"), ' ', $no_link_message);
   //collapse runs of spaces into a single space
   while (strpos($no_link_message, '  ')!==false){
      $no_link_message=str_replace('  ', ' ', $no_link_message);
   }
   
   //remove any own-site references, these are ok
   //first change http://mysite.com to mysite.com so we only have to look once below
   $no_link_message=str_replace($config['server_protocol'].$config['server_name'], $config['server_name'], $no_link_message);

   //whitelist other common domains too
   //we do this by replacing them with our own domain so we only have to run the search once below
   for ($x=0;$x<sizeof($this->whitelist_urls);$x++){
      if(stripos($no_link_message, $this->whitelist_urls[$x])){
         $no_link_message=str_ireplace($this->whitelist_urls[$x], $config['server_name'], $no_link_message);   
      }
   }
   
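   //look at all instances of mysite.com
   while ($ok_start=stripos($no_link_message, $config['server_name'])){ //start of mysite.com
      $ok_bracket=strpos($no_link_message, '[', $ok_start); //next [ (bbcode?)
      $ok_space=strpos($no_link_message, ' ', $ok_start); //next space
      //the own-site URL ends at whichever comes first, so text between the URL and a later [ is still checked
      if ($ok_bracket!==false && $ok_space!==false){
         $ok_end=min($ok_bracket, $ok_space);
      }elseif ($ok_bracket!==false){
         $ok_end=$ok_bracket;
      }elseif ($ok_space!==false){
         $ok_end=$ok_space;
      }else{
         $ok_end=strlen($no_link_message); //no terminator found, cut to the end so the loop cannot hang
      }
      $no_link_message=substr($no_link_message, 0, $ok_start).substr($no_link_message, $ok_end);//remove own URL
   }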
      

   //search for each link element, throw an error when found
   for ($x=0;$x<sizeof($this->no_link_strings);$x++){
      if (stripos($no_link_message, $this->no_link_strings[$x])){
         
         $this->found_links=$this->found_stuff=true;
                  
         $this->error[]=$user->lang['NO_LINK_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         //$x=sizeof($no_link_strings);
         break;//no reason to go further
      }
   }
   
   //search each word, throw an error when found
   for ($x=0;$x<sizeof($this->no_word_strings);$x++){
      if (stripos($no_link_message, $this->no_word_strings[$x])){
      
         $this->found_words=$this->found_stuff=true;
      
         if($this->show_trigger_word){//show the cause so the user isn't stumped
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$this->no_word_strings[$x].') '.$this->link_filter_add_help_link();
         }else{//don't show the cause
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         }
         
         //$x=sizeof($no_link_strings);
         break;//no reason to search further
      }
   }
   
   if($this->found_stuff==true) return true; //don't continue checking below
   
   //make a smaller subset for preg_match to save time and cycles
   if(strlen($no_link_message) > 512){
      $no_link_message = substr($no_link_message, 0, 512);
   }
   
   //Check for unicode characters that don't belong in the language of this forum   
   if(!empty($this->unicode_filter)){
      if(preg_match($this->unicode_filter, $no_link_message, $m)==1){ //test for unicode character ranged defined by user
      
         $this->found_unicode=$this->found_stuff=true;

         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         return true; //don't continue checking below
      }
   }
   
   //check the percentage of the text that is NOT unicode
   //see http://www.mawhorter.net/web-development/easily-detecting-if-a-block-of-text-is-written-in-english-non-unicode-languages
   if((!empty($this->minimum_nonunicode_text))){

      //found the length in unicode mode
      $ulen = preg_match_all("#.#u", $no_link_message, $m);
      //find length without unicode
      $len  = preg_match_all('#.#', $no_link_message, $m);
      
      //determine if % of non-unicode is enough
      if(($ulen/$len)<(float)$this->minimum_nonunicode_text){
         $this->found_unicode=$this->found_stuff=true;
         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$ulen.'/'.$len.') '.$this->link_filter_add_help_link();
      }   
   }

   return $this->found_stuff;

}

function link_filter_purge_zombies(){
   global $db, $config;
   
   if($this->load_values_from_db){ //use MOD setting from database
      $this->minimum_days=$config['links_after_num_days'];
   }
   
   if($this->minimum_days<1) return; //don't delete if there is no days setting

   // Get bot ids
   $sql = 'SELECT user_id
      FROM ' . BOTS_TABLE;
   $result = $db->sql_query($sql);

   $bot_ids = array();
   while ($row = $db->sql_fetchrow($result))
   {
      $bot_ids[] = $row['user_id'];
   }
   $db->sql_freeresult($result);

   // Select the group of users to delete
   $sql = 'SELECT user_id, username
      FROM ' . USERS_TABLE . '
      WHERE user_id <> ' . ANONYMOUS . '
         AND user_type <> ' . USER_FOUNDER .'
         AND user_regdate < ' . gmmktime(0, 0, 0, date("m"), (date("d")-$this->minimum_days), date("Y")).'
         AND user_posts = 0';
   $result = $db->sql_query($sql);

   $usernames = array();

   //this would be simpler with a SQL statement, but recycling the phpBB prune function makes it more robust
   while ($row = $db->sql_fetchrow($result))
   {

      // Do not prune bots.
      if (!in_array($row['user_id'], $bot_ids))
      {
         user_delete('remove', $row['user_id']);//delete the user and all posts (there should be none though)
         //user_delete('retain', $row['user_id'],$row['username']); //delete users but not posts, safer just in case
         $usernames[]=$row['username'];//keep the list for the log
      }
   }
   $db->sql_freeresult($result);
   
   //add log message
   add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'Zombie registration cleanup by Disable links for new users MOD:'.implode(', ', $usernames));

}

}//class

?>
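The non-unicode percentage check at the end of link_filter_test can be hard to picture, so here is a minimal stand-alone sketch of the idea, assuming UTF-8 input. char_to_byte_ratio is a hypothetical helper and not part of the MOD. It uses strlen() for the byte count, where the class runs a second preg_match_all without the u modifier, so results can differ slightly on text with line feeds.

```php
<?php
// Hypothetical helper (not in the MOD): ratio of UTF-8 characters to bytes.
// Plain ASCII gives 1.0; text made of multi-byte characters gives a lower value.
function char_to_byte_ratio($text)
{
    $bytes = strlen($text); // strlen() counts bytes, not characters
    if ($bytes === 0) {
        return 1.0; // nothing to judge, treat as plain text
    }
    // With the u modifier "." matches one whole UTF-8 character;
    // the s modifier lets it match line feeds as well.
    $chars = preg_match_all('/./us', $text, $m);
    if ($chars === false) {
        return 0.0; // invalid UTF-8, treat as suspicious
    }
    return (float)($chars / $bytes);
}

$ascii    = 'Hello, this is plain text.';
$cyrillic = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82"; // six Cyrillic letters, two bytes each

$minimum = 0.95; // same meaning as $minimum_nonunicode_text in the class
var_dump(char_to_byte_ratio($ascii) >= $minimum);    // bool(true)
var_dump(char_to_byte_ratio($cyrillic) >= $minimum); // bool(false)
```

A post passes when the ratio stays at or above the configured minimum, so a setting of 0.95 tolerates a few accented characters but not a post written mostly in a multi-byte script.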
Please do not PM or mail with questions. Ask in the forum where everyone can share the answer.
dangerousprototypes
Registered User
 
Posts: 91
Joined: Fri Feb 11, 2011 5:53 am

Re: [DEV] phpBB spam hammer

Postby dangerousprototypes » Wed Feb 23, 2011 12:52 pm

I tried to pull in all the requested features and bug fixes at once. The class below is updated and the wiki has updated install instructions.

The logging is really interesting: you can see who is a bot and who is a real person fighting with the filter. Automated profile posts are obvious from the double log entry. I think removing the submit button and deleting users who post anyway would make a great spam trap.

*In the installation instructions for editing /includes/acp/acp_board.php, part of the new code adds 'legend3', but there is already a 'legend3' just below the place where the new text goes. Should the pre-existing legend3 be renamed to legend4? If so, this should be in the instructions.

I had already made this change in my own code. The instruction is now a "find and replace" that also makes the submit button legend4.

Bug fixes:
*DB values were loaded too late
*help URL error
*ACP additions corrected so submit is legend4
*moved the disable sleeper agent option under the min days setting to make it more obvious

Features:
*verbose logging of all actions, even approvals (for new users)
*ACP: toggle zombie purge
*ACP: set zombie purge days
*ACP: set minimum first post length (or none)
*ACP: toggle logging

Updates require:
*New SQL field
*edits to acp_board.php, board.php (add new labels and fields)
*copy new functions_link_filter.php
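
For anyone trying to picture how the minimum days, minimum posts and sleeper agent settings interact, here is a minimal stand-alone sketch of the decision. classify_account and its parameters are hypothetical names for illustration only. The class reads the same values from $user->data and $config, and also handles admins, bots and guests, which are left out here.

```php
<?php
// Hypothetical stand-alone helper (not in the MOD).
// $regdate and $now are Unix timestamps, as in phpBB's user_regdate.
function classify_account($posts, $regdate, $now, $min_posts, $min_days, $sleeper_check)
{
    $cutoff = $now - 86400 * $min_days; // accounts registered after this are still "new"
    $is_new = ($posts < $min_posts) || ($regdate > $cutoff);
    if (!$is_new) {
        return 'trusted'; // criteria met, the filter switches itself off
    }
    if ($sleeper_check && $posts == 0 && $regdate <= $cutoff) {
        return 'sleeper'; // old account that never posted
    }
    return 'filtered'; // normal new-user restrictions apply
}

$now = time();
var_dump(classify_account(5, $now - 10 * 86400, $now, 1, 1, true)); // string(7) "trusted"
var_dump(classify_account(0, $now - 3600, $now, 1, 1, true));       // string(8) "filtered"
var_dump(classify_account(0, $now - 10 * 86400, $now, 1, 1, true)); // string(7) "sleeper"
```

Note that a sleeper is only detected among accounts that are already being filtered, which is why the sleeper option sits next to the min days setting.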

Code: Select all
<?php
/**
*
* functions_link_filter.php version r739
* @license http://opensource.org/licenses/gpl-license.php GNU Public License
* Modified by Ian Lesnet (http://dangerousprototypes.com)
* Documentation and install info here:
*  http://dangerousprototypes.com/docs/PhpBB3_MOD:_Disable_links_for_new_users
*
*/

/**
* @ignore
*/
if (!defined('IN_PHPBB'))
{
   exit;
}

class link_filter{

   //--settings variables--//
   // An array of no-nos. Add whatever you need...
   private $no_link_strings=array();//('http://', 'www.', '.com', '.net', '.org', '.uk', '.ly', '.me', '.ru', '.biz', '.info', 'dot com', 'dot net', 'dot org', 'dotcom', 'dotnet', 'dotorg', '_com', '_org', '_co_uk', '_ru', 'dot ru');

   //a secondary spam words filter
   private $no_word_strings=array();
   
   private $show_trigger_word=true; //show the user the word that triggered the error
   
   //URLs to ALLOW always, in addition to own-site URLs
   private $whitelist_urls=array('http://code.google.com', 'http://sourceforge.net',);
   
   //a regex to filter a unicode character range.
   //Will throw an error if any of these characters are included in a post.
   //leave blank to disable this feature
   //for unicode blocks see: http://www.fileformat.info/info/unicode/block/index.htm
   //0400-04FF - Cyrillic (Russian)
   //0600-06FF - Arabic
   //3100-312F - Mandarin Bopomofo block
   //example of chaining together: '/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   private $unicode_filter='/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   
   //what percentage of the post must be non-unicode characters (mostly useful for English language sites)
   //leave blank to disable this feature
   private $minimum_nonunicode_text=0.95; //percent of post that must be non-unicode
   
   //where can the user get more help about the filter? Could be a forum post or web page
   //leave blank to disable this feature; the link will NOT be added to the error messages
   private $help_url = 'http://dangerousprototypes.com/forum/viewtopic.php?f=2&t=1846&p=17886#p17886';
   
   private $load_values_from_db=true; //enable to use with DB, or use the values below
   private $minimum_days=1; //minimum days as member to post links
   private $minimum_posts=1; //minimum posts for member to post links
   
   private $sleeper_check=true; //users with 0 posts who try to post after minimum_days will be prohibited
   
   private $first_post_length=73;//a minimum characters for the first post. enter 0 to disable
   
   private $log_activity=false;//log entry for all activity (not recommended)
   
   //-- Reporting variables--//
   public $filter_user=false; //we decided to filter this user (they met our criteria)
   
   public $found_stuff=false; //did we find anything? (any item below)
   public $found_sleeper=false;//did we determine this user to be a sleeper agent?
   public $found_links=false; //we found links
   public $found_words=false;//we found bad words
   public $found_minwords=false;//we found too few words
   public $found_unicode=false;//found unicode
   public $found_profile=false;//profile-not-allowed check was positive (for log)
   public $error=array(); //holds text error array

   /*
   *   Log filter actions
   *
   */
   function link_add_log($type,$no_link_message){
      global $user;
      $l='Checked '.$type.' for \''.$user->data['username'].'\' ';
      if($this->found_stuff){
         $l.='DETECTED: ';
         if($this->found_sleeper)$l.='sleeper agent, ';
         if($this->found_links)$l.='links, ';
         if($this->found_words)$l.='bad words, ';
         if($this->found_minwords)$l.='too few words, ';
         if($this->found_unicode)$l.='unicode, ';
         if($this->found_profile)$l.='profile disabled, ';
         $l.='ERRORS: '.implode(', ', $this->error);
         if(!empty($no_link_message)) $l.=' CONTENTS: '.$no_link_message; 
      }else{
         $l.='OK';
      }
      add_log('admin', 'LOG_SPAM_HAMMER', 'spam hammer MOD: '.$l);
   }
   
/**
*  Test if user can have a profile yet
*  returns false if they do NOT need to be filtered
*/
function link_filter_test_profile(){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if(!$this->link_filter_sleeper_check()){ //a sleeper agent already gets its own error, so only add the profile error for everyone else
         
      //If there isn't a phpbb3 no_profile message add one
      if (empty($user->lang['NO_PROFILE_FOR_YOU'])){
         $user->lang['NO_PROFILE_FOR_YOU']='Antispam: You can\'t have a profile yet. You need to post a few times first.';
      }
      
      $this->error[]=$user->lang['NO_PROFILE_FOR_YOU'].' '.$this->link_filter_add_help_link();
   }
   
   $this->found_stuff=$this->found_profile=true;
   
   if($this->log_activity) $this->link_add_log('PROFILE','');
   
   return true;
}   

/**
*  Test a submitted signature for links and words
*  Returns true if bad things detected
*/
function link_filter_test_signature($signature){
   global $user;

   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if($this->link_filter_sleeper_check()){
      if($this->log_activity) $this->link_add_log('SIGNATURE',$signature);
      return true; //if it is a sleeper agent just return error, don't do the check
   }
   
   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Antispam: You can\'t have off-site URLs in your sig until you post a few times. ';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Do you kiss your mom with that mouth? We don\'t want to read that! ';
   }
   
   //make a version of the post and subject
   //need the trailing space or it can hang forever in the while loop if only using a local URL
   $res=$this->link_filter_test(' '.trim($signature).' ');
   
   if($this->log_activity) $this->link_add_log('SIGNATURE',$signature);
   
   return $res;
   
}

/**
*  Test a submitted PM for links and words
*  Returns true if bad things detected
*/
function link_filter_test_pm($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //if($this->found_sleeper) return true; //if it is a sleeper agent we still allow PMs to contact an admin, but we still filter them

   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your message looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your message looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   $res=$this->link_filter_test(' '.trim($message.' '.$subject).' ');
   
   if($this->log_activity) $this->link_add_log('PM',$subject.' '.$message);
   
   return $res;   
   
}

/**
*  Test a submitted post for links and words
*  Returns true if bad things detected
*/
function link_filter_test_post($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //need to do this here or we don't get the site's unique help URL
   //better to do it here and then again below than to do it for every user before the check
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if( (($user->data['user_posts']==0)||($user->data['user_type']==USER_IGNORE)||($user->data['user_id']==ANONYMOUS))&& (strlen($message)<$this->first_post_length)){//first post, check length
      
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['NO_LINK_TOO_SHORT'])){
         $user->lang['NO_LINK_TOO_SHORT']='Antispam: Sorry, your first post needs to be just a little longer.';
      }
      $this->error[]=$user->lang['NO_LINK_TOO_SHORT'].' '.$this->link_filter_add_help_link();
      $this->found_stuff=$this->found_minwords=true; //flag the error
      
      if($this->log_activity) $this->link_add_log('POST',$subject.' '.$message);
      
      return true;
   }
   
   if($this->link_filter_sleeper_check()){
      if($this->log_activity) $this->link_add_log('POST',$subject.' '.$message);
      return true; //if it is a sleeper agent just return error, don't do the check
   }

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your post looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your post looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   $res=$this->link_filter_test(' '.trim($message.' '.$subject).' ');
   
   if($this->log_activity) $this->link_add_log('POST',$subject.' '.$message);
   
   return $res;   
}

/**
*    Do we need to check this user?
*
*/
function link_filter_check()
{
   global $user, $config;

   if($this->load_values_from_db){ //use MOD setting from database
      $this->minimum_days=$config['links_after_num_days'];
      $this->minimum_posts=$config['links_after_num_posts'];
      if($config['links_disable_sleepers']=='1'){
         $this->sleeper_check=true;
      }else{
         $this->sleeper_check=false;
      }
   }

   //check if the user meets filter criteria
   $this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))));

   //this MIGHT be used in a bigger filter to only apply to new users.
   //$this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_new']==1 &&(($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))))));
   
   //If you're not special, we filter you
   return $this->filter_user;
}

/**
*    check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
*
*/
function link_filter_sleeper_check(){
   global $user;
   
   //check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
   if(($this->sleeper_check) && ($user->data['user_id']!=ANONYMOUS) && ($user->data['user_posts']==0) && ($user->data['user_regdate']<=((time())-(86400*$this->minimum_days)))){
      $this->found_sleeper=$this->found_stuff=true;
      
      //If there isn't a phpbb3 sleeper agent message add one
      if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
         $user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
         //could delete the user automatically here
      }
      $this->error[]=$user->lang['NO_SLEEPER_SPAM_FOR_YOU'].' '.$this->link_filter_add_help_link();
      return true;
   }
   return false;
}

/**
*    Add a help link if it exists

*/
function link_filter_add_help_link(){
   global $user; //needed to read and set the language string below

   if(!empty($this->help_url)){
   
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['HELP_LINK'])){
         $user->lang['HELP_LINK']='Click for help';
      }
      
      return '<a href="'.$this->help_url.'">'.$user->lang['HELP_LINK'].'</a>.';
   }
   return ''; //no help URL configured
}

/**
*    if database is enabled this will explode the filter lists into the arrays
*  We moved it here so this only happens when we need to filter the user
*/
function link_filter_load_list_from_db(){
   global $config;
   //use MOD setting from database
   $this->whitelist_urls=explode(",", $config['links_allow_always']);
   $this->no_link_strings=explode(",", $config['links_link_strings']);
   $this->no_word_strings=explode(",", $config['links_word_strings']);      
   $this->unicode_filter=$config['links_unicode_filter'];
   $this->minimum_nonunicode_text=(float)$config['links_nonunicode_percent'];
   $this->help_url = $config['links_help_url'];
   $this->first_post_length=$config['links_first_post_words'];
   if($config['links_log_activity']=='1'){
      $this->log_activity=true;
   }else{
      $this->log_activity=false;      
   }
}

/**
*    Search the text for forbidden URLs and text.
*   Add an error to the local error array if found
*   returns true for bad stuff, false for no flags found
*
*/
function link_filter_test($no_link_message){
   global $user, $config;

   //filter the looozers
   //remove line feeds and stuff
   //double quotes are required here: '\n' in single quotes is a literal backslash followed by n
   $no_link_message=str_replace("\n", ' ',$no_link_message);
   $no_link_message=str_replace("\r", ' ', $no_link_message);
   //collapse runs of spaces into single spaces
   while (strpos($no_link_message, '  ')!==false){
      $no_link_message=str_replace('  ', ' ', $no_link_message);
   }
   
   //remove any own-site references, these are ok
   //first change http://mysite.com to mysite.com so we only have to look once below
   $no_link_message=str_replace($config['server_protocol'].$config['server_name'], $config['server_name'], $no_link_message);

   //whitelist other common domains too
   //we do this by replacing them with our own domain so we only have to run the search once below
   for ($x=0;$x<sizeof($this->whitelist_urls);$x++){
      if(stripos($no_link_message, $this->whitelist_urls[$x])){
         $no_link_message=str_ireplace($this->whitelist_urls[$x], $config['server_name'], $no_link_message);   
      }
   }
   
   //look at all instances of mysite.com
   while ($ok_start=stripos($no_link_message, $config['server_name'])){ //start of mysite.com
      $ok_end=strpos($no_link_message, '[', $ok_start); //find next [ (bbcode?)
      if (!$ok_end){ //if not bbcode
         $ok_end=strpos($no_link_message, ' ', $ok_start); //end is position of next space
      }
      if ($ok_end){
         $no_link_message=substr($no_link_message, 0, $ok_start).substr($no_link_message, $ok_end);//remove own URL
      }
   }
      

   //search for each link element, throw an error when found
   for ($x=0;$x<sizeof($this->no_link_strings);$x++){
      if (stripos($no_link_message, $this->no_link_strings[$x])){
         
         $this->found_links=$this->found_stuff=true;
                  
         $this->error[]=$user->lang['NO_LINK_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         //$x=sizeof($no_link_strings);
         break;//no reason to go further
      }
   }
   
   //search each word, throw an error when found
   for ($x=0;$x<sizeof($this->no_word_strings);$x++){
      if (stripos($no_link_message, $this->no_word_strings[$x])){
      
         $this->found_words=$this->found_stuff=true;
      
         if($this->show_trigger_word){//show the cause so the user isn't stumped
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$this->no_word_strings[$x].') '.$this->link_filter_add_help_link();
         }else{//don't show the cause
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         }
         
         //$x=sizeof($no_link_strings);
         break;//no reason to search further
      }
   }
   
   if($this->found_stuff==true) return true; //don't continue checking below
   
   //make a smaller subset for preg_match to save time and cycles
   if(strlen($no_link_message) > 512){
      $no_link_message = substr($no_link_message, 0, 512);
   }
   
   //Check for unicode characters that don't belong in the language of this forum   
   if(!empty($this->unicode_filter)){
      if(preg_match($this->unicode_filter, $no_link_message, $m)==1){ //test for unicode character ranges defined by user
      
         $this->found_unicode=$this->found_stuff=true;

         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         return true; //don't continue checking below
      }
   }
   
   //check the percentage of the text that is NOT unicode
   //see http://www.mawhorter.net/web-development/easily-detecting-if-a-block-of-text-is-written-in-english-non-unicode-languages
   if((!empty($this->minimum_nonunicode_text))){

      //find the length in unicode mode (number of characters)
      $ulen = preg_match_all("#.#u", $no_link_message, $m);
      //find the length without unicode (number of bytes)
      $len  = preg_match_all('#.#', $no_link_message, $m);

      //determine if % of non-unicode is enough (skip empty text to avoid dividing by zero)
      if(($len>0) && (($ulen/$len)<(float)$this->minimum_nonunicode_text)){
         $this->found_unicode=$this->found_stuff=true;
         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$ulen.'/'.$len.') '.$this->link_filter_add_help_link();
      }   
   }

   return $this->found_stuff;

}

/**
*    Delete old accounts that never posted (zombie registrations)
*
*/
function link_filter_purge_zombies(){
   global $db, $config;
   
   if($this->load_values_from_db){ //use MOD setting from database
      if($config['links_delete_zombies']!='1'){
         if($config['links_log_denials']=='1'){
            add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'Zombie registration cleanup by spam hammer MOD: (disabled)');
         }
         return; //honor ACP setting
      }
      
      $this->minimum_days=$config['links_delete_zombies_days']; //$config['links_after_num_days'];
   }
   
   if($this->minimum_days<1){
      if($config['links_log_denials']=='1'){
         add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'Zombie registration cleanup by spam hammer MOD: ERROR - days set to 0!');
      }
      return; //don't delete if there is no days setting
   }

   // Get bot ids
   $sql = 'SELECT user_id
      FROM ' . BOTS_TABLE;
   $result = $db->sql_query($sql);

   $bot_ids = array();
   while ($row = $db->sql_fetchrow($result))
   {
      $bot_ids[] = $row['user_id'];
   }
   $db->sql_freeresult($result);

   // Select the group of users to delete
   $sql = 'SELECT user_id, username
      FROM ' . USERS_TABLE . '
      WHERE user_id <> ' . ANONYMOUS . '
         AND user_type <> ' . USER_FOUNDER .'
         AND user_regdate < ' . gmmktime(0, 0, 0, date("m"), (date("d")-$this->minimum_days), date("Y")).'
         AND user_posts = 0';
   $result = $db->sql_query($sql);

   $usernames = array();

   //this would be simpler with a SQL statement, but recycling the phpBB prune function makes it more robust
   while ($row = $db->sql_fetchrow($result))
   {

      // Do not prune bots.
      if (!in_array($row['user_id'], $bot_ids))
      {
         user_delete('remove', $row['user_id']);//delete the user and all posts (there should be none though)
         //user_delete('retain', $row['user_id'],$row['username']); //delete users but not posts, safer just in case
         $usernames[]=$row['username'];//keep the list for the log
      }
   }
   $db->sql_freeresult($result);
   
   //add log message
   add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'Zombie registration cleanup by spam hammer MOD:'.implode(', ', $usernames));

}

}//class

?>
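
Two plain PHP behaviours matter for the string handling in link_filter_test: escape sequences such as \n are only interpreted inside double quotes, and strpos()/stripos() return the match position, which is 0 (falsy) for a match at the very start. The second point is one reason the test functions pad the text with a leading space before searching. A minimal stand-alone demo, where $post and $url are made-up sample values:

```php
<?php
// Stand-alone demo, independent of phpBB and of the link_filter class.

// 1. Escape sequences are only interpreted inside double quotes.
$post   = "line one\nline two";
$single = str_replace('\n', ' ', $post); // searches for a backslash followed by "n": no match
$double = str_replace("\n", ' ', $post); // searches for a real line feed: match

var_dump($single === $post);               // bool(true), nothing was replaced
var_dump($double === 'line one line two'); // bool(true)

// 2. stripos() returns the position of the match, and position 0 is falsy.
$url      = 'http://example.com';
$at_start = stripos($url, 'http://');     // int(0)
$padded   = stripos(' '.$url, 'http://'); // int(1)

var_dump((bool)$at_start);     // bool(false), a plain if() would miss this match
var_dump((bool)$padded);       // bool(true), the leading space moves the match to position 1
var_dump($at_start !== false); // bool(true), the strict comparison catches it either way
```

Comparing against false with !== is the more robust test, but it needs care with empty search strings, which is worth keeping in mind if the filter lists in the ACP are left blank.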

Re: [DEV] phpBB spam hammer

Postby dangerousprototypes » Wed Feb 23, 2011 1:22 pm

Minor cleanup:
*changed error log display
*moved user filter messages to user log
*user purge messages are still in admin log

99% tested, probably the final version.

Code: Select all
<?php
/**
*
* functions_link_filter.php version r743
* @license http://opensource.org/licenses/gpl-license.php GNU Public License
* Modified by Ian Lesnet (http://dangerousprototypes.com)
* Documentation and install info here:
*  http://dangerousprototypes.com/docs/PhpBB3_MOD:_Disable_links_for_new_users
*
*/

/**
* @ignore
*/
if (!defined('IN_PHPBB'))
{
   exit;
}

class link_filter{

   //--settings variables--//
   // An array of no-nos. Add whatever you need...
   private $no_link_strings=array();//('http://', 'www.', '.com', '.net', '.org', '.uk', '.ly', '.me', '.ru', '.biz', '.info', 'dot com', 'dot net', 'dot org', 'dotcom', 'dotnet', 'dotorg', '_com', '_org', '_co_uk', '_ru', 'dot ru');

   //a secondary spam words filter
   private $no_word_strings=array();
   
   private $show_trigger_word=true; //show the user the word that triggered the error
   
   //URLs to ALLOW always, in addition to own-site URLs
   private $whitelist_urls=array('http://code.google.com', 'http://sourceforge.net',);
   
   //a regex to filter a unicode character range.
   //Will throw an error if any of these characters are included in a post.
   //leave blank to disable this feature
   //for unicode blocks see: http://www.fileformat.info/info/unicode/block/index.htm
   //0400-04FF - Cyrillic (Russian)
   //0600-06FF - Arabic
   //3100-312F - Mandarin Bopomofo block
   //example of chaining together: '/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   private $unicode_filter='/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   
   //what percentage of the post must be non-unicode characters (mostly useful for English language sites)
   //leave blank to disable this feature
   private $minimum_nonunicode_text=0.95; //percent of post that must be non-unicode
   
   //where can the user get more help about the filter? Could be a forum post or web page
   //leave blank to disable this feature; the link will NOT be added to the error messages
   private $help_url = 'http://dangerousprototypes.com/forum/viewtopic.php?f=2&t=1846&p=17886#p17886';
   
   private $load_values_from_db=true; //enable to use with DB, or use the values below
   private $minimum_days=1; //minimum days as member to post links
   private $minimum_posts=1; //minimum posts for member to post links
   
   private $sleeper_check=true; //users with 0 posts who try to post after minimum_days will be prohibited
   
   private $first_post_length=73;//a minimum characters for the first post. enter 0 to disable
   
   private $log_activity=false;//log entry for all activity (not recommended)
   
   //-- Reporting variables--//
   public $filter_user=false; //we decided to filter this user (they met our criteria)
   
   public $found_stuff=false; //did we find anything? (any item below)
   public $found_sleeper=false;//did we determine this user to be a sleeper agent?
   public $found_links=false; //we found links
   public $found_words=false;//we found bad words
   public $found_minwords=false;//we found too few words
   public $found_unicode=false;//found unicode
   public $found_profile=false;//profile-not-allowed check was positive (for log)
   public $error=array(); //holds text error array

   /*
   *   Log filter actions
   *
   */
   function link_add_log($type,$no_link_message){
      global $user;
      $l='Checked '.$type.' for \''.$user->data['username'].'\' ';
      if($this->found_stuff){
         $l.='DETECTED: ';
         if($this->found_sleeper)$l.='sleeper agent, ';
         if($this->found_links)$l.='links, ';
         if($this->found_words)$l.='bad words, ';
         if($this->found_minwords)$l.='too few words, ';
         if($this->found_unicode)$l.='unicode, ';
         if($this->found_profile)$l.='profile disabled, ';
         $l.='ERRORS: '.implode(', ', $this->error);
         if(!empty($no_link_message)) $l.=' CONTENTS: '.$no_link_message; 
      }else{
         $l.='OK';
      }
      add_log('user', 'LOG_SPAM_HAMMER', $l);
   }
   
/**
*  Test if user can have a profile yet
*  returns false if they do NOT need to be filtered
*/
function link_filter_test_profile(){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if(!$this->link_filter_sleeper_check()){ //a sleeper agent already gets its own error, so only add the profile error for everyone else
         
      //If there isn't a phpbb3 no_profile message add one
      if (empty($user->lang['NO_PROFILE_FOR_YOU'])){
         $user->lang['NO_PROFILE_FOR_YOU']='Antispam: You can\'t have a profile yet. You need to post a few times first.';
      }
      
      $this->error[]=$user->lang['NO_PROFILE_FOR_YOU'].' '.$this->link_filter_add_help_link();
   }
   
   $this->found_stuff=$this->found_profile=true;
   
   if($this->log_activity) $this->link_add_log('PROFILE','');
   
   return true;
}   

/**
*  Test a submitted signature for links and words
*  Returns true if bad things detected
*/
function link_filter_test_signature($signature){
   global $user;

   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if($this->link_filter_sleeper_check()){
      if($this->log_activity) $this->link_add_log('SIGNATURE',$signature);
      return true; //if it is a sleeper agent just return error, don't do the check
   }
   
   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Antispam: You can\'t have off-site URLs in your sig until you post a few times. ';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Do you kiss your mom with that mouth? We don\'t want to read that! ';
   }
   
   //make a version of the post and subject
   //need the trailing space or it can hang forever in the while loop if only using a local URL
   $res=$this->link_filter_test(' '.trim($signature).' ');
   
   if($this->log_activity) $this->link_add_log('SIGNATURE',$signature);
   
   return $res;
   
}

/**
*  Test a submitted PM for links and words
*  Returns true if bad things detected
*/
function link_filter_test_pm($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //if($this->found_sleeper) return true; //if it is a sleeper agent we still allow PMs to contact an admin, but we still filter them

   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your message looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your message looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   $res=$this->link_filter_test(' '.trim($message.' '.$subject).' ');
   
   if($this->log_activity) $this->link_add_log('PM',$subject.' '.$message);
   
   return $res;   
   
}

/**
*  Test a submitted post for links and words
*  Returns true if bad things detected
*/
function link_filter_test_post($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //need to do this here or we don't get the site's unique help URL
   //better to do it here and then again below than to do it for every user before the check
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if( (($user->data['user_posts']==0)||($user->data['user_type']==USER_IGNORE)||($user->data['user_id']==ANONYMOUS))&& (strlen($message)<$this->first_post_length)){//first post, check length
      
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['NO_LINK_TOO_SHORT'])){
         $user->lang['NO_LINK_TOO_SHORT']='Antispam: Sorry, your first post needs to be just a little longer.';
      }
      $this->error[]=$user->lang['NO_LINK_TOO_SHORT'].' '.$this->link_filter_add_help_link();
      $this->found_stuff=$this->found_minwords=true; //flag the error
      
      if($this->log_activity) $this->link_add_log('POST',$subject.' '.$message);
      
      return true;
   }
   
   if($this->link_filter_sleeper_check()){
      if($this->log_activity) $this->link_add_log('POST',$subject.' '.$message);
      return true; //if it is a sleeper agent just return error, don't do the check
   }

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your post looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your post looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   $res=$this->link_filter_test(' '.trim($message.' '.$subject).' ');
   
   if($this->log_activity) $this->link_add_log('POST',$subject.''.$message);
   
   return $res;   
}

/**
*    Do we need to check this user?
*
*/
function link_filter_check()
{
   global $user, $config;

   if($this->load_values_from_db){ //use MOD setting from database
      $this->minimum_days=$config['links_after_num_days'];
      $this->minimum_posts=$config['links_after_num_posts'];
      if($config['links_disable_sleepers']=='1'){
         $this->sleeper_check=true;
      }else{
         $this->sleeper_check=false;
      }
   }

   //check if the user meets filter criteria
   $this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))));

   //this MIGHT be used in a bigger filter to only apply to new users.
   //$this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_new']==1 &&(($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))))));
   
   //If you're not special, we filter you
   return $this->filter_user;
}

/**
*    check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
*
*/
function link_filter_sleeper_check(){
   global $user;
   
   //check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
   if(($this->sleeper_check) && ($user->data['user_id']!=ANONYMOUS) && ($user->data['user_posts']==0) && ($user->data['user_regdate']<=((time())-(86400*$this->minimum_days)))){
      $this->found_sleeper=$this->found_stuff=true;
      
      //If there isn't a phpbb3 sleeper agent message add one
      if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
         $user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
         //could delete the user automatically here
      }
      $this->error[]=$user->lang['NO_SLEEPER_SPAM_FOR_YOU'].' '.$this->link_filter_add_help_link();
      return true;
   }
   return false;
}

/**
*    Add a help link if it exists
*    Returns an empty string when no help URL is configured
*/
function link_filter_add_help_link(){
   global $user; //needed for the language strings below

   if(!empty($this->help_url)){
   
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['HELP_LINK'])){
         $user->lang['HELP_LINK']='Click for help';
      }
      
      return '<a href="'.$this->help_url.'">'.$user->lang['HELP_LINK'].'</a>.';
   }
   return '';
}

/**
*    if database is enabled this will explode the filter lists into the arrays
*  We do it here so it only happens when we need to filter the user
*/
function link_filter_load_list_from_db(){
   global $config;
   //use MOD setting from database
   $this->whitelist_urls=explode(",", $config['links_allow_always']);
   $this->no_link_strings=explode(",", $config['links_link_strings']);
   $this->no_word_strings=explode(",", $config['links_word_strings']);      
   $this->unicode_filter=$config['links_unicode_filter'];
   $this->minimum_nonunicode_text=(float)$config['links_nonunicode_percent'];
   $this->help_url = $config['links_help_url'];
   $this->first_post_length=$config['links_first_post_words'];
   if($config['links_log_activity']=='1'){
      $this->log_activity=true;
   }else{
      $this->log_activity=false;      
   }
}

/**
*    Search the text for forbidden URLs and text.
*   Add an error to the local error array if found
*   returns true for bad stuff, false for no flags found
*
*/
function link_filter_test($no_link_message){
   global $user, $config;

   //filter the looozers
   //remove line feeds and stuff (double quotes are required: '\n' in single quotes is a literal backslash and n)
   $no_link_message=str_replace("\n", ' ',$no_link_message);
   $no_link_message=str_replace("\r", ' ', $no_link_message);
   //collapse runs of spaces into single spaces
   while (strpos($no_link_message, '  ')!==false){
      $no_link_message=str_replace('  ', ' ', $no_link_message);
   }
   
   //remove any own-site references, these are ok
   //first change http://mysite.com to mysite.com so we only have to look once below
   $no_link_message=str_replace($config['server_protocol'].$config['server_name'], $config['server_name'], $no_link_message);

   //whitelist other common domains too
   //we do this by replacing them with our own domain so we only have to run the search once below
   for ($x=0;$x<sizeof($this->whitelist_urls);$x++){
      if(stripos($no_link_message, $this->whitelist_urls[$x])){
         $no_link_message=str_ireplace($this->whitelist_urls[$x], $config['server_name'], $no_link_message);   
      }
   }
   
   //look at all instances of mysite.com
   while ($ok_start=stripos($no_link_message, $config['server_name'])){ //start of mysite.com
      $ok_end=strpos($no_link_message, '[', $ok_start); //find next [ (bbcode?)
      if ($ok_end===false){ //if not bbcode
         $ok_end=strpos($no_link_message, ' ', $ok_start); //end is position of next space
      }
      if ($ok_end===false){ //no [ or space after the URL
         $ok_end=strlen($no_link_message); //treat the rest of the text as the URL so the loop cannot hang
      }
      $no_link_message=substr($no_link_message, 0, $ok_start).substr($no_link_message, $ok_end);//remove own URL
   }
      

   //search for each link element, throw an error when found
   for ($x=0;$x<sizeof($this->no_link_strings);$x++){
      if (stripos($no_link_message, $this->no_link_strings[$x])){
         
         $this->found_links=$this->found_stuff=true;
                  
         $this->error[]=$user->lang['NO_LINK_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         //$x=sizeof($no_link_strings);
         break;//no reason to go further
      }
   }
   
   //search each word, throw an error when found
   for ($x=0;$x<sizeof($this->no_word_strings);$x++){
      if (stripos($no_link_message, $this->no_word_strings[$x])){
      
         $this->found_words=$this->found_stuff=true;
      
         if($this->show_trigger_word){//show the cause so the user isn't stumped
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$this->no_word_strings[$x].') '.$this->link_filter_add_help_link();
         }else{//don't show the cause
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         }
         
         //$x=sizeof($no_link_strings);
         break;//no reason to search further
      }
   }
   
   if($this->found_stuff==true) return true; //don't continue checking below
   
   //make a smaller subset for preg_match to save time and cycles
   if(strlen($no_link_message) > 512){
      $no_link_message = substr($no_link_message, 0, 512);
   }
   
   //Check for unicode characters that don't belong in the language of this forum   
   if(!empty($this->unicode_filter)){
      if(preg_match($this->unicode_filter, $no_link_message, $m)==1){ //test for unicode character ranged defined by user
      
         $this->found_unicode=$this->found_stuff=true;

         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         return true; //don't continue checking below
      }
   }
   
   //check how much of the text is plain single-byte (ASCII) characters
   //see http://www.mawhorter.net/web-development/easily-detecting-if-a-block-of-text-is-written-in-english-non-unicode-languages
   if((!empty($this->minimum_nonunicode_text))){

      //find the length in characters (unicode mode)
      $ulen = preg_match_all("#.#u", $no_link_message, $m);
      //find the length in bytes (no unicode mode)
      $len  = preg_match_all('#.#', $no_link_message, $m);
      
      //determine if the character/byte ratio is high enough
      if($len>0 && ($ulen/$len)<(float)$this->minimum_nonunicode_text){
         $this->found_unicode=$this->found_stuff=true;
         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$ulen.'/'.$len.') '.$this->link_filter_add_help_link();
      }   
   }

   return $this->found_stuff;

}

/**
*    Zombie registration cleanup: deletes old accounts that have never posted
*
*/
function link_filter_purge_zombies(){
   global $db, $config;
   
   if($this->load_values_from_db){ //use MOD setting from database
      if($config['links_delete_zombies']!='1'){
         if($config['links_log_denials']=='1'){
            add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'spam hammer zombie cleanup: Disabled!');
         }
         return; //honor ACP setting
      }
      
      $this->minimum_days=$config['links_delete_zombies_days']; //$config['links_after_num_days'];
   }
   
   if($this->minimum_days<1){
      if($config['links_log_denials']=='1'){
         add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'spam hammer zombie cleanup: ERROR - days set to 0!');
      }
      return; //don't delete if there is no days setting
   }

   // Get bot ids
   $sql = 'SELECT user_id
      FROM ' . BOTS_TABLE;
   $result = $db->sql_query($sql);

   $bot_ids = array();
   while ($row = $db->sql_fetchrow($result))
   {
      $bot_ids[] = $row['user_id'];
   }
   $db->sql_freeresult($result);

   // Select the group of users to delete
   $sql = 'SELECT user_id, username
      FROM ' . USERS_TABLE . '
      WHERE user_id <> ' . ANONYMOUS . '
         AND user_type <> ' . USER_FOUNDER .'
         AND user_regdate < ' . gmmktime(0, 0, 0, date("m"), (date("d")-$this->minimum_days), date("Y")).'
         AND user_posts = 0';
   $result = $db->sql_query($sql);

   $usernames = array();

   //this would be simpler with a single SQL statement, but recycling the phpBB prune function makes it more robust
   while ($row = $db->sql_fetchrow($result))
   {

      // Do not prune bots.
      if (!in_array($row['user_id'], $bot_ids))
      {
         user_delete('remove', $row['user_id']);//delete the user and all posts (there should be none though)
         //user_delete('retain', $row['user_id'],$row['username']); //delete users but not posts, safer just in case
         $usernames[]=$row['username'];//keep the list for the log
      }
   }
   $db->sql_freeresult($result);
   
   //add log message
   add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'spam hammer zombie cleanup:'.implode(', ', $usernames));

}

}//class

?>
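
For anyone who wants to try the "who gets filtered" rule from link_filter_check() outside phpBB, here is a minimal sketch. It is not part of the MOD: the function name needs_filtering, the plain array standing in for $user->data, and the $is_guest flag (which replaces the USER_IGNORE and ANONYMOUS comparisons) are assumptions made for illustration.

```php
<?php
// Hypothetical stand-alone version of the rule in link_filter_check().
// $u mimics a few keys of phpBB's $user->data; $is_guest replaces the
// USER_IGNORE / ANONYMOUS comparisons so no phpBB constants are needed.
function needs_filtering(array $u, $is_guest, $minimum_posts, $minimum_days, $now)
{
    if (!empty($u['session_admin'])) {
        return false; // logged-in admins are never filtered
    }
    if ($is_guest) {
        return true; // guests and ignored accounts are always filtered
    }
    // same as user_posts <= minimum_posts-1 in the class
    $too_few_posts = $u['user_posts'] < $minimum_posts;
    // registered less than minimum_days ago
    $too_young = $u['user_regdate'] > $now - 86400 * $minimum_days;
    return $too_few_posts || $too_young;
}

$now = 1298374800; // fixed timestamp so the example is repeatable
$newbie  = array('session_admin' => 0, 'user_posts' => 0,  'user_regdate' => $now - 3600);
$regular = array('session_admin' => 0, 'user_posts' => 25, 'user_regdate' => $now - 90 * 86400);

var_dump(needs_filtering($newbie, false, 1, 1, $now));  // bool(true)
var_dump(needs_filtering($regular, false, 1, 1, $now)); // bool(false)
```

Note that the two conditions are joined with OR, so a user has to pass both the post count and the account age before the filter lets go.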
Please do not PM or mail with questions. Ask in the forum where everyone can share the answer.
dangerousprototypes
Registered User
 
Posts: 91
Joined: Fri Feb 11, 2011 5:53 am

Re: [DEV] phpBB spam hammer

Postby dangerousprototypes » Wed Feb 23, 2011 3:17 pm

The logging really turned me on to the patterns the bots follow. Register. Hit profile twice. Hit signature twice. Try to post all over. There is a lot of it too, more than I ever imagined. Much of it must have been kept out by our previous spambot prevention. No wonder our hosting bill is so high.

This is a fun "EXTREME MODE" version of the class :) I don't recommend it for actual use!

I still can't get the profile to disappear, but I did add a function that deletes any 0 post user who tries to submit a profile.

I changed the profile warning to say that if you submit it you will be disabled, and if you hit submit the account is deleted and an error is displayed.

It adds a log entry in the admin log when a user is deleted. It is only 0 post users so those deleted can get over it and try again without testing the profile :) The logs are great - register, profile, deleted ;)

I don't recommend you actually use it, it is just for my site, but it IS super effective at stopping spammers before they hit the filter a ton of times.
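
To make the EXTREME mode rule concrete, here is a minimal sketch of the decision only, outside phpBB. The function profile_action and its return strings are made up for illustration; in the class the same test sits inside link_filter_test_profile(), and the deletion itself is done with phpBB's user_delete().

```php
<?php
// Hypothetical helper: what happens when a filtered user touches the profile page.
// $submitted is true when the profile form was actually submitted,
// $extreme mirrors the private $extreme switch in the class.
function profile_action($user_posts, $submitted, $extreme)
{
    if ($submitted && $extreme && $user_posts == 0) {
        return 'delete'; // 0 post user pressed submit: the account is removed
    }
    return 'warn'; // everyone else only sees the warning
}

echo profile_action(0, false, true), "\n"; // warn   (just viewing the form)
echo profile_action(0, true, true), "\n";  // delete (0 posts and submitted)
echo profile_action(3, true, true), "\n";  // warn   (has posts, never deleted)
echo profile_action(0, true, false), "\n"; // warn   (extreme mode switched off)
```

Only the combination of all three conditions deletes anything, which is why a deleted user can simply register again and post first.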

Find in ucp_profile.php:
Code: Select all
if($f->link_filter_test_profile())//run the check

Replace with
Code: Select all
if($f->link_filter_test_profile($submit))//run the check


This adds $submit to the function call.

Class

Code: Select all
<?php
/**
*
* functions_link_filter.php version r744 - extreme
* @license http://opensource.org/licenses/gpl-license.php GNU Public License
* Modified by Ian Lesnet (http://dangerousprototypes.com)
* Documentation and install info here:
*  http://dangerousprototypes.com/docs/PhpBB3_MOD:_Disable_links_for_new_users
*
*/

/**
* @ignore
*/
if (!defined('IN_PHPBB'))
{
   exit;
}

class link_filter{

   //--settings variables--//
   // An array of no-nos. Add whatever you need...
   private $no_link_strings=array();//('http://', 'www.', '.com', '.net', '.org', '.uk', '.ly', '.me', '.ru', '.biz', '.info', 'dot com', 'dot net', 'dot org', 'dotcom', 'dotnet', 'dotorg', '_com', '_org', '_co_uk', '_ru', 'dot ru');

   //a secondary spam words filter
   private $no_word_strings=array();
   
   private $show_trigger_word=true; //show the user the word that triggered the error
   
   //URLs to ALLOW always, in addition to own-site URLs
   private $whitelist_urls=array('http://code.google.com', 'http://sourceforge.net',);
   
   //a regex to filter a unicode character range.
   //Will throw an error if any of these characters are included in a post.
   //leave blank to disable this feature
   //for unicode blocks see: http://www.fileformat.info/info/unicode/block/index.htm
   //0400-04FF - Cyrillic (Russian)
   //0600-06FF - Arabic
   //3100-312F - Mandarin Bopomofo block
   //example of chaining together: '/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   private $unicode_filter='/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
   
   //what fraction of the post must be non-unicode characters (mostly useful for English-language sites)
   //leave blank to disable this feature
   private $minimum_nonunicode_text=0.95; //fraction of post that must be non-unicode (0.95 = 95%)
   
   //where can the user get more help about the filter? Could be a forum post or web page
   //if blank the link will NOT be added to the error messages
   //leave blank to disable this feature
   private $help_url = 'http://dangerousprototypes.com/forum/viewtopic.php?f=2&t=1846&p=17886#p17886';
   
   private $load_values_from_db=true; //enable to use with DB, or use the values below
   private $minimum_days=1; //minimum days as member to post links
   private $minimum_posts=1; //minimum posts for member to post links
   
   private $sleeper_check=true; //users with 0 posts who try to post after minimum_days will be prohibited
   
   private $first_post_length=73;//minimum characters for the first post. enter 0 to disable
   
   private $log_activity=false;//log entry for all activity (not recommended)
   
   private $extreme=true; //deletes accounts for profile abuse
   
   //-- Reporting variables--//
   public $filter_user=false; //we decided to filter this user (they met our criteria)
   
   public $found_stuff=false; //did we find anything? (any item below)
   public $found_sleeper=false;//did we determine this user to be a sleeper agent?
   public $found_links=false; //we found links
   public $found_words=false;//we found bad words
   public $found_minwords=false;//we found too few words
   public $found_unicode=false;//found unicode
   public $found_profile=false;//profile not allowed yet (for log)
   public $error=array(); //holds text error array

   /*
   *   Log filter actions
   *
   */
   function link_add_log($type,$no_link_message){
      global $user;
      $l='CHECKED '.$type.' of \''.$user->data['username'].'\'. ';
      if($this->found_stuff){
         $l.='DETECTED: ';
         if($this->found_sleeper)$l.='sleeper agent, ';
         if($this->found_links)$l.='links, ';
         if($this->found_words)$l.='bad words, ';
         if($this->found_minwords)$l.='too few words, ';
         if($this->found_unicode)$l.='unicode, ';
         if($this->found_profile)$l.='profile disabled, ';
         $l.='ERRORS: '.implode(', ', $this->error);
         if(!empty($no_link_message)) $l.=' CONTENTS: '.$no_link_message; 
      }else{
         $l.='OK';
      }
      add_log('user', 'LOG_SPAM_HAMMER', 'spam hammer: '.$l);
   }
   
/**
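*  Test if user can have a profile yet
*  returns false if they do NOT need to be filtered, true otherwise
*  $abuse: pass true when the profile form was submitted; in extreme mode this deletes 0 post users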
*/
function link_filter_test_profile($abuse=false){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if(!$this->link_filter_sleeper_check()){ //not a sleeper agent (the sleeper check adds its own error), so add the profile error
      $this->found_stuff=$this->found_profile=true;         
      //If there isn't a phpbb3 no_profile message add one
      if (empty($user->lang['NO_PROFILE_FOR_YOU'])){
         //$user->lang['NO_PROFILE_FOR_YOU']='Antispam: You can\'t have a profile yet. You need to post a few times first.';
         $user->lang['NO_PROFILE_FOR_YOU']='Antispam: DO NOT update the profile yet, you will be DELETED! You need to post a few times first.';
      }
      
      $this->error[]=$user->lang['NO_PROFILE_FOR_YOU'].' '.$this->link_filter_add_help_link();
   }
   if($abuse && $this->extreme && ($user->data['user_posts']==0)){
      add_log('admin', 'LOG_SPAM_HAMMER', 'spam hammer: deleted '.$user->data['username'].' for profile abuse.');
      $this->link_filter_delete_account($user->data['user_id']);
      //$this->error[]='Antispam: Sorry, this account was DELETED due to suspicious behavior.';
      trigger_error('Antispam: Sorry, this account was DELETED due to suspicious behavior. New users are NOT allowed to post a profile.');
   }else{
      if($this->log_activity) $this->link_add_log('PROFILE','');
   }   
   return true;
}   

function link_filter_delete_account($id){
      global $db;
      
      // Get bot ids
      $sql = 'SELECT user_id
         FROM ' . BOTS_TABLE;
      $result = $db->sql_query($sql);

      $bot_ids = array();
      while ($row = $db->sql_fetchrow($result))
      {
         $bot_ids[] = $row['user_id'];
      }
      $db->sql_freeresult($result);

      // Do not delete bots.
      if (!in_array($id, $bot_ids))
      {
         user_delete('remove', $id);//delete the user and all posts (there should be none though)
      }

}

/**
*  Test a submitted signature for links and words
*  Returns true if bad things detected
*/
function link_filter_test_signature($signature){
   global $user;

   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if($this->link_filter_sleeper_check()){
      if($this->log_activity) $this->link_add_log('SIGNATURE',$signature);
      return true; //if it is a sleeper agent just return the error, don't run the filter
   }
   
   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Antispam: You can\'t have off-site URLs in your sig until you post a few times. ';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Do you kiss your mom with that mouth? We don\'t want to read that! ';
   }
   
   //make a space-padded version of the signature
   //need the trailing space or it can hang forever in the while loop if only using a local URL
   $res=$this->link_filter_test(' '.trim($signature).' ');
   
   if($this->log_activity) $this->link_add_log('SIGNATURE',$signature);
   
   return $res;
   
}

/**
*  Test a submitted PM for links and words
*  Returns true if bad things detected
*/
function link_filter_test_pm($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //if($this->found_sleeper) return true; //if it is a sleeper agent we still allow PMs to contact an admin, but we still filter them

   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your message looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your message looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   $res=$this->link_filter_test(' '.trim($message.' '.$subject).' ');
   
   if($this->log_activity) $this->link_add_log('PM',$subject.''.$message);
   
   return $res;   
   
}

/**
*  Test a submitted post for links and words
*  Returns true if bad things detected
*/
function link_filter_test_post($message, $subject){
   global $user;
   
   //do we need to check this user?
   if(!$this->link_filter_check()) return false; //don't check, no error
   
   //load the lists here or we don't get the site's unique help URL
   //better to load them once here, only for users being checked, than for every user before the check
   if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured   
   
   if( (($user->data['user_posts']==0)||($user->data['user_type']==USER_IGNORE)||($user->data['user_id']==ANONYMOUS))&& (strlen($message)<$this->first_post_length)){//first post, check length
      
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['NO_LINK_TOO_SHORT'])){
         $user->lang['NO_LINK_TOO_SHORT']='Antispam: Sorry, your first post needs to be just a little longer.';
      }
      $this->error[]=$user->lang['NO_LINK_TOO_SHORT'].' '.$this->link_filter_add_help_link();
      $this->found_stuff=$this->found_minwords=true; //flag the error
      
      if($this->log_activity) $this->link_add_log('POST',$subject.''.$message);
      
      return true;
   }
   
   if($this->link_filter_sleeper_check()){
      if($this->log_activity) $this->link_add_log('POST',$subject.''.$message);
      return true; //if it is a sleeper agent just return the error, don't run the filter
   }

   //If there isn't a phpbb3 no_link message add one
   if (empty($user->lang['NO_LINK_FOR_YOU'])){
      $user->lang['NO_LINK_FOR_YOU']='Your post looks too spammy for a new user, please remove off-site URLs.';
   }
   //If there isn't a phpbb3 no_word message add one
   if (empty($user->lang['NO_WORD_FOR_YOU'])){
      $user->lang['NO_WORD_FOR_YOU']='Your post looks too spammy for a new user, please remove bad words or non-English text.';
   }
   
   //make a version of the post and subject
   $res=$this->link_filter_test(' '.trim($message.' '.$subject).' ');
   
   if($this->log_activity) $this->link_add_log('POST',$subject.''.$message);
   
   return $res;   
}

/**
*    Do we need to check this user?
*
*/
function link_filter_check()
{
   global $user, $config;

   if($this->load_values_from_db){ //use MOD setting from database
      $this->minimum_days=$config['links_after_num_days'];
      $this->minimum_posts=$config['links_after_num_posts'];
      if($config['links_disable_sleepers']=='1'){
         $this->sleeper_check=true;
      }else{
         $this->sleeper_check=false;
      }
   }

   //check if the user meets filter criteria
   $this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))));

   //this MIGHT be used in a bigger filter to only apply to new users.
   //$this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_new']==1 &&(($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))))));
   
   //If you're not special, we filter you
   return $this->filter_user;
}

/**
*    check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
*
*/
function link_filter_sleeper_check(){
   global $user;
   
   //check for spammers who come back later and  try to post (sleeper agents, or aged accounts)
   if(($this->sleeper_check) && ($user->data['user_id']!=ANONYMOUS) && ($user->data['user_posts']==0) && ($user->data['user_regdate']<=((time())-(86400*$this->minimum_days)))){
      $this->found_sleeper=$this->found_stuff=true;
      
      //If there isn't a phpbb3 sleeper agent message add one
      if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
         $user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
         //could delete the user automatically here
      }
      $this->error[]=$user->lang['NO_SLEEPER_SPAM_FOR_YOU'].' '.$this->link_filter_add_help_link();
      return true;
   }
   return false;
}

/**
*    Add a help link if it exists
*    Returns an empty string when no help URL is configured
*/
function link_filter_add_help_link(){
   global $user; //needed for the language strings below

   if(!empty($this->help_url)){
   
      //If there isn't a phpbb3 message add one
      if (empty($user->lang['HELP_LINK'])){
         $user->lang['HELP_LINK']='Click for help';
      }
      
      return '<a href="'.$this->help_url.'">'.$user->lang['HELP_LINK'].'</a>.';
   }
   return '';
}

/**
*    if database is enabled this will explode the filter lists into the arrays
*  We do it here so it only happens when we need to filter the user
*/
function link_filter_load_list_from_db(){
   global $config;
   //use MOD setting from database
   $this->whitelist_urls=explode(",", $config['links_allow_always']);
   $this->no_link_strings=explode(",", $config['links_link_strings']);
   $this->no_word_strings=explode(",", $config['links_word_strings']);      
   $this->unicode_filter=$config['links_unicode_filter'];
   $this->minimum_nonunicode_text=(float)$config['links_nonunicode_percent'];
   $this->help_url = $config['links_help_url'];
   $this->first_post_length=$config['links_first_post_words'];
   if($config['links_log_activity']=='1'){
      $this->log_activity=true;
   }else{
      $this->log_activity=false;      
   }
}

/**
*    Search the text for forbidden URLs and text.
*   Add an error to the local error array if found
*   returns true for bad stuff, false for no flags found
*
*/
function link_filter_test($no_link_message){
   global $user, $config;

   //filter the looozers
   //remove line feeds and stuff (double quotes are required: '\n' in single quotes is a literal backslash and n)
   $no_link_message=str_replace("\n", ' ',$no_link_message);
   $no_link_message=str_replace("\r", ' ', $no_link_message);
   //collapse runs of spaces into single spaces
   while (strpos($no_link_message, '  ')!==false){
      $no_link_message=str_replace('  ', ' ', $no_link_message);
   }
   
   //remove any own-site references, these are ok
   //first change http://mysite.com to mysite.com so we only have to look once below
   $no_link_message=str_replace($config['server_protocol'].$config['server_name'], $config['server_name'], $no_link_message);

   //whitelist other common domains too
   //we do this by replacing them with our own domain so we only have to run the search once below
   for ($x=0;$x<sizeof($this->whitelist_urls);$x++){
      if(stripos($no_link_message, $this->whitelist_urls[$x])){
         $no_link_message=str_ireplace($this->whitelist_urls[$x], $config['server_name'], $no_link_message);   
      }
   }
   
   //look at all instances of mysite.com
   while ($ok_start=stripos($no_link_message, $config['server_name'])){ //start of mysite.com
      $ok_end=strpos($no_link_message, '[', $ok_start); //find next [ (bbcode?)
      if ($ok_end===false){ //if not bbcode
         $ok_end=strpos($no_link_message, ' ', $ok_start); //end is position of next space
      }
      if ($ok_end===false){ //no [ or space after the URL
         $ok_end=strlen($no_link_message); //treat the rest of the text as the URL so the loop cannot hang
      }
      $no_link_message=substr($no_link_message, 0, $ok_start).substr($no_link_message, $ok_end);//remove own URL
   }
      

   //search for each link element, throw an error when found
   for ($x=0;$x<sizeof($this->no_link_strings);$x++){
      if (stripos($no_link_message, $this->no_link_strings[$x])){
         
         $this->found_links=$this->found_stuff=true;
                  
         $this->error[]=$user->lang['NO_LINK_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         //$x=sizeof($no_link_strings);
         break;//no reason to go further
      }
   }
   
   //search each word, throw an error when found
   for ($x=0;$x<sizeof($this->no_word_strings);$x++){
      if (stripos($no_link_message, $this->no_word_strings[$x])){
      
         $this->found_words=$this->found_stuff=true;
      
         if($this->show_trigger_word){//show the cause so the user isn't stumped
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$this->no_word_strings[$x].') '.$this->link_filter_add_help_link();
         }else{//don't show the cause
            $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         }
         
         //$x=sizeof($no_link_strings);
         break;//no reason to search further
      }
   }
   
   if($this->found_stuff==true) return true; //don't continue checking below
   
   //make a smaller subset for preg_match to save time and cycles
   if(strlen($no_link_message) > 512){
      $no_link_message = substr($no_link_message, 0, 512);
   }
   
   //Check for unicode characters that don't belong in the language of this forum   
   if(!empty($this->unicode_filter)){
      if(preg_match($this->unicode_filter, $no_link_message, $m)==1){ //test for unicode character ranged defined by user
      
         $this->found_unicode=$this->found_stuff=true;

         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
         
         return true; //don't continue checking below
      }
   }
   
   //check how much of the text is plain single-byte (ASCII) characters
   //see http://www.mawhorter.net/web-development/easily-detecting-if-a-block-of-text-is-written-in-english-non-unicode-languages
   if((!empty($this->minimum_nonunicode_text))){

      //find the length in characters (unicode mode)
      $ulen = preg_match_all("#.#u", $no_link_message, $m);
      //find the length in bytes (no unicode mode)
      $len  = preg_match_all('#.#', $no_link_message, $m);
      
      //determine if the character/byte ratio is high enough
      if($len>0 && ($ulen/$len)<(float)$this->minimum_nonunicode_text){
         $this->found_unicode=$this->found_stuff=true;
         $this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$ulen.'/'.$len.') '.$this->link_filter_add_help_link();
      }   
   }

   return $this->found_stuff;

}

/**
*    Zombie registration cleanup: deletes old accounts that have never posted
*
*/
function link_filter_purge_zombies(){
   global $db, $config;
   
   if($this->load_values_from_db){ //use MOD setting from database
      if($config['links_delete_zombies']!='1'){
         if($config['links_log_denials']=='1'){
            add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'spam hammer zombie cleanup: Disabled!');
         }
         return; //honor ACP setting
      }
      
      $this->minimum_days=$config['links_delete_zombies_days']; //$config['links_after_num_days'];
   }
   
   if($this->minimum_days<1){
      if($config['links_log_denials']=='1'){
         add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'spam hammer zombie cleanup: ERROR - days set to 0!');
      }
      return; //don't delete if there is no days setting
   }

   // Get bot ids
   $sql = 'SELECT user_id
      FROM ' . BOTS_TABLE;
   $result = $db->sql_query($sql);

   $bot_ids = array();
   while ($row = $db->sql_fetchrow($result))
   {
      $bot_ids[] = $row['user_id'];
   }
   $db->sql_freeresult($result);

   // Select the group of users to delete
   $sql = 'SELECT user_id, username
      FROM ' . USERS_TABLE . '
      WHERE user_id <> ' . ANONYMOUS . '
         AND user_type <> ' . USER_FOUNDER .'
         AND user_regdate < ' . gmmktime(0, 0, 0, date("m"), (date("d")-$this->minimum_days), date("Y")).'
         AND user_posts = 0';
   $result = $db->sql_query($sql);

   $usernames = array();

   //this would be simpler with a SQL statement, but recycling the phpBB prune function makes it more robust
   while ($row = $db->sql_fetchrow($result))
   {

      // Do not prune bots.
      if (!in_array($row['user_id'], $bot_ids))
      {
         user_delete('remove', $row['user_id']);//delete the user and all posts (there should be none though)
         //user_delete('retain', $row['user_id'],$row['username']); //delete users but not posts, safer just in case
         $usernames[]=$row['username'];//keep the list for the log
      }
   }
   $db->sql_freeresult($result);
   
   //add log message
   add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'spam hammer zombie cleanup:'.implode(', ', $usernames));

}

}//class

?>
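
For anyone who wants to try the non-unicode ratio check outside the class, here is a minimal standalone sketch. The function name char_byte_ratio() and the 0.9 threshold are made up for illustration and are not part of the MOD. It uses the same two preg_match_all() counts as the class, and treats text with nothing to count as a ratio of 1.0 so it never divides by zero.

```php
<?php
// Hypothetical helper, not part of the MOD: ratio of characters to bytes.
// Pure ASCII text gives 1.0; multi-byte UTF-8 text gives a lower value.
function char_byte_ratio($text)
{
    // Characters counted in UTF-8 mode (false if the text is not valid UTF-8)
    $ulen = preg_match_all('#.#u', $text, $m);
    // Bytes counted without UTF-8 mode
    $len = preg_match_all('#.#', $text, $m);

    if (!$len) {
        return 1.0; // nothing to measure (empty text or only newlines)
    }
    return (float) $ulen / $len;
}

$minimum = 0.9; // assumed threshold, for illustration only

// Plain ASCII: every character is one byte
var_dump(char_byte_ratio('hello world') >= $minimum);
// Three Cyrillic letters: 3 characters stored in 6 bytes
var_dump(char_byte_ratio("\xD0\xBF\xD1\x80\xD0\xB8") >= $minimum);
```

The ASCII string passes the threshold and the Cyrillic one does not. Note that text which is not valid UTF-8 makes the first count fail, so the ratio comes out as 0 and the text would be rejected.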
Last edited by dangerousprototypes on Wed Feb 23, 2011 4:02 pm, edited 2 times in total.
Please do not PM or mail with questions. Ask in the forum where everyone can share the answer.
dangerousprototypes
Registered User
 
Posts: 91
Joined: Fri Feb 11, 2011 5:53 am

Re: [DEV] phpBB spam hammer

Postby Philthy » Wed Feb 23, 2011 3:52 pm

I updated the install package as you posted DP.
http://www.skidvd.co.uk/files/disable_links_words_spam_0.0.2.zip

It includes the r744 class, but not the above extreme version.

Can you clarify your post a bit? What edits need to be done for the extreme version?
Cheers.
Go on ! it's not as steep as it looks.....
Philthy
Registered User
 
Posts: 210
Joined: Tue Dec 27, 2005 10:05 am
Location: Dawlish, Devon

Re: [DEV] phpBB spam hammer

Postby dangerousprototypes » Wed Feb 23, 2011 4:10 pm

Thanks!

I updated the SVN with R745. It has the extreme features, but they are disabled unless you set var $extreme=true; in the class. It also has some minor fixes to logging for account pruning. It should be a drop-in upgrade for anyone with 0.0.2 installed.

I updated the top post, and I also updated my previous post to show the change needed for extreme mode.

Find in ucp_profile.php:
Code: Select all
if($f->link_filter_test_profile())//run the check

Replace with
Code: Select all
if($f->link_filter_test_profile($submit))//run the check


In addition, you'll need to find in the class
Code: Select all
var $extreme=false;

change it to
Code: Select all
var $extreme=true;


This probably shouldn't be an ACP feature because it is kind of a honeypot and certainly not for everyone.
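
A rough sketch of how the two edits could fit together, assuming that extreme actions should only fire on a real form submit and only when the flag is on. That is an assumption for illustration, not something checked against R745. The class SpamHammerSketch, its method signature, and the returned strings are all hypothetical, and the sketch only names the action it would take instead of deleting anything.

```php
<?php
// Hypothetical stand-in for the MOD class; not the real code.
class SpamHammerSketch
{
    var $extreme = false; // set to true to enable extreme mode

    // $submit: true when the profile form was actually submitted
    // $user_posts: post count of the user being checked
    function test_profile($submit, $user_posts)
    {
        if ($user_posts > 0) {
            return 'allow'; // simplified: the real MOD also checks account age
        }
        if ($this->extreme && $submit) {
            return 'delete'; // the honeypot action, extreme mode only
        }
        return 'deny'; // normal mode just refuses the profile
    }
}

$f = new SpamHammerSketch();
echo $f->test_profile(true, 0), "\n"; // extreme is off

$f->extreme = true;
echo $f->test_profile(true, 0), "\n";  // extreme is on and the form was submitted
echo $f->test_profile(false, 0), "\n"; // extreme is on but the page was only viewed
```

In this sketch, just viewing the profile page never triggers the delete, which is one plausible reason to pass $submit into the check.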
Last edited by dangerousprototypes on Wed Feb 23, 2011 4:22 pm, edited 1 time in total.
