Thanks for the great feedback! Found some bugs, an updated class is at the end of this post, or you can
get the latest version from SVN.
---
1: The sql insertion entered the fields in the database, but not the values. I simply entered them manually from the admin panel.
In my experience they do get set (in the DB), but not picked up by phpBB. Then phpBB gives a SQL error on first update, but a refresh clears it up. I copied the SQL statements from the previous mod exactly, even though I wanted to combine them to a single statement

I don;t understand this at all, maybe there is an additional field that needs to be set, or maybe some of the text in the query is causing issues.
2: I found the 73 character minimum first post to be a little too restrictive, and edited it manually in the class. Maybe something that could be set in the admin panel?
That is a goal, but adding new ACP options is a huge pain.
I have only set three spam words to filter, the popular blue pill being one of the words. I'm hoping to harvest others from any posts that may make it through.
There is a harvested list of spam words in the documentation too.
DP, do you want an admin login to test with?
Not unless you have problems with the mod.
---
I tried this last time. When i would hit reply it would take me to your install instructions? And when using quickreply itll let users with 0 posts post links.
I can only think that the install went horribly wrong, or there is a conflicting plugin. The only place the install instructions link is located is in a comment at the top of the class. There is no code in the class that would do that, so non-code must be executing. I'll assume the quickreply issue is related until we find out what causes the other bug, quick reply is filtered without problem on my site (quick reply and regular reply are processed by the same phpBB code).
---
I added a few posts to it earlier, pulled from my own spam cemetery...plus that long list of domain extensions. One set of extensions, I didn't think about are those used by URL shortening services.
Interesting list. I noticed a few things in the honeypot:
*There are no actual links. Maybe viagrra dot com is unwanted, or spam.ly, but it's not a list of 100s of linked URLs ready for clicking and search engine indexing. I consider that a success

We'll never get around a determined spammer, but we can make it hard, make them obvious, and keep them from getting what they want (a link). The http:// filter kills any auto-linking as far as I know. I actually don't disallow dot com, etc, in the forum because I want new users to still be able to post a link if they need to, but it doesn't do spammers much good with the search engines.
*Similar with broken words (men like viagr a for fore x marke t trading). If a spammer goes to so much effort for a such a small gain, there's not much that can be done. I actually see this as a benefit because normal users can avoid an overly aggressive filter, but spammer still don't get any SEO or click benefit.
*Some words in the subject are not filtered (may not be enabled on the test site, such as f***ing).
Also, might there be a way to 'record' an entry to the user log whenever a post gets denied and the reason (or perhaps another, unique "Spam log")?
Adding to the log is pretty easy, but every entry is a round-trip to the database which eats up cycles and space. No idea about a new log.
*In the installation instructions for editing /includes/acp/acp_board.php part of the new code includes adding 'legend3' but there is already a 'legend3' just below the area to add the new text in question.... should the pre-existing legend3 be renamed to legend4? If so, this should be in the instructions.
The legend is just a break, I don;t think the name is important, but if it is duplicated I suppose everything should be updated.
*Disable sleeper agents looks to be enabled by default. Is this setting/function still active without the editing to cron.php? If true, and since it's unclear how to 're-enable' a disabled account, this should probably be disabled on a default install.
Disable sleepers only disables accounts that have not posted at all during the minimum time window. It can be disabled from the ACP, and currently there is no default option because the setting we enter don't 'take' for some reason
Edits to cron.php are only for the zombie registration purge.
*Are all the language strings stored in functions_link_filter.php? I'd like to edit some of these but it could be tiresome if they need ed-edited every time this file is updated.
For the user yes, but the ACP strings are in the ACP file. I think the correct way to customize the strings is to add them to your own language file. All the text does something like this:
Code: Select all
//If there isn't a phpbb3 sleeper agent message add one
if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
$user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
//could delete the user automatically here
}
So it only adds the default if one doesn't exist already. I don't know where to do this though, I don't really know anything about phpBB.
Another issue... I left the Filter Help Link empty in ACP but the antispam message in UCP/Profile is still pointing to
Thanks! I forgot the initialize the database variable before doing sleeper check and attaching some of the error messages. It is fixed in the attached class, just replace the exiting class file in /includes/.
---
It sounds like I should go ahead and add these options to the ACP:
*Add minimum first post size to ACP
*Enable/disable, configure zombie purge from ACP
---
Latest class version r737:
Code: Select all
<?php
/**
*
* functions_link_filter.php version r737
* @license http://opensource.org/licenses/gpl-license.php GNU Public License
* Modified by Ian Lesnet (http://dangerousprototypes.com)
* Documentation and install info here:
* http://dangerousprototypes.com/docs/PhpBB3_MOD:_Disable_links_for_new_users
*
*/
/**
* @ignore
*/
if (!defined('IN_PHPBB'))
{
exit;
}
class link_filter{
//--settings variables--//
// An array of no-nos. Add whatever you need...
private $no_link_strings=array();//('http://', 'www.', '.com', '.net', '.org', '.uk', '.ly', '.me', '.ru', '.biz', '.info', 'dot com', 'dot net', 'dot org', 'dotcom', 'dotnet', 'dotorg', '_com', '_org', '_co_uk', '_ru', 'dot ru');
//a secondary spam words filter
private $no_word_strings=array();
private $show_trigger_word=true; //show the user the word that triggered the error
//URLS to ALLOW always. In additon to own-site urls
private $whitelist_urls=array('http://code.google.com', 'http://sourceforge.net',);
//a regex to filter a unicode character range.
//Will throw an error if any of these characters are included in a post.
//leave blank to disable this feature
//for unicode blocks see: http://www.fileformat.info/info/unicode/block/index.htm
//0400-04FF - Cyrillic (Russian)
//0600-06FF - Arabic
//3100-312F - Mandarin Bopomofo block
//example of chaining together: '/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
private $unicode_filter='/([\x{0400}-\x{04FF}]|[\x{0600}-\x{06FF}])+/u';
//what percentage of the post must be non-unicode character (mosly useful for english language sites)
//leave blank to disable this feature
private $minimum_nonunicode_text=0.95; //percent of post that must be non-unicode
//where can the user get more help about the filter? Could be a forum post or web page
//if blank the link will NOT be added to the error messages
//leave blank to no use this features
private $help_url = 'http://dangerousprototypes.com/forum/viewtopic.php?f=2&t=1846&p=17886#p17886';
private $load_values_from_db=true; //enable to use with DB, or use the values below
private $minimum_days=1; //minimum days as member to post links
private $minimum_posts=1; //minimum posts for member to post links
private $sleeper_check=true; //users with 0 posts who try to post after minimum_days will be prohibited
private $first_post_length=73;//a minimum characters for the first post. enter 0 to disable
//-- Reporting variables--//
public $filter_user=false; //we decided to filter this user (they met our criteria)
public $found_stuff=false; //did we find anything? (any item below)
public $found_sleeper=false;//did we determine this user to be a sleeper agent?
public $found_links=false; //we found links
public $found_words=false;//we found bad words
public $found_unicode=false;
public $error=array(); //holds text error array
/**
* Test if user can have a profile yet
* returns false if they do NOT need to be filtered
*/
function link_filter_test_profile(){
global $user;
//do we need to check this user?
if(!$this->link_filter_check()) return false; //don't check, no error
if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured
if($this->link_filter_sleeper_check()) return true; //if it is a sleeper agent just return error, don;t do the check
//If there isn't a phpbb3 no_link message add one
if (empty($user->lang['NO_LINK_FOR_YOU'])){
$user->lang['NO_PROFILE_FOR_YOU']='Antispam: You can\'t have a profile yet. You need to post a few times first.';
}
$this->error[]=$user->lang['NO_PROFILE_FOR_YOU'].' '.$this->link_filter_add_help_link();
return true;
}
/**
* Test a submitted signature for links and words
* Returns true if bad things detected
*/
function link_filter_test_signature($signature){
global $user;
//do we need to check this user?
if(!$this->link_filter_check()) return false; //don't check, no error
if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured
if($this->link_filter_sleeper_check()) return true; //if it is a sleeper agent just return error, don;t do the check
//If there isn't a phpbb3 no_link message add one
if (empty($user->lang['NO_LINK_FOR_YOU'])){
$user->lang['NO_LINK_FOR_YOU']='Antispam: You can\'t have off-site URLs in your sig until you post a few times. ';
}
//If there isn't a phpbb3 no_word message add one
if (empty($user->lang['NO_WORD_FOR_YOU'])){
$user->lang['NO_WORD_FOR_YOU']='Do you kiss your mom with that mouth? We don\'t want to read that! ';
}
//make a version of the post and subject
//need the trailing space or it can hang forever in the while loop if only using a local URL
return $this->link_filter_test(' '.trim($signature).' ');
}
/**
* Test a submitted PM for links and words
* Returns true if bad things detected
*/
function link_filter_test_pm($message, $subject){
global $user;
//do we need to check this user?
if(!$this->link_filter_check()) return false; //don't check, no error
//if($this->found_sleeper) return true; //if it is a sleeper agent we still allow PMs to contact an admin, but we still filter them
if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured
//If there isn't a phpbb3 no_link message add one
if (empty($user->lang['NO_LINK_FOR_YOU'])){
$user->lang['NO_LINK_FOR_YOU']='Your message looks too spamy for a new user, please remove off-site URLs.';
}
//If there isn't a phpbb3 no_word message add one
if (empty($user->lang['NO_WORD_FOR_YOU'])){
$user->lang['NO_WORD_FOR_YOU']='Your message looks too spamy for a new user, please remove bad words or non-english text.';
}
//make a version of the post and subject
return $this->link_filter_test(' '.trim($message.' '.$subject).' ');
}
/**
* Test a submitted post for links and words
* Returns true if bad things detected
*/
function link_filter_test_post($message, $subject){
global $user;
//do we need to check this user?
if(!$this->link_filter_check()) return false; //don't check, no error
//need to do this here or we don;t get the site's unique help URL
//better to do it here and then again below that do it for every user before the check
if($this->load_values_from_db) $this->link_filter_load_list_from_db(); //load list values from DB if configured
if( (($user->data['user_posts']==0)||($user->data['user_type']==USER_IGNORE)||($user->data['user_id']==ANONYMOUS))&& (strlen($message)<$this->first_post_length)){//first post, check length
//If there isn't a phpbb3 message add one
if (empty($user->lang['NO_LINK_TOO_SHORT'])){
$user->lang['NO_LINK_TOO_SHORT']='Antispam: Sorry, your first post needs to be just a little longer.';
}
$this->error[]=$user->lang['NO_LINK_TOO_SHORT'].' '.$this->link_filter_add_help_link();
return true;
}
if($this->link_filter_sleeper_check()) return true; //if it is a sleeper agent just return error, don;t do the check
//If there isn't a phpbb3 no_link message add one
if (empty($user->lang['NO_LINK_FOR_YOU'])){
$user->lang['NO_LINK_FOR_YOU']='Your post looks too spamy for a new user, please remove off-site URLs.';
}
//If there isn't a phpbb3 no_word message add one
if (empty($user->lang['NO_WORD_FOR_YOU'])){
$user->lang['NO_WORD_FOR_YOU']='Your post looks too spamy for a new user, please remove bad words or non-english text.';
}
//make a version of the post and subject
return $this->link_filter_test(' '.trim($message.' '.$subject).' ');
}
/**
* Do we need to check this user?
*
*/
function link_filter_check()
{
global $user, $config;
if($this->load_values_from_db){ //use MOD setting from database
$this->minimum_days=$config['links_after_num_days'];
$this->minimum_posts=$config['links_after_num_posts'];
if($config['links_disable_sleepers']=='1'){
$this->sleeper_check=true;
}else{
$this->sleeper_check=false;
}
}
//check if the user meets filter criteria
$this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))));
//this MIGHT be used in a bigger filter to only apply to new users.
//$this->filter_user=((!$user->data['session_admin']) && (($user->data['user_type']==USER_IGNORE) || ($user->data['user_id']==ANONYMOUS) || ($user->data['user_new']==1 &&(($user->data['user_posts']<=($this->minimum_posts-1)) || ($user->data['user_regdate']>((time())-(86400*$this->minimum_days)))))));
//If you're not special, we filter you
return $this->filter_user;
}
/**
* check for spammers who come back later and try to post (sleeper agents, or aged accounts)
*
*/
function link_filter_sleeper_check(){
global $user;
//check for spammers who come back later and try to post (sleeper agents, or aged accounts)
if(($this->sleeper_check) && ($user->data['user_id']!=ANONYMOUS) && ($user->data['user_posts']==0) && ($user->data['user_regdate']<=((time())-(86400*$this->minimum_days)))){
$this->found_sleeper=$this->found_stuff=true;
//If there isn't a phpbb3 sleeper agent message add one
if (empty($user->lang['NO_SLEEPER_SPAM_FOR_YOU'])){
$user->lang['NO_SLEEPER_SPAM_FOR_YOU']='Antispam: account disabled, please contact an admin.';
//could delete the user automatically here
}
$this->error[]=$user->lang['NO_SLEEPER_SPAM_FOR_YOU'].' '.$this->link_filter_add_help_link();
return true;
}
return false;
}
/**
* Add a help link if it exists
*
*/
function link_filter_add_help_link(){
if(!empty($this->help_url)){
//If there isn't a phpbb3 message add one
if (empty($user->lang['HELP_LINK'])){
$user->lang['HELP_LINK']='Click for help';
}
return '<a href="'.$this->help_url.'">'.$user->lang['HELP_LINK'].'</a>.';
}
}
/**
* if database is enabled this will explore the filter lists into the arrays
* We move it here so this only happen when we need to filter the user
*/
function link_filter_load_list_from_db(){
global $config;
//use MOD setting from database
$this->whitelist_urls=explode(",", $config['links_allow_always']);
$this->no_link_strings=explode(",", $config['links_link_strings']);
$this->no_word_strings=explode(",", $config['links_word_strings']);
$this->whitelist_urls=explode(",", $config['links_allow_always']);
$this->unicode_filter=$config['links_unicode_filter'];
$this->minimum_nonunicode_text=(float)$config['links_nonunicode_percent'];
$this->help_url = $config['links_help_url'];
}
/**
* Search the text for forbidden URLs and text.
* Add an error to the local error array if found
* returns true for bad stuff, false for no flags found
*
*/
function link_filter_test($no_link_message){
global $user, $config;
//filter the looozers
//remove line feeds and stuff
$no_link_message=str_replace('\n', ' ',$no_link_message);
$no_link_message=str_replace('\r', ' ', $no_link_message);
//replace double spaces with single spaces (not sure why, white space?)
while (strpos($no_link_message, ' ')){
$no_link_message=str_replace(' ', ' ', $no_link_message);
}
//remove any own-site references, these are ok
//first change http://mysite.com to mysite.com so we only have to look once below
$no_link_message=str_replace($config['server_protocol'].$config['server_name'], $config['server_name'], $no_link_message);
//whitelist other common domains too
//we do this by relacing them with our own domain so we only have to run the search once below
for ($x=0;$x<sizeof($this->whitelist_urls);$x++){
if(stripos($no_link_message, $this->whitelist_urls[$x])){
$no_link_message=str_ireplace($this->whitelist_urls[$x], $config['server_name'], $no_link_message);
}
}
//look at all instances of mysite.com
while ($ok_start=stripos($no_link_message, $config['server_name'])){ //start of mysite.com
$ok_end=strpos($no_link_message, '[', $ok_start); //find next [ (bbcode?)
if (!$ok_end){ //if not bbcode
$ok_end=strpos($no_link_message, ' ', $ok_start); //end is position of next space
}
if ($ok_end){
$no_link_message=substr($no_link_message, 0, $ok_start).substr($no_link_message, $ok_end);//remove own URL
}
}
//search for each link element, throw an error when found
for ($x=0;$x<sizeof($this->no_link_strings);$x++){
if (stripos($no_link_message, $this->no_link_strings[$x])){
$this->found_links=$this->found_stuff=true;
$this->error[]=$user->lang['NO_LINK_FOR_YOU'].' '.$this->link_filter_add_help_link();
//$x=sizeof($no_link_strings);
break;//no reason to go further
}
}
//search each word, throw an error when found
for ($x=0;$x<sizeof($this->no_word_strings);$x++){
if (stripos($no_link_message, $this->no_word_strings[$x])){
$this->found_words=$this->found_stuff=true;
if($this->show_trigger_word){//show the cause so the user isn't stumped
$this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$this->no_word_strings[$x].') '.$this->link_filter_add_help_link();
}else{//don't show the cause
$this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
}
//$x=sizeof($no_link_strings);
break;//no reason to search further
}
}
if($this->found_stuff==true) return true; //don't continue checking below
//make a smaller subset for preg_match to save time and cycles
if(strlen($no_link_message) > 512){
$no_link_message = substr($no_link_message, 0, 512);
}
//Check for unicode characters that don't belong in the language of this forum
if(!empty($this->unicode_filter)){
if(preg_match($this->unicode_filter, $no_link_message, $m)==1){ //test for unicode character ranged defined by user
$this->found_unicode=$this->found_stuff=true;
$this->error[]=$user->lang['NO_WORD_FOR_YOU'].' '.$this->link_filter_add_help_link();
return true; //don't continue checking below
}
}
//check the percentage of the text that is NOT unicode
//see http://www.mawhorter.net/web-development/easily-detecting-if-a-block-of-text-is-written-in-english-non-unicode-languages
if((!empty($this->minimum_nonunicode_text))){
//found the length in unicode mode
$ulen = preg_match_all("#.#u", $no_link_message, $m);
//find length without unicode
$len = preg_match_all('#.#', $no_link_message, $m);
//determine if % of non-unicode is enough
if(($ulen/$len)<(float)$this->minimum_nonunicode_text){
$this->found_unicode=$this->found_stuff=true;
$this->error[]=$user->lang['NO_WORD_FOR_YOU'].' ('.$ulen.'/'.$len.') '.$this->link_filter_add_help_link();
}
}
return $this->found_stuff;
}
function link_filter_purge_zombies(){
global $db, $config;
if($this->load_values_from_db){ //use MOD setting from database
$this->minimum_days=$config['links_after_num_days'];
}
if($this->minimum_days<1) return; //don't delete if there is no days setting
// Get bot ids
$sql = 'SELECT user_id
FROM ' . BOTS_TABLE;
$result = $db->sql_query($sql);
$bot_ids = array();
while ($row = $db->sql_fetchrow($result))
{
$bot_ids[] = $row['user_id'];
}
$db->sql_freeresult($result);
// Select the group of users to delete
$sql = 'SELECT user_id, username
FROM ' . USERS_TABLE . '
WHERE user_id <> ' . ANONYMOUS . '
AND user_type <> ' . USER_FOUNDER .'
AND user_regdate < ' . gmmktime(0, 0, 0, date("m"), (date("d")-$this->minimum_days), date("Y")).'
AND user_posts = 0';
$result = $db->sql_query($sql);
$usernames = array();
//this would be simpler with a SQL statment, but recycling the ppBB prune functon makes it more robust
while ($row = $db->sql_fetchrow($result))
{
// Do not prune bots and the user currently pruning.
if (!in_array($row['user_id'], $bot_ids))
{
user_delete('remove', $row['user_id']);//delete the user and all posts (there should be none though)
//user_delete('retain', $row['user_id'],$row['username']); //delete users but not posts, safer just in case
$usernames[]=$row['username'];//keep the list for the log
}
}
$db->sql_freeresult($result);
//add log message
add_log('admin', 'LOG_PRUNE_USER_DEL_DEL', 'Zombie registration cleanup by Disable links for new users MOD:'.implode(', ', $usernames));
}
}//class
?>
Please do not PM or mail with questions. Ask in the forum where everyone can share the answer.