[ABD] Full-text search Mod (MySQL only) [/]

A place for MOD Authors to post and receive feedback on MODs still in development. No MODs within this forum should be used within a live environment! No new topics are allowed in this forum.
Forum rules
READ: phpBB.com Board-Wide Rules and Regulations

IMPORTANT: MOD Development Forum rules

On February 1, 2009 this forum will be set to read only as part of retiring of phpBB2.
Gamejag
Registered User
Posts: 16
Joined: Mon Mar 31, 2003 6:46 pm

[ABD] Full-text search Mod (MySQL only) [/]

Post by Gamejag »

First off, I know that phpBB already does a full-text search, so why would you want to use this? Well, my forum is set up mainly for Fan Fiction, and I have 175,000 posts in it. Given that the posts are generally longish, there are a lot of words to be added to the wordsearch and wordmatch tables. I had almost 16 million rows in the wordmatch table, and although the site functioned well everywhere else, posts took quite a while to be entered and at times longer ones would actually time-out. We were getting a lot of double-posts when users didn't realize that their post got inserted once they got the time-out screen.

Another consideration was space. The wordmatch table and index took up 380MB by itself. The posts_text table was about 270MB. Considering I pay for 1GB of database space a month, this was becoming a real concern for me. The current posts_text table is now about 400MB in size (including the index), and there is no wordsearch or wordmatch table to worry about any more.

The end results are that posting is just about as fast as it was when I initially installed phpBB, even on larger posts. Searching is quick and the change is seemless to the end-user. I use less disk space, and someday I will prolly use the scoring capabilities and some of the more advanced DB-level functions to enhance my searches.

So.... Here's my Mod in ALPHA version. Note that it's not complete, and that I made a mistake in the initial way I indexed the table. I will probably leave my mistake in and remove the "Don't search the subject" option from my forums. That's not necessary, but this MOD will need more tweaking before it's ready for prime-time. Further note that this mod is ONLY for MySQL. If your DB supports full-text indexing, then you can probably modify what I have done pretty easily.

Anyway, this is only a beginning. Use this at your own risk, and MAKE SURE you back up your tables. Also, if you have a good sized site, then you're looking at a very long time to make the index (mine took about six hours).

Comments welcome, improvements especially so. Although this is not entirely done, I think it's close enough to turn it loose on the community to see what good ideas other people have. :)

Code: Select all

################################################################# 
## Mod Title: Full-Text Index searching (in progress)
## Mod Version: 0.1.0 ALPHA
## Author: Benjamin Ellington < ben@cyberjag.com > - http://www.gamejag.com
##
## Description: Replaces phpBB's search methods with one that supports full-text indices
## 
## Installation Level: advanced
## Installation Time: 30-60 Minutes, plus re-indexing time
## Files To Edit:
##
##   \modcp.php
##   \search.php
##   \includes\constants.php
##   \includes\functions_search.php
##   \includes\functions_post.php
##   \includes\prune.php
##
################################################################# 
## Before Adding This MOD To Your Forum, You Should Back Up All Files 
## Related To This MOD and your database.  
################################################################# 
##
## Installation instructions:
##
## 1 - One significant SQL database change is required, and two are optional. Open 
##	   PhpMyAdmin or another SQL command prompt and execute the provided instructions.
##	   Make sure the phpbb_ is replaced with your database name if different.
##
## 2 - If you have already installed other mods that affect the same fIles, you need
##	   to take care with the changes made.  Make sure you don't disable them in the 
##	   process of installing this one.
##
## 3 - Because of the initial requirements in creating the index, I highly
##     recommend you create the new indexes first, and once they are successfully
##     created then proceed with the rest of the MOD.  Large boards can take several
##     hours to index.
##
#################################################################
##
## Revision history since 0.9.3
##
## 0.1.0  Alpha version released
##
#################################################################
#
#-----[ SQL ]------------------------------------------
#

ALTER TABLE phpbb_posts_text ADD FULLTEXT (post_subject,post_text);

#  (This will take FOREVER!  Also, this is marked for change as it's not the best 
#  way to do it.  You should make separate indices for the text field and the subject
#  field and rewrite the search function accordingly.)


# 
#-----[ OPEN ]------------------------------------------ 
# 

     modcp.php

# 
#-----[ FIND NEAR LINE 337 ]------------------------------------------ 
# 

     remove_search_post($post_id_sql);

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     remove_search_post($post_id_sql);

# 
#-----[ OPEN ]------------------------------------------ 
# 

     search.php

# 
#-----[ FIND NEAR LINE 253 ]------------------------------------------ 
# 

     $search_msg_only = ( !$search_fields ) ? "AND m.title_match = 0" : ( ( strstr($multibyte_charset, $lang['ENCODING']) ) ? '' : '' );

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     $search_msg_only = ( !$search_fields ) ? "AND m.title_match = 0" : ( ( strstr($multibyte_charset, $lang['ENCODING']) ) ? '' : '' );

# 
#-----[ FIND NEAR LINE 283 ]------------------------------------------ 
# 

     if ( !strstr($multibyte_charset, $lang['ENCODING']) )
     {
       $match_word = str_replace('*', '%', $split_search[$i]);
       $sql = "SELECT m.post_id 
       FROM " . SEARCH_WORD_TABLE . " w, " . SEARCH_MATCH_TABLE . " m 
       WHERE w.word_text LIKE '$match_word' 
         AND m.word_id = w.word_id 
         AND w.word_common <> 1 
         $search_msg_only";
     }

# 
#-----[ CHANGE TO ]------------------------------------------ 
# 

     if ( !strstr($multibyte_charset, $lang['ENCODING']) )
     {
       $match_word = str_replace('*', '%', $split_search[$i]);
//       if (!$search_fields)
//       {
//         $sql = "SELECT post_id
//         FROM " . POSTS_TEXT_TABLE . " 
//         WHERE MATCH (post_text) AGAINST ('$match_word')";
//       }
//       else
//       {
           $sql = "SELECT post_id
           FROM " . POSTS_TEXT_TABLE . " 
           WHERE MATCH (post_subject,post_text) AGAINST ('$match_word')";
//		}

#  (Notice the commented out lines.  That's because I indexed two fields together 
#  instead of making two different indexes--see the initial SQL statement in this mod)

# 
#-----[ OPEN ]------------------------------------------ 
# 

     includes/constants.php

# 
#-----[ FIND NEAR LINE 166 ]------------------------------------------ 
# 

     define('SEARCH_WORD_TABLE', $table_prefix.'search_wordlist');
     define('SEARCH_MATCH_TABLE', $table_prefix.'search_wordmatch');

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     define('SEARCH_WORD_TABLE', $table_prefix.'search_wordlist');
     define('SEARCH_MATCH_TABLE', $table_prefix.'search_wordmatch');

# 
#-----[ OPEN ]------------------------------------------ 
# 

     includes/functions_post.php

# 
#-----[ FIND NEAR LINE 285 ]------------------------------------------ 
# 


     add_search_words('single', $post_id, stripslashes($post_message), stripslashes($post_subject));

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     add_search_words('single', $post_id, stripslashes($post_message), stripslashes($post_subject));

# 
#-----[ FIND NEAR LINE 246 ]------------------------------------------ 
# 

     if ($mode == 'editpost')
     {
       remove_search_post($post_id);
     }

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     if ($mode == 'editpost')
     {
       remove_search_post($post_id);
     }

# 
#-----[ FIND NEAR LINE 528 ]------------------------------------------ 
# 

     remove_search_post($post_id);

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 
			
     remove_search_post($post_id);

# 
#-----[ OPEN ]------------------------------------------ 
# 

     includes/functions_search.php

# 
#-----[ FIND NEAR LINE 105 ]------------------------------------------ 
# 

     function add_search_words($mode, $post_id, $post_text, $post_title = '')

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     function add_search_words($mode, $post_id, $post_text, $post_title = '')
     {
	     global $db, $phpbb_root_path, $board_config, $lang;

	     $stopword_array = @file($phpbb_root_path . 'language/lang_' . $board_config['default_lang'] . "/search_stopwords.txt"); 
	     $synonym_array = @file($phpbb_root_path . 'language/lang_' . $board_config['default_lang'] . "/search_synonyms.txt"); 

	     $search_raw_words = array();
	     $search_raw_words['text'] = split_words(clean_words('post', $post_text, $stopword_array, $synonym_array));
	     $search_raw_words['title'] = split_words(clean_words('post', $post_title, $stopword_array, $synonym_array));

	     @set_time_limit(0);

	     $word = array();
	     $word_insert_sql = array();
	     while ( list($word_in, $search_matches) = @each($search_raw_words) )
	     {
		     $word_insert_sql[$word_in] = '';
		     if ( !empty($search_matches) )
		     {
			     for ($i = 0; $i < count($search_matches); $i++)
			     { 
				     $search_matches[$i] = trim($search_matches[$i]);

				     if( $search_matches[$i] != '' ) 
				     {
					     $word[] = $search_matches[$i];
					     if ( !strstr($word_insert_sql[$word_in], "'" . $search_matches[$i] . "'") )
					     {
						     $word_insert_sql[$word_in] .= ( $word_insert_sql[$word_in] != "" ) ? ", '" . $search_matches[$i] . "'" : "'" . $search_matches[$i] . "'";
					     }
				     } 
			     }
		     }
	     }

	     if ( count($word) )
	     {
		     sort($word);

		     $prev_word = '';
		     $word_text_sql = '';
		     $temp_word = array();
		     for($i = 0; $i < count($word); $i++)
		     {
			     if ( $word[$i] != $prev_word )
			     {
				     $temp_word[] = $word[$i];
				     $word_text_sql .= ( ( $word_text_sql != '' ) ? ', ' : '' ) . "'" . $word[$i] . "'";
			     }
			     $prev_word = $word[$i];
		     }
		     $word = $temp_word;

		     $check_words = array();
		     switch( SQL_LAYER )
		     {
			     case 'postgresql':
			     case 'msaccess':
			     case 'mssql-odbc':
			     case 'oracle':
			     case 'db2':
				     $sql = "SELECT word_id, word_text     
					     FROM " . SEARCH_WORD_TABLE . " 
					     WHERE word_text IN ($word_text_sql)";
				     if ( !($result = $db->sql_query($sql)) )
				     {
					     message_die(GENERAL_ERROR, 'Could not select words', '', __LINE__, __FILE__, $sql);
				     }

				     while ( $row = $db->sql_fetchrow($result) )
				     {
					     $check_words[$row['word_text']] = $row['word_id'];
				     }
				     break;
		     }

		     $value_sql = '';
		     $match_word = array();
		     for ($i = 0; $i < count($word); $i++)
		     { 
			     $new_match = true;
			     if ( isset($check_words[$word[$i]]) )
			     {
				     $new_match = false;
			     }

			     if ( $new_match )
			     {
				     switch( SQL_LAYER )
				     {
					     case 'mysql':
					     case 'mysql4':
						     $value_sql .= ( ( $value_sql != '' ) ? ', ' : '' ) . '(\'' . $word[$i] . '\', 0)';
						     break;
					     case 'mssql':
						     $value_sql .= ( ( $value_sql != '' ) ? ' UNION ALL ' : '' ) . "SELECT '" . $word[$i] . "', 0";
						     break;
					     default:
						     $sql = "INSERT INTO " . SEARCH_WORD_TABLE . " (word_text, word_common) 
							     VALUES ('" . $word[$i] . "', 0)"; 
						     if( !$db->sql_query($sql) )
						     {
							     message_die(GENERAL_ERROR, 'Could not insert new word', '', __LINE__, __FILE__, $sql);
						     }
						     break;
				     }
			     }
		     }

		     if ( $value_sql != '' )
		     {
			     switch ( SQL_LAYER )
			     {
				     case 'mysql':
				     case 'mysql4':
					     $sql = "INSERT IGNORE INTO " . SEARCH_WORD_TABLE . " (word_text, word_common) 
						     VALUES $value_sql"; 
					     break;
				     case 'mssql':
					     $sql = "INSERT INTO " . SEARCH_WORD_TABLE . " (word_text, word_common) 
						     $value_sql"; 
					     break;
			     }

			     if ( !$db->sql_query($sql) )
			     {
				     message_die(GENERAL_ERROR, 'Could not insert new word', '', __LINE__, __FILE__, $sql);
			     }
		     }
	     }

	     while( list($word_in, $match_sql) = @each($word_insert_sql) )
	     {
		     $title_match = ( $word_in == 'title' ) ? 1 : 0;

		     if ( $match_sql != '' )
		     {
			     $sql = "INSERT IGNORE INTO " . SEARCH_MATCH_TABLE . " (post_id, word_id, title_match) 
				     SELECT $post_id, word_id, $title_match  
					     FROM " . SEARCH_WORD_TABLE . " 
					     WHERE word_text IN ($match_sql)"; 
			     if ( !$db->sql_query($sql) )
			     {
				     message_die(GENERAL_ERROR, 'Could not insert new word matches', '', __LINE__, __FILE__, $sql);
			     }
		     }
	     }

	     if ($mode == 'single')
	     {
		     remove_common('single', 4/10, $word);
	     }

	     return;
     }

# 
#-----[ FIND NEAR LINE 265 ]------------------------------------------ 
# 

     function remove_common($mode, $fraction, $word_id_list = array())

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     function remove_common($mode, $fraction, $word_id_list = array())
     {
	     global $db;

	     $sql = "SELECT COUNT(post_id) AS total_posts 
		     FROM " . POSTS_TABLE;
	     if ( !($result = $db->sql_query($sql)) )
	     {
		     message_die(GENERAL_ERROR, 'Could not obtain post count', '', __LINE__, __FILE__, $sql);
	     }

	     $row = $db->sql_fetchrow($result);

	     if ( $row['total_posts'] >= 100 )
	     {
		     $common_threshold = floor($row['total_posts'] * $fraction);

		     if ( $mode == 'single' && count($word_id_list) )
		     {
			     $word_id_sql = '';
			     for($i = 0; $i < count($word_id_list); $i++)
			     {
				     $word_id_sql .= ( ( $word_id_sql != '' ) ? ', ' : '' ) . "'" . $word_id_list[$i] . "'";
			     }

			     $sql = "SELECT m.word_id 
				     FROM " . SEARCH_MATCH_TABLE . " m, " . SEARCH_WORD_TABLE . " w 
				     WHERE w.word_text IN ($word_id_sql)  
					     AND m.word_id = w.word_id 
				     GROUP BY m.word_id 
				     HAVING COUNT(m.word_id) > $common_threshold";
		     }
		     else 
		     {
			     $sql = "SELECT word_id 
				     FROM " . SEARCH_MATCH_TABLE . " 
				     GROUP BY word_id 
				     HAVING COUNT(word_id) > $common_threshold";
		     }

		     if ( !($result = $db->sql_query($sql)) )
		     {
			     message_die(GENERAL_ERROR, 'Could not obtain common word list', '', __LINE__, __FILE__, $sql);
		     }

		     $common_word_id = '';
		     while ( $row = $db->sql_fetchrow($result) )
		     {
			     $common_word_id .= ( ( $common_word_id != '' ) ? ', ' : '' ) . $row['word_id'];
		     }
		     $db->sql_freeresult($result);

		     if ( $common_word_id != '' )
		     {
			     $sql = "UPDATE " . SEARCH_WORD_TABLE . "
				     SET word_common = " . TRUE . " 
				     WHERE word_id IN ($common_word_id)";
			     if ( !$db->sql_query($sql) )
			     {
				     message_die(GENERAL_ERROR, 'Could not delete word list entry', '', __LINE__, __FILE__, $sql);
			     }

			     $sql = "DELETE FROM " . SEARCH_MATCH_TABLE . "  
				     WHERE word_id IN ($common_word_id)";
			     if ( !$db->sql_query($sql) )
			     {
				     message_die(GENERAL_ERROR, 'Could not delete word match entry', '', __LINE__, __FILE__, $sql);
			     }
		     }
	     }

	     return;
     }

# 
#-----[ FIND NEAR LINE 339 ]------------------------------------------ 
# 

     function remove_search_post($post_id_sql)

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     function remove_search_post($post_id_sql)
     {
	     global $db;

	     $words_removed = false;

	     switch ( SQL_LAYER )
	     {
		     case 'mysql':
		     case 'mysql4':
			     $sql = "SELECT word_id 
				     FROM " . SEARCH_MATCH_TABLE . " 
				     WHERE post_id IN ($post_id_sql) 
				     GROUP BY word_id";
			     if ( $result = $db->sql_query($sql) )
			     {
				     $word_id_sql = '';
				     while ( $row = $db->sql_fetchrow($result) )
				     {
					     $word_id_sql .= ( $word_id_sql != '' ) ? ', ' . $row['word_id'] : $row['word_id']; 
				     }

				     $sql = "SELECT word_id 
					     FROM " . SEARCH_MATCH_TABLE . " 
					     WHERE word_id IN ($word_id_sql) 
					     GROUP BY word_id 
					     HAVING COUNT(word_id) = 1";
				     if ( $result = $db->sql_query($sql) )
				     {
					     $word_id_sql = '';
					     while ( $row = $db->sql_fetchrow($result) )
					     {
						     $word_id_sql .= ( $word_id_sql != '' ) ? ', ' . $row['word_id'] : $row['word_id']; 
					     }

					     if ( $word_id_sql != '' )
					     {
						     $sql = "DELETE FROM " . SEARCH_WORD_TABLE . " 
							     WHERE word_id IN ($word_id_sql)";
						     if ( !$db->sql_query($sql) )
						     {
							     message_die(GENERAL_ERROR, 'Could not delete word list entry', '', __LINE__, __FILE__, $sql);
						     }

						     $words_removed = $db->sql_affectedrows();
					     }
				     }
			     }
			     break;

		     default:
			     $sql = "DELETE FROM " . SEARCH_WORD_TABLE . " 
				     WHERE word_id IN ( 
					     SELECT word_id 
					     FROM " . SEARCH_MATCH_TABLE . " 
					     WHERE word_id IN ( 
						     SELECT word_id 
						     FROM " . SEARCH_MATCH_TABLE . " 
						     WHERE post_id IN ($post_id_sql) 
						     GROUP BY word_id 
					     ) 
					     GROUP BY word_id 
					     HAVING COUNT(word_id) = 1
				     )"; 
			     if ( !$db->sql_query($sql) )
			     {
				     message_die(GENERAL_ERROR, 'Could not delete old words from word table', '', __LINE__, __FILE__, $sql);
			     }

			     $words_removed = $db->sql_affectedrows();

			     break;
	     }

	     $sql = "DELETE FROM " . SEARCH_MATCH_TABLE . "  
		     WHERE post_id IN ($post_id_sql)";
	     if ( !$db->sql_query($sql) )
	     {
		     message_die(GENERAL_ERROR, 'Error in deleting post', '', __LINE__, __FILE__, $sql);
	     }

	     return $words_removed;
     }


# 
#-----[ OPEN ]------------------------------------------ 
# 

     includes/prune.php

# 
#-----[ FIND NEAR LINE 111 ]------------------------------------------ 
# 

     $sql = "DELETE FROM " . SEARCH_MATCH_TABLE . "

# 
#-----[ DELETE OR COMMENT OUT (May not exist in newer installations) ]------------------------------------------ 
# 

     $sql = "DELETE FROM " . SEARCH_MATCH_TABLE . " 
       WHERE post_id IN ($sql_post)";
     if ( !$db->sql_query($sql) )
     {
       message_die(GENERAL_ERROR, 'Could not delete search matches', '', __LINE__, __FILE__, $sql);
     }

# 
#-----[ FIND NEAR LINE 113 ]------------------------------------------ 
# 

     remove_search_post($sql_post);

# 
#-----[ DELETE OR COMMENT OUT ]------------------------------------------ 
# 

     remove_search_post($sql_post);

#
#-----[ SQL ]------------------------------------------
#

DROP TABLE phpbb_posts_search_wordtext;
DROP TABLE phpbb_posts_search_wordmatch;

# 
#-----[ SAVE/CLOSE ALL FILES ]------------------------------------------ 
# 
# EoM

White Dust
Registered User
Posts: 396
Joined: Thu Mar 07, 2002 10:51 pm
Location: Sigel: The City of Doors

Post by White Dust »

Hmmz... nice little mod :)

But I have a little problem, Im getting an error when I hit the search:

Code: Select all

Parse error: parse error, unexpected T_ELSE in /www/htdocs/fantreal/forum/search.php on line 298
That is in the following bit of code:

Code: Select all

     if ( !strstr($multibyte_charset, $lang['ENCODING']) )
     {
       $match_word = str_replace('*', '%', $split_search[$i]);
//       if (!$search_fields)
//       {
//         $sql = "SELECT post_id
//         FROM " . POSTS_TEXT_TABLE . "
//         WHERE MATCH (post_text) AGAINST ('$match_word')";
//       }
//       else
//       {
           $sql = "SELECT post_id
           FROM " . POSTS_TEXT_TABLE . "
           WHERE MATCH (post_subject,post_text) AGAINST ('$match_word')";
//       } 

(line 298)  		else
						{
							$match_word =  addslashes('%' . str_replace('*', '', $split_search[$i]) . '%');
							$search_msg_only = ( $search_fields ) ? "OR post_subject LIKE '$match_word'" : '';
							$sql = "SELECT post_id
								FROM " . POSTS_TEXT_TABLE . "
								WHERE post_text LIKE '$match_word'
								$search_msg_only";
						}
Last edited by White Dust on Sun Sep 07, 2003 8:11 pm, edited 2 times in total.

User avatar
GPHemsley
Registered User
Posts: 1475
Joined: Fri Apr 18, 2003 3:12 am
Location: Long Beach, NY
Name: Gordon Hemsley
Contact:

Post by GPHemsley »

So... where'd you get that MOD template from? Mars? :?

Simon Moon
Registered User
Posts: 26
Joined: Mon Jul 29, 2002 10:30 am
Contact:

Post by Simon Moon »

Thanks, this is a BIG boost, i would sguesst using this in the official version, so you can switch between mysql4 and "others" ...

Please keep working on this, we could sue it VERY good....

User avatar
Lucas1510
Registered User
Posts: 47
Joined: Thu Sep 04, 2003 12:07 am
Location: Herne - Germany
Contact:

Post by Lucas1510 »

Hi,

I startet my Board in July this year. It is not very "big", but it is growing. Do you think, I should make this modification now, before I have too much posts?

So the Indexing would then cost not much time, because I started "in time" with this mod.
Gamejag/Code wrote: Installation Time: 30-60 Minutes, plus re-indexing time


What do you think?

Greetings,

Lucas

Simon Moon
Registered User
Posts: 26
Joined: Mon Jul 29, 2002 10:30 am
Contact:

Post by Simon Moon »

I would go for it asap. Just keep in mind as soon as there is an update for any of the modified files, you have to make sure your search and everything works again. If you can live with that, this is really a GREAT mod to speed things up.

I ran a small test on a cutdown backup of our db, its a LOT faster, but not really completed yet as the author mentioned. As soon as the searches and indices are optimized, i will run it on our "big" board :)

Gamejag
Registered User
Posts: 16
Joined: Mon Mar 31, 2003 6:46 pm

Post by Gamejag »

White Dust wrote: But I have a little problem, Im getting an error when I hit the search:

Code: Select all

Parse error: parse error, unexpected T_ELSE in /www/htdocs/fantreal/forum/search.php on line 298
Welcome to Alpha-land! ;) Get rid of the commented out lines, and make the finished product look like this:

Code: Select all

            if ( !strstr($multibyte_charset, $lang['ENCODING']) )
            {
              $match_word = str_replace('*', '%', $split_search[$i]);
              $sql = "SELECT post_id 
                FROM " . POSTS_TEXT_TABLE . " 
                WHERE MATCH (post_subject,post_text) AGAINST ('$match_word')"; 
            }
						else //(Your line 298)
GPHemsley wrote: So... where'd you get that MOD template from? Mars? :?
I think I copied it from either the Log Actions Mod or the custom title Mod. If you were just trying to be funny, please disregard that answer. :)

After a week, there have been no complaints or reports of buts, and nothing but happy thoughts regarding the increase in posting speed. The loss of subject exclusion from searching hasn't bothered anyone a bit, but then again my forums tend to have very long posts so those types of searches probably aren't too common for me. Also, the loss was due to a mistake I made when I indexed my table, and doesn't have to be for anyone who installs this on their own forums.

Note that this is a USELESS Mod for small forums. The emulation that phpBB provides for full-text searching is pretty good, and it's only the database speed for very large tables that causes problems for larger boards. Also, since this Mod will not work on all supported DB platforms there is no way it will make it into an official release. Most of the common and popular databases support full-text indexing, but there are still several that don't and until every one does, this Mod will remain on the fringe, supporting the niche of very large installations.

Since there is some interest in it, does anyone want to play around with the native scoring functionality you can get in mySQL from full-text searches? As I mentioned in the initial post, it does open up some more search possibilities but I am either just too lazy to code them or have priorities in other areas. :)

Simon Moon
Registered User
Posts: 26
Joined: Mon Jul 29, 2002 10:30 am
Contact:

Post by Simon Moon »

Well, if you do it right you could do it conditionally, as preparation for other fulltext indices capable DBs :) But thats dreaming, i know, its hard to get somehting into a project this open as phpbb.

What i am more concernd about is the speed of my board and currently, its HORRIBLE. I could just as well shut it down, wouldnt make much of a difference. If this mod works out well, this could be another step to make it faster (besides the web cluster it will run on soon).

White Dust
Registered User
Posts: 396
Joined: Thu Mar 07, 2002 10:51 pm
Location: Sigel: The City of Doors

Post by White Dust »

Gamejag wrote: [...]


Ah, it worked :)
thanks 8)

User avatar
GPHemsley
Registered User
Posts: 1475
Joined: Fri Apr 18, 2003 3:12 am
Location: Long Beach, NY
Name: Gordon Hemsley
Contact:

Post by GPHemsley »

Gamejag wrote:
GPHemsley wrote:So... where'd you get that MOD template from? Mars? :?
I think I copied it from either the Log Actions Mod or the custom title Mod. If you were just trying to be funny, please disregard that answer. :)

Well, I was trying to be funny, yet informative: That MOD template is so absolutely wrong, it's not... well.. funny. :|

I suggest checking these out and re-doing the MOD template:
http://www.phpbb.com/kb/article.php?article_id=38
http://www.phpbb.com/kb/article.php?article_id=39

And just FYI, I haven't even touched this MOD, just trying to help. ;)

JantjeP
Registered User
Posts: 7
Joined: Sun Jan 12, 2003 11:35 pm
Contact:

Post by JantjeP »

Thanks,

My MySQL database is 30 percent smaller now.
I had only 526000 words in the wordmatch table.
Searchspeed is OK.

Jan

kolczyk
Registered User
Posts: 73
Joined: Sat Dec 14, 2002 8:57 pm

Post by kolczyk »

what about use * instead of letters, how much is speed of search with * ? try search for example m*** or b*** etc. I had really big problems when user used * in search box, page was loading very slowly and I have many mysqld process opened.
http://www.phpbb.com/phpBB/viewtopic.php?t=135142
database went down on PIV 3GHz just because one user search something
I have disabled * and now it's working

so maybe when I install this mod everything will be fine? and users will be had permission to use * in box?

please let me know

JantjeP
Registered User
Posts: 7
Joined: Sun Jan 12, 2003 11:35 pm
Contact:

Post by JantjeP »

kolczyk,

you can test the mod here http://www.officeforum.nl/forum/search.php

Its a Dutch forum for MS Office questions, but i think you know where you can find the search function.

Jan.

kolczyk
Registered User
Posts: 73
Joined: Sat Dec 14, 2002 8:57 pm

Post by kolczyk »

thanks, now I know that * is not working :(

tried search adres with 164 results and next adre* with zero results :(
and many other words, however * is not working :(

bbgunner
Registered User
Posts: 18
Joined: Sat Jun 01, 2002 5:25 pm

Post by bbgunner »

I made this mod about a month ago, here is my code for the search part:

Code: Select all

else if ( $search_keywords != '' )
		{
			$search_msg_only = (!$search_fields) ? '' : ',t.post_subject';
			
			$word_count = 0;
						
			//$sql = "SELECT m.post_id
			//	FROM " . SEARCH_WORD_TABLE . " w, " . SEARCH_MATCH_TABLE . " m
			//	WHERE w.word_text LIKE '$match_word'
			//		AND m.word_id = w.word_id
			//		AND w.word_common <> 1
			//		$search_msg_only";

//Yes, I could do this in one query statement but it would look ugly
			if ($search_forum == -1) {
                $sql = "SELECT t.post_id FROM " . POSTS_TEXT_TABLE . " t
				WHERE MATCH (t.post_text $search_msg_only) AGAINST ('$search_keywords' IN BOOLEAN MODE) LIMIT 1000";		
			} else {
			    $sql = "SELECT t.post_id FROM " . POSTS_TEXT_TABLE . " t, " . POSTS_TABLE ." p
				WHERE p.forum_id = $search_forum AND p.post_id = t.post_id AND MATCH (t.post_text $search_msg_only) AGAINST ('$search_keywords' IN BOOLEAN MODE) LIMIT 1000";
			}
					
			if ( !($result = $db->sql_query($sql)) )
			{
				message_die(GENERAL_ERROR, 'Could not obtain matched posts list', '', __LINE__, __FILE__, $sql);
			}

$search_ids = array();
			//Dump the postids in the search array;
			while ( $search_rows = mysql_fetch_array($result) )
			{
				$search_ids[] = $search_rows['post_id'];
		
			}
			//
			$total_match_count = mysql_num_rows($result);
			//Make the array for highlighting and also remove the binary operators
			$search_keywords = preg_replace('([\\\+\-><()~*:"&|\'])','',$search_keywords);
			$split_search = split(' ',$search_keywords);
			//Let the search go free
			$db->sql_freeresult($result);
		}

I have code that made sure the search only searched through the forum the user selected. There is a query limit in there from keeping the query from consuming too many resources. Also I use boolean search mode, which uses things like *,? and such. There some other fixes in there to keep phpbb happy.

If you make the indexs in 4.0.x they go MUCH faster.

Locked

Return to “[2.0.x] MODs in Development”