Feb 25, 2006 updates
Changed to ALPHA status, assigned version 0.2.0
Posted screenshots and detailed development notes in a post later in this topic.
No code is available for download for this version.
*** END 0.2.0 Notes ***
*** Original post text below ***
One of the more common complaints from boards of any size is the size of the phpbb_search_wordmatch table. Short summary: this table records the association between words in phpbb_search_wordlist and the posts in which those words appear. It is crucial to the standard phpBB search algorithm; if you empty it, you can no longer search the posts on your board.
Once a word appears in enough posts, it becomes less useful (and more stressful) as a search term. One answer to this is to update your stopwords file, but there are two issues. The first is that the stopwords list is a text file (rather than a database table), so the only way to insert / update / delete entries is to open the file directly. The second is that adding a word to the stopwords list does not remove existing records from the phpbb_search_wordmatch table.
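To make the second issue concrete, here is a rough sketch of the cleanup a tool would have to do after a word is added to the stopwords list. It is only an illustration: it uses Python with an in-memory SQLite database as a stand-in for MySQL, the two tables are cut down to the columns that matter here, and purge_stopwords is a made-up name, not anything in phpBB.

```python
import sqlite3

def purge_stopwords(conn, words):
    """Delete wordmatch and wordlist rows for the given words.

    Hypothetical helper; returns the number of wordmatch rows removed.
    """
    words = list(words)
    if not words:
        return 0
    marks = ", ".join("?" for _ in words)
    cur = conn.cursor()
    # Remove the word/post associations first, while the word ids still exist.
    cur.execute(
        "DELETE FROM phpbb_search_wordmatch WHERE word_id IN "
        "(SELECT word_id FROM phpbb_search_wordlist "
        " WHERE word_text IN (%s))" % marks,
        words,
    )
    removed = cur.rowcount
    # Then remove the words themselves.
    cur.execute(
        "DELETE FROM phpbb_search_wordlist WHERE word_text IN (%s)" % marks,
        words,
    )
    conn.commit()
    return removed

# Tiny demo database (simplified schema, assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phpbb_search_wordlist (word_id INTEGER PRIMARY KEY, word_text TEXT);
    CREATE TABLE phpbb_search_wordmatch (post_id INTEGER, word_id INTEGER);
    INSERT INTO phpbb_search_wordlist VALUES (68, 'i'), (92, 'code');
    INSERT INTO phpbb_search_wordmatch VALUES (1, 68), (2, 68), (2, 92);
""")
removed = purge_stopwords(conn, ["i"])
print(removed)
```

On a real board you would want a backup before running anything like this, since the deleted rows can only be recovered by rebuilding the search index.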
I am tossing this idea out for development. It could be a nice tool for board admins. I would envision this being an entry for the ACP with several options.
- Edit stopwords file
  - Provide a way to edit the stopwords table
  - Track changes so that records for newly added stopwords are removed from the phpbb_search_wordmatch and phpbb_search_wordlist tables
- Stopword Suggestions
  - Scan current phpbb_search_wordmatch table, counting instances of each word
  - List most common words as candidates for new stopwords
  - Provide easy interface for a "one-click" stopword add
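The two options above could be sketched roughly as follows. This is a sketch under assumptions, not a design: it is Python with in-memory SQLite standing in for MySQL, it assumes the stopwords list holds one word per line, the tables are reduced to the relevant columns, and added_stopwords and suggest_stopwords are hypothetical names.

```python
import sqlite3

def added_stopwords(old_lines, new_lines):
    """Words in the edited stopword list that were not in the previous one."""
    old = {w.strip().lower() for w in old_lines if w.strip()}
    new = {w.strip().lower() for w in new_lines if w.strip()}
    return sorted(new - old)

def suggest_stopwords(conn, stopwords, limit=25):
    """Most frequently indexed words that are not already stopwords."""
    rows = conn.execute(
        "SELECT w.word_text, COUNT(*) AS word_instances "
        "FROM phpbb_search_wordlist w "
        "JOIN phpbb_search_wordmatch m ON m.word_id = w.word_id "
        "GROUP BY w.word_id, w.word_text "
        "ORDER BY word_instances DESC, w.word_text"
    ).fetchall()
    skip = {w.strip().lower() for w in stopwords}
    return [(text, n) for text, n in rows if text not in skip][:limit]

# Demo data (simplified schema, assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phpbb_search_wordlist (word_id INTEGER PRIMARY KEY, word_text TEXT);
    CREATE TABLE phpbb_search_wordmatch (post_id INTEGER, word_id INTEGER);
    INSERT INTO phpbb_search_wordlist VALUES (68, 'i'), (92, 'code'), (36, 'work');
    INSERT INTO phpbb_search_wordmatch VALUES
        (1, 92), (2, 92), (3, 92), (1, 68), (2, 68), (3, 36);
""")
before = ["the", "a"]
after = ["the", "a", "i"]
added = added_stopwords(before, after)           # words whose rows need purging
suggestions = suggest_stopwords(conn, after, 2)  # candidates for a one-click add
print(added, suggestions)
```

The words returned by added_stopwords are the ones whose existing rows would need to be removed from the two search tables; the suggestion list is what a "one-click" add interface would display.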
Code:
mysql> select w.word_id, w.word_text, count(*) as word_instances
-> from phpbb_search_wordlist w
-> , phpbb_search_wordmatch m
-> where w.word_id = m.word_id
-> group by w.word_id, w.word_text
-> order by word_instances desc
-> limit 25;
+---------+-----------+----------------+
| word_id | word_text | word_instances |
+---------+-----------+----------------+
|      92 | code      |            417 |
|      36 | work      |            266 |
|      44 | forum     |            259 |
|      68 | i         |            253 |
|      32 | php       |            247 |
|    1787 | think     |            233 |
|      23 | site      |            218 |
|       3 | phpbb     |            216 |
|     372 | user      |            213 |
|     127 | make      |            211 |
|      25 | done      |            199 |
|    1342 | change    |            188 |
|    1439 | ill       |            184 |
|    1343 | changes   |            184 |
|    3837 | phase     |            181 |
|    1531 | post      |            178 |
|      20 | board     |            177 |
|    1855 | server    |            177 |
|      96 | database  |            177 |
|    1576 | set       |            172 |
|      22 | test      |            171 |
|      63 | back      |            166 |
|    1405 | file      |            166 |
|      31 | ive       |            159 |
|    2394 | let       |            158 |
+---------+-----------+----------------+
25 rows in set (0.50 sec)
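As a quick worked example of what a stopword add would free up, the snippet below totals the counts from the output above for three words. The counts are copied from that output; the choice of "i", "ill" and "ive" as candidates is my own pick for illustration, not a recommendation.

```python
# (word_text, word_instances) pairs copied from the query output above.
top_words = {
    "code": 417, "work": 266, "forum": 259, "i": 253, "php": 247,
    "think": 233, "site": 218, "phpbb": 216, "user": 213, "make": 211,
    "done": 199, "change": 188, "ill": 184, "changes": 184, "phase": 181,
    "post": 178, "board": 177, "server": 177, "database": 177, "set": 172,
    "test": 171, "back": 166, "file": 166, "ive": 159, "let": 158,
}

# Hypothetical picks: words that carry little meaning as search terms.
candidates = ["i", "ill", "ive"]

freed = sum(top_words[w] for w in candidates)  # wordmatch rows these words occupy
top25_total = sum(top_words.values())          # rows held by all 25 listed words
print(freed, top25_total)
```

Most of the other top words ("code", "forum", "php") are plausible search terms on this board, which is why a suggestion list with a manual confirm step seems safer than stopping words automatically by count.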
Of course this MOD could be written to include the ability to edit the search_synonyms.txt file as well, but that has less impact on the search performance.
Thoughts? Suggestions? Good idea? Bad idea?