[DEV] mysql fulltext search

A place for MOD Authors to post and receive feedback on MODs still in development. No MODs within this forum should be used within a live environment! No new topics are allowed in this forum.
Forum rules
READ: phpBB.com Board-Wide Rules and Regulations

IMPORTANT: MOD Development Forum rules

On February 1, 2009 this forum will be set to read only as part of retiring of phpBB2.
User avatar
drathbun
Former Team Member
Posts: 12204
Joined: Thu Jun 06, 2002 3:51 pm
Location: TOPICS_TABLE
Contact:

Re: [DEV] mysql fulltext search

Post by drathbun » Sun Apr 29, 2007 8:33 pm

My mistake, in reading your post again you did, in fact, say just that. I skipped the important details because I thought I knew what you were saying. My apologies.

I have not tried the code posted yet; have not had time. So I can't confirm how it works.

If you have tried this and gone back to the "original" file, and you have a good-sized stopwords file, you would probably be interested in some work I've done in that area. I put the stopwords into the database (instead of a text file) and changed how they are processed, and it made a big difference in post processing. It didn't do anything to the database size, of course, since it didn't change the content. But if you use a "Rebuild Search index" MOD of any kind it should help.

I have the code in a prototype board. It's nowhere near ready for release. And I should not take this topic in another direction, so will shut up here. :-) If you want to read more about it you can click the "Search" category on my blog, see signature for link.
I blog about phpBB: phpBBDoctor blog
Still using phpbb2? So am I! Click below for details
Image

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

Re: [DEV] mysql fulltext search

Post by arod-1 » Mon Apr 30, 2007 4:11 am

IPB_Refugee wrote: Hi,
regarding to the similar Fulltext Search MOD I mentioned above: Adding fulltext to the table phpbb_posts_text increased the size of this table more than 6mb. My two search tables had in summary only 4.7mb. So I had no saving of web space but the exact opposite.

Regards
Wolfgang
actually, adding the fulltext index does not increase the size of the table at all.
it does increase the size of the index.
my question:
when you look at the size of the 2 tables truncated, did you count their indexes also? the index of those 2 tables is generally slightly larger than the tables themselves. (in phpbb3 the indexes actually consume significantly more than the tables themselves, because olympus has some extra indexes on these 2 tables).

i can't check how much space is consumed on your data. on my data, the increment in size of the index is significantly less than the total size consumed by those 2 tables and their indexes.

one last thing:
when you remove this (or the previous) mod, you want to make sure to drop the fulltext index, otherwise you continue to pay both in space and in execution time for both of those search mechanisms, even though only one of them is actually used in searches.
IPB_Refugee wrote: BTW: If you use a SEO-MOD which creates dynamical meta tags for your topics you need the search tables - you should consider this before truncating them.
truncating those tables, and removing the need to keep them updated is the only reason to use this MOD. if you need these tables, then for sure this MOD is not for you.
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

User avatar
IPB_Refugee
Registered User
Posts: 1290
Joined: Fri Jul 07, 2006 2:25 pm
Location: Austria
Name: Wolfgang Weber

Re: [DEV] mysql fulltext search

Post by IPB_Refugee » Mon Apr 30, 2007 6:44 am

Hi arod-1!
arod-1 wrote: my question:
when you look at the size of the 2 tables truncated, did you count their indexes also? the index of those 2 tables is generally slightly larger than the tables themselves.


I just looked at the regular table overview with phpMyAdmin which you get when you click on the name of the database.
arod-1 wrote: one last thing:
when you remove this (or the previous) mod, you want to make sure to drop the fulltext index, otherwise you continue to pay both in space and in execution time for both of those search mechanisms, even though only one of them is actually used in searches.


Thanks for adding this info. I´ve already done it. It had taken me about half an hour to find a way to remove the fulltext because I´m not so good in MySQL. I can´t remember how I´ve done it but I´m sure that I´ve done it...

Regards
Wolfgang

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

Re: [DEV] mysql fulltext search

Post by arod-1 » Mon Apr 30, 2007 11:26 am

Small change to the MOD file:
it now passes validation by the MOD validator (the validator was offline since i oroginally posted the MOD until yesterday)
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

Re: [DEV] mysql fulltext search

Post by arod-1 » Mon Apr 30, 2007 1:09 pm

IPB_Refugee wrote: I just looked at the regular table overview with phpMyAdmin which you get when you click on the name of the database.

as i said, only you can look in your data, but here are the results from my test board.
note that this is a phpbb-3 board. the "phpbb-native" tables in olympus are somewhat larger than they are in phpbb-2, so, as they say, "your mileage may vary", but it's not that far off. all sizes in KB:

Code: Select all

#posts: 3266 
                    table- size, index size  total          
posts_table:           4,782     4,253       9,035 
search_wordlist:       1,729     1,537       3,266
search_wordmatch       1,831     8,942       8,963
note that you can't drop posts_table data, so you really want to compare posts_table index size with the total of the other two.
it's worth noting that the total db size used by the board, (with wordlist/wordmatch and without the fulltext index) is 17.5 MB, ~12 of which are those 2 tables. clearly more than 50%.
when those tables are replaced with the fulltext index, total size shrinks to 9.5 MB.

as i noted, this is the olympus stats.
i couldn't give you the phpbb-2 stats, because phpbb-2 does not have a built-in "rebuild search index" functionality, and i am not going to install a MOD just for this test.
with phpbb-2 the results are expected to be slightly less dramatic, because the search-wordmatch table has one index less (otoh, olympus begins to drop words from the search tables once they appear in 5% of the posts, which creates a skew in the other direction...).

my mysql settings is to index words of any length (default is length of 4 or greater), and not use the built-in stop-word list.
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

User avatar
qspypl
Registered User
Posts: 149
Joined: Wed Dec 20, 2006 7:15 pm

Re: [DEV] mysql fulltext search

Post by qspypl » Mon Apr 30, 2007 6:58 pm

I will implement this tonight,

319501 posts
79006 users

search is killing my server... queries took up to 10 seconds... so i will posts results of testing tommorow

I will implement this tonight,

319501 posts
79006 users

search is killing my server... queries took up to 10 seconds... so i will posts results of testing tommorow

btw Well, actually with my old search is smth bad :P
# Time: 070430 18:11:22
# User@Host: root[root] @ localhost []
# Query_time: 12 Lock_time: 1 Rows_sent: 25 Rows_examined: 344616
SELECT t.*, f.forum_id, f.forum_name, u.user_rank, u.user_posts, u2.user_rank AS rank2, u2.user_posts AS posts2, u.username, u.user_id, u2.username as user2, u2.user_id as id2, p.post_username, p2.post_username AS post_username2, p2.post_time ^M
FROM phpbb_topics t, phpbb_forums f, phpbb_users u, phpbb_posts p, phpbb_posts p2, phpbb_users u2
WHERE t.topic_id IN (11, 3, 6, 13, 14, 15, 21, 22, 23, 25, 27, 30, 36, 66, 71, 74, 75, 77, 94, 96, 101, 103, 105, 108, 118, 120, 127, 142, 160, 168, 169, 171, 173, 175, 189, 197, 196, 198, 199, 200, 217, 220, 223, 233, 248, 253, 254, 256, 257, 261, 268, 273, 275, 277, 278, 279, 280, 281, 284, 286, 288, 292, 296, 297, 298, 299, 312, 313, 318, 336, 338, 341, 344, 348, 355, 364, 380, 381, 389, 390, 404, 405, 407, 411, 415, 418, 419, 422, 423, 425, 426, 1225, 477, 478, 481, 483, 484, 486, 489, 726, 496, 523, 525, 528, 531, 533, 534, 535, 536, 537, 538, 539, 540, 542, 544, 545, 547, 548, 550, 551, 552, 553, 556, 557, 558, 560, 561, 564, 570, 579, 581, 586, 597, 598, 599, 601, 602, 603, 604, 605, 606, 609, 610, 611, 612, 616, 617, 619, 620, 621, 627, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 645, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 659, 660, 661, 663, 664, 676, 677, 680, 682, 685, 689, 696, 699, 700, 702, 704, 705, 706, 707, 710, 718, 723, 1242, 727, 728, 730, 732, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 752, 753, 754, 756, 757, 758, 759, 760, 761, 762, 763, 765, 1504, 771, 772, 780, 782, 785, 786, 800, 802, 814, 827, 815, 826, 831, 837, 838, 839, 840, 841, 842, 843, 847, 850, 853, 857, 858, 859, 860, 861, 862, 865, 866, 867, 868, 869, 870, 871, 872, 874, 875, 876, 877, 878, 880, 881, 882, 883, 884, 885, 886, 887, 889, 890, 894, 899, 905, 907, 908, 913, 921, 922, 923, 925, 926, 928, 930, 931, 933, 936, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 949, 950, 951, 954, 955, 2068, 962, 963, 964, 967, 969, 970, 973, 975, 976, 977, 986, 994, 997, 998, 999, 1000, 1003, 1005, 1006, 1008, 1010, 1013, 1014, 1016, 1017, 1023, 1025, 1027, 1029, 1031, 1038, 1045, 1046, 1048, 1049, 1057, 1058, 1060, 1061, 1062, 1068, 1074, 1080, 1084, 1091, 1146, 1092, 1093, 1094, 1095, 1096, 1117, 1124, 1132, 1140, 1143, 1147, 1152, 1154, 1155, 1156, 1157, 1160, 1164, 1199, 1168, 1170, 1171, 1173, 1176, 1177, 1178, 1179, 1181, 1182, 1183, 1185, 1188, 1189, 1191, 1193, 1202, 1205, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1219, 1223, 1224, 1226, 1228, 1230, 1232, 1233, 1234, 1235, 1236, 1237, 1239, 1240, 1241, 1243, 1244, 1246, 1247, 1249, 1250, 1251, 1252, 1253, 1254, 1258, 1260, 1262, 1263, 1267, 1274, 1276, 1277, 1336, 1280, 1282, 1284, 1285, 1307, 1308, 1317, 1318, 1319, 1320, 1321, 1322, 1324, 1325, 1328, 1338, 1343, 1348, 1349, 1351, 1352, 1356, 1358, 1359, 1360, 1361, 1364, 1365, 1366, 1370, 1373, 1375, 1376, 1377, 1378, 1383, 1384, 1386, 1387, 1388, 1390, 1392, 1393, 1396, 1398, 1399, 1401, 1403, 1412, 1423, 1424, 1425, 1427, 1429, 1435, 1443, 1446, 1447, 1448, 1449, 1455, 1458, 1461, 1466, 1468, 1470, 1471, 1472, 1473, 1474, 1475, 1476, 1478, 1479, 1480, 1481, 1483, 1486, 1488, 1489, 1490, 1491, 1493, 1495, 1497, 1500, 1502, 1503, 1507, 1509, 1511, 1512, 1514, 1515, 1516, 1517, 1518, 1519, 1520, 1521, 1522, 1523, 1525, 1526, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1538, 1539, 1540, 1542, 1543, 1544, 1545, 1546, 1547, 1548, 1549, 1550, 1551, 1553, 1555, 1556, 1558, 1560, 1561, 1562, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1580, 1581, 1582, 1583, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591, 1592, 1594, 1596, 1599, 1600, 1601, 1602, 1603, 1604, 1606, 1609, 1610, 1611, 1612, 1613, 1614, 1615, 1616, 1621, 1623, 1625, 1627, 1629, 1639, 1644, 1647, 1648, 1649, 1650, 1652, 1655, 1657, 1658, 1659, 1660, 1661, 1662, 1663, 1664, 1665, 1668, 1672, 1675, 1677, 1678, 1679, 1680, 1681, 1682, 1683, 1685, 1686, 1687, 1688, 1689, 1690, 1691, 1694, 1692, 1695, 1698, 1699, 1700, 1708, 1709, 1710, 1712, 1713, 1714, 1725, 1717, 1723, 1726, 1727, 1729, 1730, 1731, 1732, 1733, 1734, 1735, 1736, 1737, 1738, 1739, 1740, 1741, 1742, 1743, 1744, 1745, 1748, 1751, 1754, 1755, 1756, 1757, 1759, 1760, 1761, 1762, 1766, 1768, 1769, 1771, 1774, 1776, 1779, 1780, 1781, 1782, 1786, 1791, 1794, 1795, 1797, 1799, 1801, 1805, 1807, 1808, 1809, 1810, 1811, 1812, 1813, 1814, 1815, 1819, 1821, 1828, 1829, 1832, 1833, 1834, 1835, 1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1847, 1848, 1850, 1855, 1852, 1854, 1856, 1857, 1858, 1859, 1861, 1862, 1863, 1864, 1866, 1867, 1868, 1870, 1873, 1874, 1875, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1887, 1888, 1891, 1893, 1894, 1899, 1901, 1905, 1910, 1923, 1926, 1933, 1934, 1936, 1938, 1940, 1941, 1945, 1946, 1947, 1948, 1949, 1950, 1967, 1953, 1954, 1956, 1957, 1958, 1959, 1966, 1963, 1965, 1968, 1969, 1970, 1972, 1973, 1974, 1976, 1977, 1979, 1982, 1983, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2051, 2004, 2006, 2008, 2009, 2041, 2013, 2018, 2019, 2021, 2022, 2023, 2025, 2026, 2032, 2033, 2034, 2035, 2037, 2038, 2042, 2045, 2046, 2049, 2050, 2052, 2055, 2056, 2057, 2063, 2066, 2067, 2071, 2072, 2073, 2074, 2075, 2076, 2077, 2078, 2079, 2080, 2081, 2082, 2085, 2086, 2087, 2089, 2091, 2092, 2094, 2096, 2097, 2098, 2099, 2100, 2102, 2106, 2105, 2107, 2109, 2110, 2111, 2113, 2117, 2118, 2120, 2126, 2128, 2129, 2130, 2131, 2132
AND t.topic_poster = u.user_id^M
AND f.forum_id = t.forum_id ^M
AND p.post_id = t.topic_first_post_id^M
AND p2.post_id = t.topic_last_post_id^M
AND u2.user_id = p2.poster_id ORDER BY p2.post_time DESC LIMIT 0, 25;



This is normal ?

or this
# Time: 070430 20:39:42
# User@Host: root[root] @ localhost []
# Query_time: 29 Lock_time: 0 Rows_sent: 24513 Rows_examined: 26624
use phpbb1;
SELECT m.post_id ^M
FROM phpbb_search_wordlist w, phpbb_search_wordmatch m ^M
WHERE w.word_text LIKE'he%' ^M
AND m.word_id = w.word_id ^M
AND w.word_common <> 1;



PS
Can u write Rebuild Search mod for this mod? I mean rebuild only this tables which is need'ed it will be faster, and maybe better.

User avatar
drathbun
Former Team Member
Posts: 12204
Joined: Thu Jun 06, 2002 3:51 pm
Location: TOPICS_TABLE
Contact:

Re: [DEV] mysql fulltext search

Post by drathbun » Mon Apr 30, 2007 9:10 pm

There is no "rebuild search" MOD required for this, everything is managed via the database indexing process.
I blog about phpBB: phpBBDoctor blog
Still using phpbb2? So am I! Click below for details
Image

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

Re: [DEV] mysql fulltext search

Post by arod-1 » Mon Apr 30, 2007 10:29 pm

well, it is always possible that something went bad. in this case, i believe it is best to try backup first, and then REPAIR TABLE. this should, supposedly, rebuild all indices.
i wouldn't recommend writing a MOD to do this.
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

User avatar
EverettB
Registered User
Posts: 326
Joined: Fri Aug 01, 2003 7:11 pm
Location: North America

Re: [DEV] mysql fulltext search

Post by EverettB » Mon Apr 30, 2007 10:44 pm

qspypl wrote: This is normal ?
# Time: 070430 20:39:42
# User@Host: root[root] @ localhost []
# Query_time: 29 Lock_time: 0 Rows_sent: 24513 Rows_examined: 26624
use phpbb1;
SELECT m.post_id ^M
FROM phpbb_search_wordlist w, phpbb_search_wordmatch m ^M
WHERE w.word_text LIKE'he%' ^M
AND m.word_id = w.word_id ^M
AND w.word_common <> 1;

The above query is normal once your search tables get to a certain size. This query is also the #1 slow query on my site.

I did remove "AND w.word_common <> 1" from the query though as I found it a useless addition.

User avatar
qspypl
Registered User
Posts: 149
Joined: Wed Dec 20, 2006 7:15 pm

Re: [DEV] mysql fulltext search

Post by qspypl » Tue May 01, 2007 9:26 pm

I think AND w.word_common <> 1; will not do much, coz its smth like

yes = 1;

if ( yes == 0 ) {
do something;
}

It just skips it... which is almost no time at all.

Ok here are results of this mod installed... server works slowly like always, this must be something other than search then.

UPDATE 21.05.2007
I advice to install this mod, coz you will never have problem with not actual phpbb_search tables.
Second thing is database two times smaller, which is too very good.

I must configure this mod to begin search from 3 characters not 4... and it will be perfect.

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

HOWTO: Change minimum searchable word length

Post by arod-1 » Tue May 22, 2007 11:06 pm

I was asked how to change the minimum searchable word length, which defaults to 4 characters.
This is not a phpbb limitation, it's a MySQL configurable parameter. phpbb queries mysql and reports the limitation, but it can't be changed from within phpbb (at least not to my knowledge).
In addition, I mentioned that MySQL uses a bizarre set of stop-words, and i would recommend to create your own, or remove the use of stop-words entirely.


You do this by editing MySQL configuration file, and restarting mysql server.
The default configuration file for MySQL in linux is /etc/my.cnf
in windows it's usually either C:\my.cnf or C:\Windows\my.ini (at least according to the docs).
in the configuration file, look for the header
[mysqld]
under it, add the following line:
ft_min_word_len=2
(or 3, or any other value you choose).

if you want to disable the use of stop-words (these are the "unsearchable" words), add this line:
ft_stopword_file=''
alternatively, you can point to a file which contains your own list of stop-words

after saving the configuration file you will need to restart mysql server.

Note: FULLTEXT indexes must be rebuilt after changing these variables. Use REPAIR TABLE tbl_name QUICK


FOR FURTHER READING:
http://dev.mysql.com/doc/refman/4.1/en/ ... ables.html (mysql 4 and 4.1)
http://dev.mysql.com/doc/refman/5.0/en/ ... ables.html (mysql 5)
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

User avatar
qspypl
Registered User
Posts: 149
Joined: Wed Dec 20, 2006 7:15 pm

Re: [DEV] mysql fulltext search

Post by qspypl » Sun May 27, 2007 6:21 pm

thanks much! :), next question is this... will u make this mod for phpbb3 ? ; ) i think it will not be much diffirent than for phpbb2.

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

Re: [DEV] mysql fulltext search

Post by arod-1 » Mon May 28, 2007 6:10 pm

qspypl wrote:thanks much! :), next question is this... will u make this mod for phpbb3 ? ; ) i think it will not be much diffirent than for phpbb2.
well, phpbb3 already supports mysql fulltext, so there seem to no sense in creating this mod for phpbb3.
however, the way phpbb3 supports mysql fulltext search is seriously flawed.
specifically, phpbb3 uses 3 (!) indexes: post subject, post text, and post subject + post text.
this means that it takes phpbb3 at least twice as long to create or modify a new post than it should take.
also, there is no search option for "post subject", only for "topic title", but phpbb3 indexes post subject rather than topic title.

in short: if i was to create a mod for phpbb3, it would be a slight change to the way phpbb3 already supports mysql fulltext, rather than creating it (like this mod) from scratch.
also, i rather hope that the developers will get around to fix the way it is done now, rather than MODing something which should really be considered bug fix.

see: http://www.phpbb.com/bugs/phpbb3/ticket ... et_id=9851 , http://www.phpbb.com/bugs/phpbb3/ticket ... et_id=9818

it is unfortunate that naderman chose to ignore those issues, but i still hope they will be fixed once they realize it's a real problem. (see: http://www.phpbb.com/bugs/phpbb3/ticket ... t_id=11737 )
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

User avatar
qspypl
Registered User
Posts: 149
Joined: Wed Dec 20, 2006 7:15 pm

Re: [DEV] mysql fulltext search

Post by qspypl » Tue May 29, 2007 3:13 pm

U have right too bad he closed it and didn't investigate, but very big advantage of your mod is 0 size of search tables... and 0 time in modyfing search tables while edit/post/delete... it is two big advantages. Phpbb3 may use 1/2/3/4/5 indexes but it still use search tables and slowing posting and gives need for rebuild search tables every time when moving forum to another server or just make the backups two times smaller.

Maybe i miss something... but what is better in this search tables and 3 indexes :P... pointless i think

arod-1
Registered User
Posts: 1327
Joined: Mon Sep 20, 2004 1:33 pm

Re: [DEV] mysql fulltext search

Post by arod-1 » Tue May 29, 2007 6:37 pm

qspypl wrote:Phpbb3 may use 1/2/3/4/5 indexes but it still use search tables and slowing posting and gives need for rebuild search tables every time when moving forum to another server or just make the backups two times smaller.

Maybe i miss something...
yes you do (i marked in red the things i think you missed).
if you set phpbb3 to use "fulltext mysql" as its search method, it does not maintain/use the notorious search tables.
if you do that (ie., switch search method) after the board have already began its life, you will also want to empty those tables (not necessary), but the tables will not be updated or used any more.

the thing i criticized in my earlier post was the fact that when selecting "fulltext mysql" search mode in phpbb3, it creates more FULLTEXT indexes than it really should, which does slow down posting and editing, but it does not cost anything in terms of backup file size.
standard disclaimer:
backup your db and files before you do anything.
absolutely no guarantee.
if you do what i advise and it blows in your face, all you'll hear from me is: "ah... sorry, i guess"

Post Reply

Return to “[2.0.x] MODs in Development”

Who is online

Users browsing this forum: No registered users and 11 guests