1.) Umm.... actually, it does. Via the UNION clause. I just thought I'd throw it in there.arod-1 wrote:just couple of small comments:
-- you mentioned "no fulltext search" as a limitation of mysql 5.1 split table feature. i sure hope your implementation does not try to still have the mysql fulltext feature enabled...
-- if one goes as far as to dissect the code to support partitioning, one might want to consider partitioning to several databases rather than just several tables in the same database - much more simple to do load-sharing among several server without the hassle of clusters.
-- you might want to give a second look to the outline i posted several posts ago about dividing the content between "live" and "archive" rather than symmetrical splitting of the posts table.
good luck with the project!
my point was/is that for a board large enough to warrant splitting the post table, mysql fulltext search is inadequate.Dog Cow wrote:1.) Umm.... actually, it does. Via the UNION clause. I just thought I'd throw it in there.
i agree 100%.Dog Cow wrote:Yep, I've heard those same things. Fortunately, I'm using the very latest version of Mysql 5.0.67 with a modern Mac (1GB Ram,Intel Core 2 Duo 1,83GHz) , so I can test a multi-gigabyte database fulltext index with the latest gear. Now of course, this testing is far from comprehensive, but I can still have a look.
Really, though, I think that a fulltext index, if you've researched it and believe it can work for your needs, should be at least considered or tried. Things have improved over the years. Otherwise, there are many other possibilities out there, such as having Google handle the search (which is what the 1.4 billion post forum does), or using the Sphinx system.
What we should notice in below query explain. We scanned every row from 'post_id IN' part. There are exactly 442 post_id's in brackets above, same value scanned which is# Query_time: 5 Lock_time: 0 Rows_sent: 16 Rows_examined: 442
SELECT pt.post_text, pt.bbcode_uid, pt.post_subject, p.*, f.forum_id, f.forum_name, t.*, u.username, u.user_id, u.user_rank, u.user_posts, u.user_sig, u.user_sig_bbcode_uid
FROM phpbb_forums f, phpbb_topics t, phpbb_users u, phpbb_posts p, phpbb_posts_text pt
WHERE p.post_id IN (41, 154, 374, 417, 442, 564, 696, 735, 774, 777, 820, 892, 899, 960, 1030, 1033, 1042, 1111, 1122, 1294, 1298, 1300, 1406, 2121, 2223, 2293, 2422, 2538, 3049, 3100, 3724, 3742, 4032, 4122, 4130, 4143, 4152, 4221, 4271, 4286, 4298, 4754, 4774, 4778, 4917, 4942, 5054, 5106, 5423, 5455, 5465, 5843, 5850, 6243, 6617, 6618, 6627, 6679, 6770, 6772, 7188, 7212, 7332, 7455, 7485, 7890, 7993, 7994, 7995, 7996, 7997, 7998, 7999, 8000, 8001, 8002, 8003, 8005, 8006, 8007, 8008, 8009, 8087, 8506, 8508, 8532, 8533, 8535, 8675, 8830, 8939, 8982, 9021, 9057, 9078, 9861, 10632, 10686, 10687, 10802, 11569, 11913, 11914, 12402, 13249, 13251, 14195, 14454, 14800, 14835, 14841, 15670, 17586, 17760, 19061, 19416, 20388, 20403, 20437, 20449, 20464, 20854, 21149, 21472, 21791, 22049, 22085, 22202, 22935, 23055, 23529, 23688, 24460, 24820, 25466, 25495, 25769, 25779, 26087, 26096, 26168, 26346, 26356, 27144, 27146, 27874, 27959, 28023, 28027, 28034, 28289, 28319, 28354, 28364, 28737, 29110, 29562, 29563, 29657, 30195, 30253, 31143, 31839, 32048, 32049, 32127, 32556, 32617, 32799, 32983, 33118, 33414, 33878, 34371, 34547, 34954, 35436, 35536, 35544, 35798, 36021, 36217, 37566, 38531, 38540, 39450, 39571, 39765, 39919, 40000, 40826, 40867, 40874, 41208, 41227, 41303, 41482, 41543, 42687, 43130, 43801, 44447, 44542, 44670, 44906, 46043, 46097, 46190, 46618, 46943, 47734, 48180, 49416, 49967, 50429, 52387, 54089, 54891, 55142, 55564, 56868, 58896, 60154, 61597, 61607, 61700, 63546, 63746, 63874, 64141, 65051, 65482, 65488, 67911, 68298, 68449, 68842, 71930, 72171, 73584, 74949, 75882, 76681, 76864, 77852, 79695, 81502, 82347, 82721, 86148, 86183, 86185, 86186, 86596, 86787, 88858, 89032, 89275, 90306, 91375, 91775, 92730, 94397, 94400, 95025, 95448, 99009, 99650, 102365, 103471, 103481, 105666, 111963, 114398, 117132, 118647, 118908, 123252, 124576, 128575, 128742, 132167, 133758, 135378, 138575, 142691, 143079, 146003, 146686, 153755, 158019, 158716, 168437, 168836, 170090, 170197, 170286, 172804, 175058, 175856, 176726, 179889, 184470, 185471, 186408, 186751, 188225, 191523, 192947, 197733, 198399, 201998, 203406, 207369, 208169, 210672, 211281, 211986, 213622, 214781, 216952, 217583, 220946, 224052, 225761, 229402, 231767, 234644, 236315, 236927, 243594, 246955, 248197, 250894, 255443, 260560, 260628, 276253, 280963, 297169, 301025, 305387, 307178, 307591, 307754, 308621, 311672, 314019, 315486, 323900, 332030, 337153, 341329, 348472, 348580, 361364, 363253, 365113, 365221, 371786, 377231, 394973, 412128, 413312, 417216, 418451, 420022, 429796, 431207, 438367, 448410, 456448, 458673, 458913, 462470, 467066, 474513, 475246, 476572, 487019, 488709, 489419, 490296, 490401, 494080, 495576, 499693, 506102, 512708, 521189, 522767, 528464, 533892, 536524, 545307, 549054, 549825, 549827, 553347, 558939, 566995, 575815, 578688, 584365, 586356, 586522, 589949, 590585, 604949, 609632, 609656, 609676, 611747, 612196, 612346, 612612, 615195, 618487, 622400, 639690, 657980, 670523, 675966, 692273, 706685, 707210, 714968, 722688, 749071, 760788, 763793, 771669, 775132, 775485, 781390, 789016, 790900, 801102, 830722, 858350, 860247, 862051)
AND pt.post_id = p.post_id
AND f.forum_id = p.forum_id
AND p.topic_id = t.topic_id
AND p.poster_id = u.user_id ORDER BY p.post_time DESC LIMIT 0, 16;
Anyone ideas how to optimize this? Fun starts when query looks post_id IN ( 5000 posts_id's ). Then its really slow on high load.id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t range PRIMARY,forum_id,topic_last_post_id,fid_type_lp PRIMARY 3 NULL 442 Using where; Using temporary; Using filesort
1 SIMPLE f eq_ref PRIMARY PRIMARY 2 t.forum_id 1
1 SIMPLE u eq_ref PRIMARY PRIMARY 3 t.topic_poster 1
1 SIMPLE p eq_ref PRIMARY PRIMARY 3 t.topic_first_post_id 1
1 SIMPLE p2 eq_ref PRIMARY,poster_id PRIMARY 3 t.topic_last_post_id 1
1 SIMPLE u2 eq_ref PRIMARY PRIMARY 3 p2.poster_id 1
btw. splitting huge tables to smaller ones (posts/posts_text) is still actual idea ( i mean sooner or later 'must do' ), it would speed up alter table, repair table, backup table, query'ing in some cases etc.if ( $show_results == 'posts' )
{
$sql = "SELECT pt.post_text, pt.bbcode_uid, pt.post_subject, p.*, f.forum_id, f.forum_name, t.*, u.username, u.user_id, u.user_rank, u.user_posts, u.user_sig, u.user_sig_bbcode_uid
FROM " . FORUMS_TABLE . " f, " . TOPICS_TABLE . " t, " . USERS_TABLE . " u, " . POSTS_TABLE . " p, " . POSTS_TEXT_TABLE . " pt
WHERE p.post_id IN ($search_results)
AND pt.post_id = p.post_id
AND f.forum_id = p.forum_id
AND p.topic_id = t.topic_id
AND p.poster_id = u.user_id";
//echo "test";
}
Idea of using NOT IN is kinda stupid to begin with .DELETE FROM phpbb_confirm
WHERE session_id NOT IN ('0004eaa18bc281b1507ab8dcef97a41f', '0006cb66d0ef9da9b56b4b9b8ebc6271', '0007f695b0dff6f8addb8b9d345c0566', '00080aec9ddbd4a5b893e4e374d2315c', '0018bdd7194402d5022f3ae9915904fa', '001d8048dfd67b80aa0fb0bcc0ac98ab', '0021bb62fe2fa670916370cbdb20861f', '0023a8bcfc884415a4705a6ffe180be4', '002f2da25216a732d935d08b6d4c46ae', '0034d4f79a78cc3985829380ace4afc0', '00370ffa197d8d8741c526da6565bc51', '003ce16e661307011e5932f9c5bafd85', '003d5bbdb7d2ab90d4968f0a34171787', '0046016b6991f029d474e76b969d6faa', '00572345f03102dc05bb3bde5600f135', '0058db93946f80f99fb02569b1f7c7fa', '005dc595202f9b38b5bba7e804912673', '006c8249cfa305d2ac86c8f08aca3491', '0072a8f271955f523a60fe6481bf4f95',
... lots of data like this here
'ffe062eb5733680f39178e71a948a718', 'fff35d1527fbfc18d2fcea1e7ffac5ff', 'fff7b7329d68de4539b833809c4222fd');
I just read recently that someone had made a modification to change that to count by time instead, since it relies on currently active sessions and thus gets unweildly on an active forum. Here's the topic: http://www.phpbb.com/community/viewtopi ... 5#p7257265qspypl wrote: update:
Notice in logs another strange query which took every time more than 3 seconds:
Idea of using NOT IN is kinda stupid to begin with .DELETE FROM phpbb_confirm
WHERE session_id NOT IN ('0004eaa18bc281b1507ab8dcef97a41f', '0006cb66d0ef9da9b56b4b9b8ebc6271', '0007f695b0dff6f8addb8b9d345c0566', '00080aec9ddbd4a5b893e4e374d2315c', '0018bdd7194402d5022f3ae9915904fa', '001d8048dfd67b80aa0fb0bcc0ac98ab', '0021bb62fe2fa670916370cbdb20861f', '0023a8bcfc884415a4705a6ffe180be4', '002f2da25216a732d935d08b6d4c46ae', '0034d4f79a78cc3985829380ace4afc0', '00370ffa197d8d8741c526da6565bc51', '003ce16e661307011e5932f9c5bafd85', '003d5bbdb7d2ab90d4968f0a34171787', '0046016b6991f029d474e76b969d6faa', '00572345f03102dc05bb3bde5600f135', '0058db93946f80f99fb02569b1f7c7fa', '005dc595202f9b38b5bba7e804912673', '006c8249cfa305d2ac86c8f08aca3491', '0072a8f271955f523a60fe6481bf4f95',
... lots of data like this here
'ffe062eb5733680f39178e71a948a718', 'fff35d1527fbfc18d2fcea1e7ffac5ff', 'fff7b7329d68de4539b833809c4222fd');
What file is this in?da_badtz_one wrote:There are some great comments getting thrown around here, thanks to Dog Cow especially for writing up that guide.
I'd like to share a tweak I made a while ago which I forgot to share.
The following query is a query you'll find when trying to find the total count of every user in your database to provide statistical information. To avoid having to rewrite the whole user counting process to feature what phpBB3 uses I used this nifty tweak I found.
So this query:
becomes:Code: Select all
$sql = 'SELECT COUNT(user_id) AS total FROM ' . USERS_TABLE . ' WHERE user_id > ' . ANONYMOUS;
The difference is noticeable as shown as the result:Code: Select all
$sql = 'SELECT COUNT(user_id) -1 AS total FROM ' . USERS_TABLE;
I still do not know the reason why it is faster, but if it returns the same result then hey why not?Code: Select all
mysql> SELECT COUNT(user_id) -1 AS total FROM driaw_users ; +--------+ | total | +--------+ | 452979 | +--------+ 1 row in set (0.38 sec) mysql> SELECT COUNT(user_id) AS total FROM driaw_users WHERE `user_id` > -1 ; +--------+ | total | +--------+ | 452979 | +--------+ 1 row in set (2.26 sec)
One of the things we learn from using COUNT is that it is slow when put against large tables. The best way to keep statistical information for anything we want to count is to keep a record of it as a number in the config table instead. However this requires accuracy in making sure that the count is kept consistent with the changes made. It will be initially hard to convert everything over to use this statistical count method but atleast there is some indication to tell you whether or not you continue to accurately make these count changes. And that is to compare this method to the old method and see if there are any differences.
For example, if I deleted a user, the user count will decrease by 1.
includes/functions.phpJLA wrote:What file is this in?da_badtz_one wrote:
The following query is a query you'll find when trying to find the total count of every user in your database to provide statistical information. To avoid having to rewrite the whole user counting process to feature what phpBB3 uses I used this nifty tweak I found.
Thank YouDog Cow wrote:includes/functions.phpJLA wrote:What file is this in?da_badtz_one wrote:
The following query is a query you'll find when trying to find the total count of every user in your database to provide statistical information. To avoid having to rewrite the whole user counting process to feature what phpBB3 uses I used this nifty tweak I found.
Right at the top of the file (or very near the top).
Does anyone know where Lanzer actually put the code for this?Memberlist.php
-- Instead of the select statement that selects all users info, an index of user_id is first built. The hack is very similar to the view topic hack on page 1
Viewforums.php
-- Same case, a list of topic id is built before grabbing all the topic information in the big select statment with all the table joins
-- Similar to the viewtopic hack, an index with two keys (forum id and post time) is build to speed up the search when building the list of topic id index
Posting.php
-- There is a statement there to grab the most recent messages within the thread, it needs to have the same hack for viewtopic.php applied to it
-- I might be wrong, but the statement for grabbing vote id and vote description does not require the GROUP BY statement. Taking it away sped up the query from 1 minute to 1 second
ThanksSorry for the crudeness of the information. I'll try to find time to better list all the changes once I find some free time. ^^;