[2.0.x] Tweaks for large forums

The 2.0.x discussion forum has been locked; this will remain read-only. The 3.0.x discussion forum has been renamed phpBB Discussion.
lanzer
Registered User
Posts: 152
Joined: Wed Oct 10, 2001 10:00 am
Contact:

Post by lanzer » Sat Nov 08, 2003 2:41 pm

Thanks to the faster database server I was recently able to troubleshoot the forum as the database had been hammered under heavy but not show stopping loads. From a series of investigations and optimizations I finally ironed out almost every single slow query in phpbb, and now the entire forum is running really smooth even with more than 2700 users on-line. :D

Work had been really busy and it'll be hard to lay out exactly which queries and what changes I made during that time, but I did take notes on the changes, and here is a summary of the queries that I went through to avoid any instance of creating a tmp table. I believe that if phpBB has all these queries streamlined, it can be up there to be one of the enterprise grade forum software that will not break a sweat as the post count goes beyond millions of posts. :)

Memberlist.php
-- Instead of the select statement that selects all users info, an index of user_id is first built. The hack is very similar to the view topic hack on page 1

Viewforums.php
-- Same case, a list of topic id is built before grabbing all the topic information in the big select statment with all the table joins
-- Similar to the viewtopic hack, an index with two keys (forum id and post time) is build to speed up the search when building the list of topic id index

Posting.php
-- There is a statement there to grab the most recent messages within the thread, it needs to have the same hack for viewtopic.php applied to it
-- I might be wrong, but the statement for grabbing vote id and vote description does not require the GROUP BY statement. Taking it away sped up the query from 1 minute to 1 second

Privatemsg.php
-- Every statement involving the OR statement had to be seperated and joined with UNION. Similar to the following:

Code: Select all

BEFORE:

Query:	SELECT COUNT(*) AS total
	FROM bb_privmsgs WHERE ( ( privmsgs_to_userid = 3 AND privmsgs_type = 3 )
	OR ( privmsgs_from_userid = 3 AND privmsgs_type = 4) )

Elapsed time: 60.373462080956

After:
Query:	(SELECT COUNT(*) AS total
	FROM bb_privmsgs  WHERE privmsgs_to_userid = 3 AND privmsgs_type = 3 ) UNION (SELECT COUNT(*) AS total
	FROM bb_privmsgs  WHERE privmsgs_from_userid = 3 AND privmsgs_type = 4 ) 

Elapsed time: 0.085253000259399

Another example:

Query:	(SELECT pm.privmsgs_type, pm.privmsgs_id, pm.privmsgs_date, pm.privmsgs_subject, u.user_id, u.username
	FROM bb_privmsgs pm, bb_users u WHERE u.user_id = pm.privmsgs_from_userid AND pm.privmsgs_to_userid = 3 AND pm.privmsgs_type = 3 ) UNION (SELECT pm.privmsgs_type, pm.privmsgs_id, pm.privmsgs_date, pm.privmsgs_subject, u.user_id, u.username
	FROM bb_privmsgs pm, bb_users u WHERE u.user_id = pm.privmsgs_from_userid AND pm.privmsgs_from_userid = 3 AND pm.privmsgs_type = 4 )  ORDER BY pm.privmsgs_date DESC LIMIT 0, 40

Elapsed time: 0.54375398159027
The statements used to find oldest private messages, statements with count(*) needs to be handled differently since the query now yields 3 rows of data instead of one. Funny enough, that allows me to find out which private message status the oldest message is in, and a faster query could be made for deleting the oldest private message.

I have a feeling that the private messages table can probably afford one more entry to define which mailbox it is in so that we can have much faster queries that wouldn't even require the UNION statements. That's something left for the back burner.

Another change that was made was that a lot of the rather static data such as themes, forum permissions, bb_code and category/forum listing (for making jumpbox) were all cached as a php file and automatically updated every few minutes (or hours, for stuff like the themes). The basic concept is to grab the data from the database as uusal, then performing serialize to make the data into a string, and storing it into a two dimentional array with a timestamp. The result is stored into a php file and will then be included on the next execution. An example is listed below:

Code: Select all

      @include("includes/cache_smilies.php");
      if ( empty($savedata['time']) || $savedata['time'] < ($board_config['time_now'] - 3600) ) {
          $cache_update = TRUE;
          $smilies = array();
      } else {
          $smilies = $savedata['smilies'];
      }

      if ( empty($smilies) ) {
          $sql = 'SELECT SQL_CACHE code, smile_url FROM ' . SMILIES_TABLE;
          if( !$result = $db->sql_query($sql) )
          {
             message_die(GENERAL_ERROR, "Couldn't obtain smilies data", "", __LINE__, __FILE__, $sql);
          }
          $smilies = $db->sql_fetchrowset($result);
      }

      if (isset($cache_update)) {
          $savedata = array();
          $savedata['time'] = $board_config['time_now'];
          $savedata['smilies'] = $smilies;
          $save_string = addslashes(serialize($savedata));
          $file_text = '<? $savedata = unserialize(stripslashes(\''.$save_string.'\')); ?>';
          $handle = fopen ("includes/cache_smilies.php", "w");
          fwrite($handle,$file_text);
          fclose($handle);
      }
This is what I use to cache all my smilies. As smilies are called upon on every message view, having it cached save me from sending tons of queries to the database server. It's especially nice for the jumpbox because the data cannot be cached by the MySQL server (running MySQL version 4+) since the forum table is updated constantly.

Instead of the cache blindly being updated every x number of minutes, I can probably give a "bb_code last update" field to board_config so that the cache only updates when someone makes changes to the themes table. Should also port this code to a class when I have time.

Sorry for the crudeness of the information. I'll try to find time to better list all the changes once I find some free time. ^^;

Cheers,

User avatar
SHS`
Former Team Member
Posts: 6615
Joined: Wed Jul 04, 2001 9:13 am
Location: Yellow Beach, Nine Dragons, Hong Kong
Name: Jonathan Stanley
Contact:

Post by SHS` » Sat Nov 08, 2003 3:18 pm

lanzer wrote: Thanks to the faster database server I was recently able to troubleshoot the forum as the database had been hammered under heavy but not show stopping loads. From a series of investigations and optimizations I finally ironed out almost every single slow query in phpbb, and now the entire forum is running really smooth even with more than 2700 users on-line. :D


I assume that explains why the fora wasn't being pruned at about 4 million posts and why it's grown to over 8 million? Is it going to keep growing? :mrgreen:
Jonathan “SHS`” Stanley • 史德信
Image

lanzer
Registered User
Posts: 152
Joined: Wed Oct 10, 2001 10:00 am
Contact:

Post by lanzer » Sat Nov 08, 2003 9:20 pm

SHS` wrote:
lanzer wrote:Thanks to the faster database server I was recently able to troubleshoot the forum as the database had been hammered under heavy but not show stopping loads. From a series of investigations and optimizations I finally ironed out almost every single slow query in phpbb, and now the entire forum is running really smooth even with more than 2700 users on-line. :D


I assume that explains why the fora wasn't being pruned at about 4 million posts and why it's grown to over 8 million? Is it going to keep growing? :mrgreen:


Actually the changes mentioned on this page were only applied a few days ago. Before that the only hack that was made was the viewtopic hack, but that was good enough at that time to let the board run at reasonable speed.

One interesting thing about the slower queries is that the slower responses kind of acts as a rate limiter. We were pulling about 10 Meg per second before the overhaul, and yesterday we were at 16 Megs due to the new speed. 8O It's kind of like a curse in a blessing. ^^;

Here's the bandwidth usage graph if you're interested

We've been purning messages left and right, but we just can't prune messages that are in the role playing threads, so that's little we can do... It's just going to keep growing. :o I'm imagining if we never pruned, we'd be at 12 million posts by now. That's just scary.

The fun never ends!

aharris
Registered User
Posts: 9
Joined: Fri Sep 19, 2003 10:18 am
Location: London, UK
Contact:

sooper-stoopid question about searching medium size forums

Post by aharris » Thu Nov 13, 2003 2:26 pm

Hi Lanzer - firstly thanks for the great work you've put in to help the rest of us get to grips with some of these issues. :D I am addressing you as 'de facto guru' but if anyone else has input on my q's, would love to hear it...

I'm most way through setting up my new phpbb forum and am migrating a few thousand posts from a previous type of forum. I've been working like a dog to customize my forum and have as a result learned a bit of php, sql etc though am no expert. I understand the concept of indices, and have a reasonable understanding of how the whole phpbb search function works.

As a background assumption, I am running the forum on a host (Claranet in the UK) with a pretty good spec in terms of hardware etc (I assume this is true as they have won various awards etc) - my questions are:

1) Ever since I have started working with phpBB (which in general I love) I noticed that the search function, in my opinion, has a major defect :cry:, and I later found out why. I am referring to the fact that it is impossible (as far as I can find) to do an 'exact phrase'-type search, such as you might use in Google - ie to find "phpbb forum" rather than phpbb+forum. Looking at the sort of volumes which you get on your impressive board I can understand why the restrictive methodology used by phpbb comes into its own.

However, I personally am convinced that not to be able to do an exact phrase search is a major no-no and am considering re-writing the search function to allow it - in other words it wouldn't use the search word indices. Although there would be some (probably fairly major) penalty in performance, I think users would be prepared to wait an extra 5 seconds for example to get the right result. After all, a fast search is only a good thing if it helps you find what you are looking for. Now with the volumes which you have this would almost certainly be unthinkable and people would have lived, paid taxes and died by the time the results were returned... However, the 'natural size' for my forum is unlikely to exceed 100,000 posts at any one time. The question: how insane would I be to try this out in your opinion? Has it a hope of working satisfactorily? No doubt it would work in the early stages, but I don't want to be running into major problems at say 50,000 posts...

2) have you heard of anyone who has tried to mod or hack the search function in the way I describe?

:D Thanks in advance for any wisdom you can share with me on this one... I don't want to spend ages working on something that it is a dead duck at the start.

lanzer
Registered User
Posts: 152
Joined: Wed Oct 10, 2001 10:00 am
Contact:

Re: sooper-stoopid question about searching medium size foru

Post by lanzer » Fri Nov 21, 2003 1:17 am

Hello Aharris:

The search feature had always been a very interesting subject by itself. It's useful, especially for sites that has to deal with exchanging information than a conversational based forum like ours. At the same time searching is the most resource exhausting feature. The search feature is something left in the back burner for me, but I've always been thinking of ways to enhance it for the sheer fact that I'm interested in the subject.

The phrase match feature is entirely possible in theory. The first step to approaching the feature would be done by using the existing search class to find all the posts that contain every word in the search phrase. Once the post_id's have been collected, then ask MySQL to find the posts that matches the phrase entered.

The catch is that you don't want such a task to be handled on the main database server. You would want a second server to perform the task of searching. As the difference between accessing RAM and accessing swap space is heaven and hell, a forum just cannot allow single queries that can take more than a couple of seconds to complete. That's speaking in terms of an active forum where the database is handling more than a hundred queries per second. (wild guess) I think if the forum is not that active, such a feature can be feasible without the need of a dedicated search database. In fact, text search should just take up a lot of CPU time and shouldn’t have the need to swap, so it shouldn’t be all that bad. I wouldn’t know as I’ve never done it before.

But before I go on any furthur, may I suggest that you allow google to crawl through your site, and just change your search feature to allow a google search, then forward the search phrase to google as:

Code: Select all

"<search phrase>" site:<your-site-name.com>
It's the pain free way to get phrase searches. :) For a live example, visit www.mandrakelinux.com and use their search feature on the top right corner of their page.

To actually implement your suggested feature into phpBB, you will want the search function to find any double quotes before stripping them away, and save the phrase as a separate variable before the search begin. After the search is complete, intercept the routine while you have the list of post_id's that contains all the keywords. Now run a query similar to the extent of

Code: Select all

$sql = "SELECT post_id from phpbb_posts_text WHERE post_id in ($list_of_post_id_comma_seperated) AND post_text like '%$search_phrase%'";
and you'll be left with posts which matches the phrase as entered. In practicality you can't control what the user decides to search as a phrase, so you'll need to force the search class to limit your search if the search words are too common and end up matching a few thousand posts or much more. It all depends on how active your forum is for you to determine that threshold. If the phrase search feature is too resource intensive, then just lower the threshold to only scan through the latest 1000 matches for the search phrase requested, etc. The implementation is easy and it allows a fallback to avoid clogging up the server, so it should be something worth trying.

I've just ordered a second database server this week. Which will allow me to push all the slow queries (primarily search functions) to the second server. I can actually try the practicality of having phrase searches without compromising the speed of the site later when I have some time ^_^.

User avatar
shekinahs
Registered User
Posts: 81
Joined: Sat Oct 18, 2003 5:50 am
Location: USA

Post by shekinahs » Fri Nov 21, 2003 4:13 am

jk1 wrote: I think he was for real, here is his forum: http://ian.go-gaia.com/forum/index.php


I had to go to the site just to believe it. 143,190 registered users 8O

2,779 ONLINE :!: That's a whole lot of folks. How many moderators does it take to manage a board that size? I still can't believe those numbers. Good goobly.
~Everything is in the eye of the beholder. ~

aharris
Registered User
Posts: 9
Joined: Fri Sep 19, 2003 10:18 am
Location: London, UK
Contact:

sooper-stoopid question about searching medium size foru

Post by aharris » Fri Nov 21, 2003 10:09 pm

Hi Lanzer, thanks for the input - I'm going to keep it all in mind as I whack through my current iteration and on file for later as well. My forum isn't huge yet, and the search facility is absolutely critical to it so I'm trying a sloppy interim hack but keeping the info you've given me in mind as I go.

Since for the foreseeable future most people's searches are not likely to return more than a few hundred matches, I have implemented the phrase matching capability using just php within the search results which sql returns using the native phpbb search mechanism - in other words, a 'second pass' - phpbb does it's search, then I do a bit of mucking about with parsing the search phrase and building a mini boolean engine in php to remove those results from the array which aren't 'real' results (in terms of taking phrase matching into account) - it seems to be working very well at the moment, but I'm not finished.

Absolutely vital for me has been the disabling of the common words filter - my visitors will want to be able to search on them, common or not - my experience is the most important words on the forum tend to get marked as common - makes sense but completely the reverse of what I need.

Anyway, thanks again for your help - I look forward to having the sort of scalability problems you have to grapple with! :wink:

aharris
Registered User
Posts: 9
Joined: Fri Sep 19, 2003 10:18 am
Location: London, UK
Contact:

sooper-stoopid question about searching medium size foru

Post by aharris » Sat Nov 22, 2003 3:00 am

==============
Hi anyone reading, ref anyone looking at my hack below - it has been heavily edited since I orignally posted, but I am now happy it works well. It allows you to support full-phrase searching (vital on my forum) in conjunction with all the boolean functionality. There is an extra bit not printed here which also turns every search word into a wildcard search on both sides (i.e. looking for 'time' becomes in the code *time* to match sometime, times, timely etc) - to my understanding this is closer to a Google search which I am trying to simulate.


==============

Hi Lanzer, update on my eariler post - seems I spoke too soon without really knowing what was going on... :?

Anyhow, I clicked that because the phpbb generated search results are just post_ids I would have to do 2 queries - the first to get up the post texts of the initial results so that I could do some php-based phrase searching and whittle down the post_id results array and then proceed with the second one as it exists - to generate the final results set. I had to do this early enough in the script for all the housekeeping to get done, like making sure the search_results table was correctly updated.

Anyway, it's done now and seems to work very nicely. For anyone who wants to rip the script, here it is. It may need some final tweaking but broadly it seems to work properly... :D For some reason the tab indendations don't seem to show up in the post :(

First find the line in search.php which is:

$result_list = array();


INSERT the following after that line..



Code: Select all

// AHARRIS HACK STARTS HERE - it's only a hack on an 'as is' basis, but hope it helps someone - indents don't match cos whole section was copied from below

			
			//AH HACK PT 1 STARTS HERE - this top bit is making sure anything in quotes is ignored for a NOT
 
			$quotestore = array();
			
			if (strpos($search_keywords, '\\"') > -1)
			{
	
				if (substr_count($search_keywords, '\\"') % 2 != 0)
				{
					message_die('GENERAL_ERROR', '<B>There is an error in your search phrase..</B><BR><BR>Please check your quote marks ( " ) in your phrase search - there must <BR>be an even number of them, please click your back button to go back');
				}
	
				$offset = 0;
	
				$v = 0;
				
				while (strpos($search_keywords  , '\\"') > -1)
				{				
					{
						
						$startpos = strpos($search_keywords  , '\\"');
						$endpos = strpos($search_keywords  , '\\"', $startpos +2);

if ($startpos > -1 && $endpos > -1)
						{	
							$v++;
							$quotestore[] = substr($search_keywords, $startpos + 2, $endpos - $startpos - 2 ); 
		
							$search_keywords = substr_replace ($search_keywords , '||quot' . $v . '||', 
							$startpos, $endpos -$startpos + 2);
							
						}
						$startpos = -1;
						$endpos = -1;
					}
					
				}		
			}

$search_keywords = strtolower($search_keywords);
			
			$literalsearches = explode( ' ', $search_keywords );
	
			$split_search = array();
	
			$check = 1;
	
			for ($w = 0; $w < count($literalsearches); $w++)
			{
				if ($w > 0) //don't want it to test unless there could be a valid not before it, ie in pos 0
				{
					if (($w < count($literalsearches)) && (trim(strtolower($literalsearches[$w])) == 'not'))
					{
						if (stristr($literalsearches[$w + 1], '||quot') !== false)
						{
							//don't add it into phpbb's main search function (it'll exlcude all words not as a phrase)
							//omit this serach word if it's the not operator with a phrase after it
						}
						else
						{
							$addtoarr = true;
						}	
					}
	
					else if ((stristr($literalsearches[$w], '||quot') !== false) && (trim(strtolower($literalsearches[$w - 1])) == 'not'))
					{
	
						//don't add it into phpbb's main search function (it'll exlcude all words not as a phrase)
						//omit this search word if it's a phrase and the not operator precedes it		
					}
					else
					{
						$addtoarr = true;
					}
				}
		
				else
				{
					$addtoarr = true;
				}

if (trim($literalsearches[$w]) == '||quot' . $check . '||')
				{
				
					$literalsearches[$w] = $quotestore[$check-1];
					if (strlen($literalsearches[$w]) > 1)
					{
						$allowshortword = true;
					}
					$check++;
				}

$literalsearches[$w] = strtolower($literalsearches[$w]);
	
				if ($addtoarr === true)
				{
					$splitwordsarr = array();
	
					$splitwordsarr =( !strstr($multibyte_charset, $lang['ENCODING']) ) ?  split_words(clean_words('search', stripslashes($literalsearches[$w]), $stopword_array, $synonym_array), 'search') : split(' ', $literalsearches[$w]);
		
					for($mm = 0; $mm < count($splitwordsarr); $mm++)
					{
						if  ((strlen($splitwordsarr[$mm]) > 2) || ( $allowshortword == true))
						{
						
							$split_search[] = $splitwordsarr[$mm];	
							$allowshortword = false;
						}
					}
	
					$addtoarr = false;
				}

}

//AH HACK PT1 ENDS HERE
[/color]


now find the line further down which says:
$search_results = implode(', ', $search_ids);


and insert the following below it

Code: Select all

		// AH HACK PT 2 STARTS HERE 

		//
		// Look up data ...
		//
	
		if ( $search_results != '' & strpos($search_keywords   , '||quot') > -1) //we only do this if there is a phrase match
		{		
	
			if ( $show_results == 'posts' )
			{
				$postortopic = 'p.post_id';
			}
			else
			{
				$postortopic = 'p.topic_id';
			}
			
				$sql = "SELECT pt.post_text, pt.bbcode_uid, pt.post_subject, p.*, f.forum_id, f.forum_name, t.*, u.username, u.user_id, u.user_sig, u.user_sig_bbcode_uid  
					FROM " . FORUMS_TABLE . " f, " . TOPICS_TABLE . " t, " . USERS_TABLE . " u, " . POSTS_TABLE . " p, " . POSTS_TEXT_TABLE . " pt 
					WHERE $postortopic IN ($search_results)
						AND pt.post_id = p.post_id
						AND f.forum_id = p.forum_id
						AND p.topic_id = t.topic_id
						AND p.poster_id = u.user_id";

$per_page = ( $show_results == 'posts' ) ? $board_config['posts_per_page'] : $board_config['topics_per_page'];
	
			$sql .= " ORDER BY ";
			switch ( $sort_by )
			{
				case 1:
					$sql .= ( $show_results == 'posts' ) ? 'pt.post_subject' : 't.topic_title';
					break;
				case 2:
					$sql .= 't.topic_title';
					break;
				case 3:
					$sql .= 'u.username';
					break;
				case 4:
					$sql .= 'f.forum_id';
					break;
				default:
					$sql .= 'p.post_time';
					break;
			}

$sql .= " $sort_dir LIMIT $start, " . $total_match_count; 
	
			if ( $show_results == 'posts' )
			{
				$sql2 .= " ORDER BY p.topic_id LIMIT $start, " . $total_match_count; 
			}
			
			//we want all results returned for matching
	
			if ( !$result = $db->sql_query($sql) )
			{
				message_die(GENERAL_ERROR, 'Could not obtain search results', '', __LINE__, __FILE__, $sql);
			}
	
			$searchset = array();
			while( $row = $db->sql_fetchrow($result) )
			{
				$searchset[] = $row;
			}
			
			$db->sql_freeresult($result);	

$search_ids = array();

for($z = 0; $z < count($searchset); $z++)
			{
					
				$stringtosearch = $searchset[$z]['post_subject'] . ' ' . $searchset[$z]['post_text'];

$keepthisresult = false;
	
				for ($t = 0; $t < count($literalsearches); $t++)
				{
					
					if (trim($literalsearches[$t]) == 'or'  || trim($literalsearches[$t]) == 'and'  ||  trim($literalsearches[$t]) == 'not' 
						|| trim($literalsearches[$t]) == '')
					{

//ignore this search word as its an operator
					}
	
					else if ((stristr($stringtosearch, trim($literalsearches[$t])) !== false) &&
						($lastsearchphrasekept == true) &&
						(trim($literalsearches[$t-1]) == 'and'))
					{	
						
						if (count($literalsearches) > $t)
						{
							if (trim($literalsearches[$t+1]) != 'and')
							{
								$keepthisresult = true;	
							}
						}
						else
						{
							$keepthisresult = true;	
						}
	
						$lastsearchphrasekept = true;
						// it's a match, the last one was a match and there is an AND
					}
	
					else if ((stristr($stringtosearch, trim($literalsearches[$t])) !== false) &&
						($lastsearchphrasekept == false) &&
						(trim($literalsearches[$t-1]) == 'and'))
					{				
						
						$keepthisresult = false;	
						$lastsearchphrasekept = false;
						// it's a match, the last one was a match and there is an AND
					}
	
					else if ((stristr($stringtosearch, trim($literalsearches[$t])) === false) &&
						($keepthisresult == false) &&
						(trim($literalsearches[$t-1]) == 'and'))
					{				
						
						$keepthisresult = false;
						$lastsearchphrasekept = false;
						// it's NOT a match, there's no prior match, and we have NOT fulfilled an AND condition
					}

else if ((stristr($stringtosearch, trim($literalsearches[$t])) !== false) &&
						(trim($literalsearches[$t-1]) == 'not'))
					{				
						
						$keepthisresult = false;
						$lastsearchphrasekept = false;	
						break;
						// it's matches on a NOT condition whicih overrides everything so break out
					}
	
					else if ((stristr($stringtosearch, trim($literalsearches[$t])) !== false))
					{				

if (count($literalsearches) > $t)
						{
							if (trim($literalsearches[$t+1]) != 'and')
							{
								$keepthisresult = true;	
							}
						}
						else
						{
							$keepthisresult = true;	
						}			
	
						$lastsearchphrasekept = true;
						// other conditions tested, it's a match so keep it in
					}
	
					else if ((stristr($stringtosearch, trim($literalsearches[$t])) === false))
					{
						//it's not a match has no impact on keepthisresult
						
						$lastsearchphrasekept = false;	
									
					} 
					else
					{
						//catch all for de-bugging
					}

}
	
				if ($keepthisresult === true)
				{
					
					if ( $show_results == 'posts' )
					{
						$search_ids[] = $searchset[$z]['post_id'];
					}
					
					else
					{
						$topicid = array_search($searchset[$z]['topic_id'], $search_ids);
						if ( $topicid === false)
						{
							$search_ids[] = $searchset[$z]['topic_id'];
						}
					}
				}	
	
			}

$total_match_count = count($search_ids);
		
			$search_results = implode(', ', $search_ids);

}
	
		//AH HACK PT2 ENDS HERE
		
//AHARRIS hack ends here[/color]
Last edited by aharris on Mon Dec 01, 2003 3:24 pm, edited 2 times in total.

SocomVengeance.com
Registered User
Posts: 27
Joined: Wed Oct 29, 2003 1:04 am
Location: Maryland
Contact:

Post by SocomVengeance.com » Sun Nov 23, 2003 2:53 am

Yeah his site loads slow for me basicky cuz of the images.

lanzer
Registered User
Posts: 152
Joined: Wed Oct 10, 2001 10:00 am
Contact:

Post by lanzer » Mon Nov 24, 2003 7:46 am

shekinahs wrote: 2,779 ONLINE :!: That's a whole lot of folks. How many moderators does it take to manage a board that size? I still can't believe those numbers. Good goobly.


We have a little less than 30 moderators right now. We add new moderators to our team every month by word of mouth from our existing moderators. We then invite any candidates that has no objecting votes. With our growth rate we're definitely understaffed. Wish there's more I can do but I only have one pair of hands. :)

lanzer
Registered User
Posts: 152
Joined: Wed Oct 10, 2001 10:00 am
Contact:

Gaia's my.cnf file

Post by lanzer » Mon Nov 24, 2003 11:27 am

A user recently PM'ed me asking for my MySQL config file, so I figured I might as well list it here and include some remarks which might be useful for others:

Code: Select all

# for use in go-gaia.com
#
# This config file is derived from my-huge.cnf, with inno-db enabled and
# some tweaks for a database that composes of a hundred or so MyISAM tables
# and 40 or so InnoDB tables.
#
# With one hard drive for Linux, one for holding MyISAM tables and one for InnoDB tables
#

[client]
port		= 3306
socket		= /var/lib/mysql/mysql.sock

# The MySQL server
[mysqld]

# log-slow-queries=/home/mysql/slow.log

## Needed when using raw partition for InnoDB
# user=root

## This resides on the hard drive which holds MyISAM tables
datadir=/var/lib/mysql

set-variable  = autocommit=0
port            = 3306
socket          = /var/lib/mysql/mysql.sock
skip-locking

## Wordmatch table alone has a 150M index, the rest added together is about 250M
set-variable    = key_buffer=300M
set-variable    = max_allowed_packet=1M

## table cache is large because Gaia has quite a bit of MyISAM tables
set-variable    = table_cache=5000
set-variable    = sort_buffer=1M
set-variable    = join_buffer=128K
set-variable    = net_buffer_length=16K
set-variable    = myisam_sort_buffer_size=6M
# log-bin
server-id       = 1
set-variable = max_connections=800
set-variable = max_user_connections=800

## query cache is rather over-rated.  As most tables are updated frequently, only few queries can benefit from it.
## queries that can benefit from query cache are now cached locally on the web server size already
set-variable = query_cache_size=4M
set-variable = query_cache_type=2
set-variable = wait_timeout=120

set-variable = thread_concurrency=4

## thread cache size is important, be sure to set it
set-variable = thread_cache_size=300

# Point the following paths to different dedicated disks
tmpdir          = /home/mysql/temp/

## Be sure to use the "noatime" and "nodiratime" parameters on /etc/fstab
innodb_data_home_dir = /mnt/raid/
innodb_data_file_path = ibdata1:2000M;ibdata2:2000M;ibdata3:500M:autoextend

## This was used when raw partition was being used.  There used to be a performance increase when hardware RAID was not setup
# innodb_data_home_dir =
# innodb_data_file_path = /dev/sdb1:20Graw

innodb_log_group_home_dir = /home/mysql/
innodb_log_arch_dir = /home/mysql/

## Database server has 4G of RAM
set-variable = innodb_buffer_pool_size=1800M
set-variable = innodb_additional_mem_pool_size=20M

# Set .._log_file_size to 25 % of buffer pool size
set-variable = innodb_log_file_size=150M
set-variable = innodb_log_buffer_size=8M

## Still experimenting with the flush method, seeing no difference in performance
# innodb_flush_method=O_DSYNC

## This is a must once you had finished tweaking the settings
innodb_flush_log_at_trx_commit=0

## Not necessary but recommended
set-variable = innodb_lock_wait_timeout=5
set-variable = max_connect_errors=10000

lanzer
Registered User
Posts: 152
Joined: Wed Oct 10, 2001 10:00 am
Contact:

Avoid losing message when session expired

Post by lanzer » Mon Nov 24, 2003 11:49 am

By default, user is forwarded to the login screen if the user's session expired (1hour+) during a long post. It can be quite fustrating as the post data is lost during the forwarding process.

Ideally, if the login process is a class or function, then posting.php can be modified to log the user in while the post data is preserved. But anyways, as a quick hack to posting.php, what I've done is to replace:

Code: Select all

        redirect(append_sid("login.$phpEx?redirect=posting.$phpEx&" . $redirect, true));
with:

Code: Select all

    $l_link = '<A HREF="'. append_sid("login.$phpEx") .'" TARGET="_blank">here</A>';
    $close_login_box = TRUE;
    $error_msg = "This forum requires you to log in before posting.  Please click $l_link to login, then re-submit your post.<BR>";
This way the user will be able to open up a new window to login and have the original post retained. We have a login box in the header which we want to disable, and that's what the close_login_box variable does. A second note is that the posting.php file does not show an error message on session_id mis-match. Meaning the following code is disabled:

Code: Select all

              message_die(GENERAL_MESSAGE, 'Invalid_session');
This change was made after being fustrated at losing an hour long message. I suppose most users have auto-login anyways. :roll: (being an admin I don't let myself use such features)

PS - Private messaging can be modified to use this function also, with the addition of setting the $error variable to TRUE.

R45
Registered User
Posts: 2830
Joined: Tue Nov 27, 2001 10:42 pm

Post by R45 » Wed Nov 26, 2003 2:57 pm

For those with PostgreSQL, in functions_search.php for the remove_search_post() function, I highly suggest modifying the code so it doesn't use the 3 level Sub select. This query took about 30 seconds to execute on a heavy traffic board I run. Instead, use the MySQL query, where the words are select in a separate query and deleted in another. To do this, just add case 'postgresql': under the case 'mysql': and case 'mysql4':

Code: Select all

DELETE FROM phpbb_search_wordlist
				WHERE word_id IN ( 
					SELECT word_id 
					FROM phpbb_search_wordmatch 
					WHERE word_id IN ( 
						SELECT word_id 
						FROM phpbb_search_wordmatch 
						WHERE post_id IN (58636) 
						GROUP BY word_id 
					) 
					GROUP BY word_id 
					HAVING COUNT(word_id) = 1
				)

Execution Time: 30.915371 seconds
is now

Code: Select all

SELECT word_id 
					FROM phpbb_search_wordmatch 
					WHERE word_id IN (883, 1310, 1602, 2700, 2852, 3264, 4560, 6167, 11976, 12196, 19516, 21602, 24747, 25418) 
					GROUP BY word_id 
					HAVING COUNT(word_id) = 1


Execution Time: 8.063049 seconds

Query:

DELETE FROM phpbb_search_wordmatch  
		WHERE post_id IN (58636)


Execution Time: 0.009335 seconds
Still slow, but much better that 30 seconds. Sub selects are nice and all, but not always the best for performance. I wouldn't like to have to imagine that query before I pruned the forums completely, when there were 1,100,000 posts 8O

InFamous
Registered User
Posts: 2
Joined: Wed Nov 26, 2003 10:00 pm

Post by InFamous » Wed Nov 26, 2003 10:06 pm

Hi, I am InFamous from go-gaia.com. can I become a mod there? I am very helpful

arhodes16
Former Team Member
Posts: 1034
Joined: Sun Mar 10, 2002 5:28 pm

Post by arhodes16 » Wed Nov 26, 2003 11:06 pm

InFamous wrote: Hi, I am InFamous from go-gaia.com. can I become a mod there? I am very helpful

Asking there would be the sensible thing to do - this has nothing to do with these forums or this topic. Please refrain from posting in this topic unless you have something useful to add.

Locked

Return to “2.0.x Discussion”