[INFO] How Google PHPBB!

A place for MOD Authors to post and receive feedback on MODs still in development. No MODs within this forum should be used within a live environment! No new topics are allowed in this forum.

On February 1, 2009 this forum will be set to read only as part of retiring of phpBB2.
R. U. Serious
Registered User
Posts: 830
Joined: Mon Feb 11, 2002 2:07 pm

Post by R. U. Serious »

OK, I think I've got a better idea. The change in page_header mentioned above by netclectic will only work for counting the number of online guests.

Instead, the following change will (hopefully) stop sessions from being created for Googlebot at all. In sessions.php find

[cut] Wait, not a good idea. Let me think about this some more...
If it gets no session at all (which was my first idea), it will not index some pages, e.g. those with votes in them (and if you have a vote on your index, it won't see anything). It might have other side effects, too.
So I am thinking about another way...
Last edited by R. U. Serious on Sun Sep 01, 2002 9:57 pm, edited 1 time in total.
netclectic
Former Team Member
Posts: 4439
Joined: Wed Mar 13, 2002 3:08 pm
Location: Omnipresent

Post by netclectic »

Cool! This looks more promising. 8)

Cheers!
fishfreek
Registered User
Posts: 695
Joined: Tue May 14, 2002 3:05 pm
Location: Virginia

Post by fishfreek »

OK, now I am confused.

What code do I need to add, and where, to keep phpBB from recording each and every Googlebot visit as a new guest?
netclectic
Former Team Member
Posts: 4439
Joined: Wed Mar 13, 2002 3:08 pm
Location: Omnipresent

Post by netclectic »

R. U. Serious wrote: [cut] Wait, not a good idea. Let me think about this some more...
If it gets no session at all (which was my first idea), it will not index some pages, e.g. those with votes in them (and if you have a vote on your index, it won't see anything). It might have other side effects, too.
So I am thinking about another way...


I'm currently using the method you posted here without any apparent problems, and it definitely seems to be the best answer to date: no hundreds of entries in the sessions table for Google, no spurious guests appearing in Who's Online, and Google appears to be indexing my site :D

Admittedly I don't have any polls on my index page. Have you discovered any other side effects?

Anyway, I look forward to hearing what you come up with!
hijacker
Registered User
Posts: 311
Joined: Sat Jan 05, 2002 2:45 pm
Location: Slovenia

Post by hijacker »

Very interesting topic... But I can't find exactly what needs to be done to let Google index all the topics in the forum.

Can someone please sum it up in a single post?

Thanks a lot!
icehousedesigns
Registered User
Posts: 52
Joined: Mon Mar 04, 2002 6:36 pm
Location: 127.0.0.1

Post by icehousedesigns »

Put the code from the first post into sessions.php as stated there, and save. Of course, Googlebot will only crawl the forum if it finds it. So make sure your forum is linked from your main page and across the rest of your site. And of course that will only work if Google indexes your site on a monthly basis as well.
netclectic
Former Team Member
Posts: 4439
Joined: Wed Mar 13, 2002 3:08 pm
Location: Omnipresent

Post by netclectic »

To submit your site to Google - http://www.google.com/addurl.html
hijacker
Registered User
Posts: 311
Joined: Sat Jan 05, 2002 2:45 pm
Location: Slovenia

Post by hijacker »

Got it... Thanks... Will try it ASAP.
icehousedesigns
Registered User
Posts: 52
Joined: Mon Mar 04, 2002 6:36 pm
Location: 127.0.0.1

Post by icehousedesigns »

Looks like your site is already indexed by Google, so just applying the patch to sessions.php should do the trick.

http://216.239.39.100/search?sourceid=n ... ash.com%2F
TC
Former Team Member
Posts: 3633
Joined: Tue Sep 25, 2001 7:23 pm
Location: Kµlt °ƒ Ø, working on my time machine

Post by TC »

hi all -

ok, so it seems we have two or three people who have worked this out since this thread started. can any/all of you come up with some definitive documentation for this? PM me.
Techie-Micheal
Security Consultant
Posts: 19511
Joined: Sun Oct 14, 2001 12:11 am
Location: In your servers

Post by Techie-Micheal »

lars_msh wrote: OK thanks, I'll go along with that... and save the paranoia for another day! :D


lol. :D I was wondering that myself so you aren't alone. :)
R. U. Serious
Registered User
Posts: 830
Joined: Mon Feb 11, 2002 2:07 pm

Post by R. U. Serious »

TC wrote: hi all -

ok, so it seems we have two or three people who have worked this out since this thread started. can any/all of you come up with some definitive documentation for this? PM me.


Hi TC, to achieve what is posted in the first post of this thread, apply the mod I posted there (3rd post in this thread: http://www.phpbb.com/phpBB/viewtopic.ph ... 214#193214 )

If you also want to restrict Google to a single session, do the following. This will prevent Google from appearing numerous times on the guest/who-is-online list (and record).

Code:

################################################################# 
## MOD Title: GoogleSingleSession (Add-On to enhance-google-indexing )
## MOD Author: - R. U. Serious 
## MOD Description: This MOD will give all 'guests' where the useragent
##		    contains 'Googlebot' one session (static session_id)
##		    Hence it will only appear as a single guest.
##
## MOD Version: 0.9
## 
## Installation Level: (easy) 
## Installation Time: 5 Minutes 
## Files To Edit: includes/sessions.php  
############################################################## 
## For Security Purposes, Please Check: http://www.phpbb.com/mods/downloads/ for the 
## latest version of this MOD. Downloading this MOD from other sites could cause malicious code 
## to enter into your phpBB Forum. As such, phpBB will not offer support for MOD's not offered 
## in our MOD-Database, located at: http://www.phpbb.com/mods/downloads/ 
############################################################## 
## Author Notes: This is only an Add-ON. You will not notice anything with only this mod
##       installed. Please consider installing another MOD to enhance Google-indexing
##       http://www.phpbb.com/phpBB/viewtopic.php?p=193214#193214
##       ( enhance-google-indexing )
############################################################## 

#-----[ OPEN ]------------------------------------------ 
# 
includes/sessions.php 

# 
#-----[ FIND ]------------------------------------------ 
# 
$session_id = md5(uniqid($user_ip));

# 
#-----[ REPLACE WITH ]------------------------------------------ 
#
# Note: d8ef2eab is the hex-encoded IP (216.239.46.171) of one of the Googlebot crawlers
#
//$session_id = md5(uniqid($user_ip));
global $HTTP_SERVER_VARS;
$session_id = ( !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Googlebot') ) ? md5(uniqid($user_ip)) : md5('d8ef2eab');


# 
#-----[ FIND ]------------------------------------------ 
# 
	else
	{
		$sessiondata = '';
		$session_id = ( isset($HTTP_GET_VARS['sid']) ) ? $HTTP_GET_VARS['sid'] : '';
		$sessionmethod = SESSION_METHOD_GET;
	}


# 
#-----[ AFTER ADD ]------------------------------------------ 
#
	global $HTTP_SERVER_VARS;
	if ( empty($session_id)  && strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') )
	{
		$sessiondata = '';
		$session_id = md5('d8ef2eab');
		$sessionmethod = SESSION_METHOD_GET;
	}


# 
#-----[ FIND ]------------------------------------------ 
# 

			if ( $ip_check_s == $ip_check_u ) 

# 
#-----[ REPLACE WITH ]------------------------------------------ 
#

	//		if ( $ip_check_s == $ip_check_u ) 
			if ( ( $ip_check_s == $ip_check_u ) || ( $session_id == md5('d8ef2eab') && strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Googlebot') ) )

# 
#-----[ SAVE/CLOSE ALL FILES ]------------------------------------------ 
# 
# EoM 
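
For anyone who wants to see the core idea outside of sessions.php, here is a minimal standalone sketch of the first REPLACE step. It is an illustration only, not phpBB code: the function name `pick_session_id`, the constant `GOOGLEBOT_SEED`, and the sample user-agent strings are all made up for this example. It assumes a modern PHP (7.1 or later), so it takes the user agent as a parameter instead of reading `$HTTP_SERVER_VARS`.

```php
<?php
// Hypothetical constant: the hex-encoded IP the MOD uses as the seed
// for the one session id shared by all Googlebot requests.
const GOOGLEBOT_SEED = 'd8ef2eab';

/**
 * Return a fixed session id when the user agent contains 'Googlebot',
 * and a fresh, effectively unique one for every other visitor.
 * Name and signature are illustrative assumptions, not part of phpBB.
 */
function pick_session_id(string $user_agent, string $user_ip): string
{
    if (strpos($user_agent, 'Googlebot') !== false) {
        // Same id on every request, so the crawler shows up as one guest.
        return md5(GOOGLEBOT_SEED);
    }

    // Normal visitors keep the stock behaviour from sessions.php.
    return md5(uniqid($user_ip));
}

// Two crawler requests from different (hex-encoded) IPs share one id.
$first  = pick_session_id('Googlebot/2.1 (+http://www.googlebot.com/bot.html)', 'd8ef2eab');
$second = pick_session_id('Googlebot/2.1 (+http://www.googlebot.com/bot.html)', 'd8ef2eac');
var_dump($first === $second); // bool(true)
```

Keep in mind that the check only looks at the user-agent string, which any client can set, so any visitor claiming to be Googlebot would land in the same shared guest session.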
Last edited by R. U. Serious on Sun Oct 06, 2002 9:30 am, edited 1 time in total.
TC
Former Team Member
Posts: 3633
Joined: Tue Sep 25, 2001 7:23 pm
Location: Kµlt °ƒ Ø, working on my time machine

Post by TC »

thank you. 8)
chinch
Registered User
Posts: 169
Joined: Mon Mar 11, 2002 7:34 pm

Post by chinch »

Great stuff here.

Will sessions.php from 2.0.2 work with 2.0.0? I have not updated because of several hacks, and the last bit of code is different. Any advice appreciated... currently my phpBB is not found in Google :(
davidh44
Registered User
Posts: 386
Joined: Sat Mar 09, 2002 5:56 am

Post by davidh44 »

If a forum has thousands of topics, wouldn't allowing Googlebot to index the pages cause a huge increase in bandwidth? I'm assuming Googlebot would go through every single topic and every single page of that topic (which could multiply the number of pages spidered quite dramatically).

Another concern is whether Google might think a site with 10,000 pages is fishy. For example, suppose you have a forum dedicated to a topic whose keywords are exactly what you want people to find you by on Google, and those keywords are used very frequently throughout the posts in that forum. If you have 1000 pages all containing those keywords, does Google treat them as internal "spam" pages meant to get your domain littered across the results when someone searches for that keyword? I've never done a search which turned up pages and pages of forum links from one site.
If your non-forum pages on the same domain already have high rankings, I don't know if you'd want to risk getting your site knocked down for some perceived violation by the Google computer.

Then again, if other forum software doesn't have any session id issues blocking Google indexing, then I'm sure Google has figured out a way to handle it. I doubt they'd want to index large forums with hundreds of thousands of pages, many of which might disappear at the next automatic pruning.