[INFO] How google PHPBB!

A place for MOD Authors to post and receive feedback on MODs still in development. No MODs within this forum should be used within a live environment! No new topics are allowed in this forum.
On February 1, 2009 this forum will be set to read only as part of retiring of phpBB2.
StalkR
Registered User
Posts: 17
Joined: Tue Jan 08, 2002 9:58 pm
Location: France

Post by StalkR »

warf ! I just had a look at http://www.robotstxt.org/wc/active/all.txt
There are too many robots, and they're not correctly identified by their useragents or hosts... soooo I don't know how to start :s

I made a PHP script which retrieves all hosts, useragents and robot-ids from this page, but I don't know what to do: some of them contain * or X... should I use preg_match for some of them? damn!
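One way the preg_match idea could look, as a minimal sketch rather than anything from the MOD itself. It assumes that "*" in a pattern stands for any run of characters and that everything else is literal; the "X" placeholders mentioned above are not handled. The function names wildcard_to_regex and agent_matches are made up for illustration.

```php
<?php
// Hypothetical helper: turn a pattern such as "Googlebot/*" into an
// anchored, case-insensitive regular expression.
// Assumption: "*" means "any run of characters"; all else is literal.
function wildcard_to_regex($pattern)
{
    // Escape regex metacharacters, using "#" as the delimiter.
    $quoted = preg_quote($pattern, '#');
    // preg_quote() turned "*" into "\*"; swap that for ".*".
    $quoted = str_replace('\\*', '.*', $quoted);
    return '#^' . $quoted . '$#i';
}

// Hypothetical helper: true if the user agent matches the wildcard pattern.
function agent_matches($pattern, $user_agent)
{
    return preg_match(wildcard_to_regex($pattern), $user_agent) === 1;
}
```

Because the expression is anchored at both ends, a pattern like "Googlebot/*" only matches agents that start with "Googlebot/". To match the name anywhere in the string, the pattern would need a leading "*" as well, for example "*Googlebot*".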

I know, I'll only do the script for a few robots, the most common ones... this would be great for a first version of the mod :D

brb... 8)
http://stalkr.net & #StalkR on irc.stalkr.net
Quarterbore
Registered User
Posts: 81
Joined: Thu Sep 25, 2003 1:02 am

Post by Quarterbore »

StalkR wrote: ## MOD Version: 0.9.2
#-----[ OPEN ]------------------------------------------
includes/sessions.php

#-----[ FIND ]------------------------------------------
global $SID;

if ( !empty($SID) && !preg_match('#sid=#', $url) )

#-----[ REPLACE WITH ]------------------------------------------
global $SID, $HTTP_SERVER_VARS;

if ( !empty($SID) && !preg_match('#sid=#', $url) && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'slurp@inktomi.com;'))

#
#-----[ SAVE/CLOSE ALL FILES ]------------------------------------------
#
# EoM


OK, I have this and I made the change above and the board runs fine...
here is the new code for phpBB v2.0.11, it works on my forum :)
the thing to do was to replace

Code: Select all

!eregi('sid=', $url)
with the new string in includes/sessions.php of the phpBB version 2.0.11

Code: Select all

!preg_match('#sid=#', $url) )


Strangely enough, when I added this extra change, my v2.0.11 board stopped working. I guess I will see if the first change is enough to get the forum indexed.

Thanks
QB

EDIT: 24-Dec-2004

I have yet to see anything listed in Google that is not indexed directly from another page on my website, BUT I have had a Googlebot hanging out on my forum for a long time....
12 26 25 25 66.249.66.44 crawl-66-249-66-44.googlebot.com


I have up to 7 sessions for this one bot IP... Not that I am complaining or bragging... just happy to see that it looks like things can get indexed now...
StalkR
Registered User
Posts: 17
Joined: Tue Jan 08, 2002 9:58 pm
Location: France

Post by StalkR »

I'm glad to see it working :)
If you want, you can use the following mod to make Google use only one session... I haven't tested it myself, but it should work ;)

Code: Select all

## MOD Title: GoogleSingleSession (Add-On to enhance-google-indexing ) 
cerebri
Registered User
Posts: 75
Joined: Thu Sep 23, 2004 4:27 pm

Post by cerebri »

The original code for GoogleSingleSession won't work with v2.0.11 :(
At least I didn't get it to work...

If anyone did... please post how! :D
Zeena
Registered User
Posts: 13
Joined: Wed Nov 24, 2004 9:12 pm
Location: Norway

Post by Zeena »

The forum runs just fine after I have done the ## MOD Version: 0.9.2 on 2.0.11

But how can I see that it works?
Best regards...
Jeroen_
Registered User
Posts: 7
Joined: Thu Dec 09, 2004 4:15 pm

Post by Jeroen_ »

My forum is already in Google after only 10 days. :D
LoreKeeper
Registered User
Posts: 1
Joined: Sat Jan 01, 2005 1:24 pm

Confused

Post by LoreKeeper »

What exactly is the change that one must make in sessions.php in the latest version of phpbb?

I have just installed version 2.0.11.

It seems that the changes indicated on page 51 are far fewer than the ones indicated on the 1st page...
lanesharon
Registered User
Posts: 400
Joined: Fri Dec 05, 2003 9:33 pm
Location: º• Confused! •º

Re: Confused

Post by lanesharon »

LoreKeeper wrote: What exactly is the change that one must make in sessions.php in the latest version of phpbb?

I have just installed version 2.0.11.

It seems that the changes indicated on page 51 are far fewer than the ones indicated on the 1st page...


Ditto
rycharde
Registered User
Posts: 17
Joined: Sun Nov 28, 2004 1:23 pm

Post by rycharde »

Hi
I've had Google bots on my forum every time I'm working on it. But I have AdSense already installed, so it is their AdSense bot rather than the Google search engine bot. Google's site claims they are different, but it seems to be doing a pretty good job of targeting the ad content to the page content.
Just thought I'd add my experience.

Rych
ace2ace
Registered User
Posts: 364
Joined: Sat Aug 14, 2004 3:48 pm

Post by ace2ace »

Wow! I read the entire 52 pages. Lots of brilliant ideas. Congrats, folks!!! I love phpBB. Thank you.

From U.R's MOD on page 3, we have to FIND the following lines in sessions.php. I am using version 2.0.11 and there is no $sessiondata = ''; line.

There are some statements around lines 45, 223 and 358 that look like it. Which one should I pick? Thanks.
#
#-----[ FIND ]------------------------------------------
#
else
{
$sessiondata = '';
$session_id = ( isset($HTTP_GET_VARS['sid']) ) ? $HTTP_GET_VARS['sid'] : '';
$sessionmethod = SESSION_METHOD_GET;
}


#
#-----[ AFTER ADD ]------------------------------------------
#
global $HTTP_SERVER_VARS;
if ( empty($session_id) && strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') )
{
$sessiondata = '';
$session_id = md5('d8ef2eab');
$sessionmethod = SESSION_METHOD_GET;
}
john_r
Registered User
Posts: 19
Joined: Thu Nov 18, 2004 8:11 pm

Post by john_r »

Hi,

did the mod on 2_0_11 and worked fine with
Google Bot


Mod Used
#-----[ OPEN ]------------------------------------------
includes/sessions.php

#-----[ FIND ]------------------------------------------
global $SID;

if ( !empty($SID) && !preg_match('#sid=#', $url) )

#-----[ REPLACE WITH ]------------------------------------------
global $SID, $HTTP_SERVER_VARS;

if ( !empty($SID) && !preg_match('#sid=#', $url) && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'slurp@inktomi.com;'))

#
#-----[ SAVE/CLOSE ALL FILES ]------------------------------------------
#
# EoM


However, the forum is now being crawled by a Googlebot with the user agent
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
and it receives a session ID and does not get past "index.php".

How do I modify the above so that both can crawl?

Tried a few variations, but to no avail.
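As a quick check on the question above, here is a minimal sketch (not part of the MOD) that runs the same strstr() test against both Googlebot user-agent strings quoted in this thread. The function name is_googlebot is made up for illustration. Since 'Googlebot' is a substring of both strings, the condition in the MOD should already skip the SID for either agent, which suggests the cause lies elsewhere. That is an inference, not something confirmed in the thread.

```php
<?php
// The two Googlebot user-agent strings quoted in this thread.
$agents = array(
    'Googlebot/2.1 (+http://www.googlebot.com/bot.html)',
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
);

// Hypothetical wrapper around the same substring test the MOD uses.
// true means "this agent would not get a SID appended".
function is_googlebot($user_agent)
{
    return strstr($user_agent, 'Googlebot') !== false;
}

foreach ($agents as $agent)
{
    echo (is_googlebot($agent) ? 'bot:   ' : 'human: ') . $agent . "\n";
}
```

Note that strstr() is case-sensitive, so an agent that spells the name differently (for example all lowercase) would not be caught by this test.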

Thanks
arca33
Registered User
Posts: 1
Joined: Sat Jan 29, 2005 10:40 pm

Session with spider

Post by arca33 »

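If any spider visits the forum, a new session is started. This is incorrect, because a spider isn't a user.

To control the session, check HTTP_USER_AGENT,
e.g. in includes/sessions.php,
in the function append_sid.

Change it to:

function append_sid($url, $non_html_amp = false)
{
    global $SID;

    // Only append the SID when the user agent contains "Mozilla".
    // strpos() returns 0 for a match at the very start, so compare with !== false.
    // Note: the newer Googlebot agent quoted in this thread also contains "Mozilla".
    if ( isset($_SERVER['HTTP_USER_AGENT']) && strpos($_SERVER['HTTP_USER_AGENT'], 'Mozilla') !== false )
    {
        if ( !empty($SID) && !preg_match('#sid=#', $url) )
        {
            $url .= ( ( strpos($url, '?') != false ) ? ( ( $non_html_amp ) ? '&' : '&amp;' ) : '?' ) . $SID;
        }
    }
    return $url;
}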
phpBB SEO
Registered User
Posts: 9
Joined: Sat Jan 29, 2005 8:51 pm

Post by phpBB SEO »

cerebri wrote: But if google use a LOT more sessions? :)


Yes, I have had about 100 visits from Googlebot showing at one time. That was right after I put up mod_rewrite...
phpBB Search Engine Optimization (SEO) offers a mod for phpBB that can dramatically increase your search engine referrals. It includes SID removal, an archive, mod_rewrite, and multiple other small mods. To find out more about it, please visit our Features page or our Downloads page.
Frastraslafra
Registered User
Posts: 21
Joined: Wed Aug 06, 2003 5:18 am

Post by Frastraslafra »

Hi, I have a problem disabling the SID for this user agent:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
and one from M$

Today that bot visited my site 1000 times... half index.php, half search.php, with a lot of different SIDs... How can I disable the SID for that bot? I have this code that works with the other Googlebot...

Code: Select all

{
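	// $HTTP_SERVER_VARS is not a superglobal: without this declaration the
	// user-agent checks below see an empty value and never match.
	global $SID, $HTTP_SERVER_VARS;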

	if ( !empty($SID) && !preg_match('#sid=#', $url) 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot/2.1 (+http://www.googlebot.com/bot.html)') 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot-Image/1.0 (+http://www.googlebot.com/bot.html)') 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)') 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'slurp@inktomi.com;')
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)') 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'FAST-WebCrawler') 
     && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'msnbot/1.0 (+http://search.msn.com/msnbot.htm)') 

   )
	{
		$url .= ( ( strpos($url, '?') != false ) ?  ( ( $non_html_amp ) ? '&' : '&amp;' ) : '?' ) . $SID;
	}

	return $url;
}
thanks
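The long chain of strstr() calls above could be collapsed into a list of short substrings, sketched below under some assumptions. The helper is_search_bot, the function append_sid_sketch and the marker list are all hypothetical and not part of phpBB. The SID and user agent are passed in as arguments so the example is self-contained, whereas phpBB's own append_sid() reads the global $SID. Matching is done in lowercase, so one short marker such as 'googlebot' covers every Googlebot variant listed above.

```php
<?php
// Assumed list of lowercase substrings; extend or trim it to suit your board.
$bot_markers = array('googlebot', 'mediapartners-google', 'slurp', 'fast-webcrawler', 'msnbot');

// Hypothetical helper: true if the user agent contains any marker.
function is_search_bot($user_agent, $markers)
{
    $user_agent = strtolower($user_agent);
    foreach ($markers as $marker)
    {
        if (strpos($user_agent, $marker) !== false)
        {
            return true;
        }
    }
    return false;
}

// Sketch of append_sid() built on the helper. Bots get the URL unchanged.
function append_sid_sketch($url, $sid, $user_agent, $markers, $non_html_amp = false)
{
    if (!empty($sid) && !preg_match('#sid=#', $url) && !is_search_bot($user_agent, $markers))
    {
        $url .= ((strpos($url, '?') !== false) ? (($non_html_amp) ? '&' : '&amp;') : '?') . $sid;
    }
    return $url;
}
```

Short markers are easier to maintain than full user-agent strings, at the cost of a small risk of matching an unrelated agent that happens to contain the same word.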
Frastraslafra
Registered User
Posts: 21
Joined: Wed Aug 06, 2003 5:18 am

Post by Frastraslafra »

Does anyone know? I feel I'm losing a lot of pages that could be indexed in Google.

thanks again