[INFO] How to Google PHPBB!

Acid Paul
Registered User
Posts: 60
Joined: Sat May 31, 2003 7:26 am
Location: Belarus

Post by Acid Paul »

To rasputinj, regarding my idea:
3. Think of a solution for linking to topics with multiple pages. With this mod the URLs look like ptopic47.html&start=60; could we turn that into ptopic47-60.html and make it universal across the board? (If you open a topic, the URLs get longer, because they also carry &postorder and &postdays; in that case maybe use a format like ptopic47-60-0-0.html.)


I just found that nukecops handles it the way we want, unlike the present mod here, so the mod needs some more code. Orientalgate.org has a slightly different solution, but both lack consistency: from the forum folder, page 1 of a topic looks like forrtopic288-0.html, and from the second page it looks like fortopic288-0-asc-10.html. To a search engine this looks like *duplicate* content. Both nukecops and orientalgate have this problem, though they assign different URIs to such multi-page topics.

So right now I'm looking at orientalgate - I know Russian. Will get their mod and play with it. Was it #42?

To BurnArt: well, I have not installed this mod and can't really help. I hope netclectic or RUS will help, however. For now, check that there are no spaces at line ends and, if you have the latest version of PHPBB, check whether the mod addresses the same var/function names as the new version (if you've installed mod1 with the eregi/preg_match issue, you'll get my drift). Just my 2 c.
Acidics.com - dissolving online scams and hoaxes
Go pour some acid on fraudsters at Forum.acidics.com
rasputinj
Registered User
Posts: 46
Joined: Fri Apr 11, 2003 10:19 pm

Post by rasputinj »

Thanks for looking at the code, Paul. I am not a programmer; I just fiddled with the code enough to make it work. I do like what nukecops has done as well, and I would like to expand the rewrite to .html and make it consistent. This was just my test to make sure it would work. Have you looked at their code for sessions.php, where they use a function to tell whether a visitor is a bot? It looks pretty good, though it does not work in its current form.
quentin
Registered User
Posts: 197
Joined: Tue May 20, 2003 7:30 am
Location: Geneva, Switzerland

Post by quentin »

to rasputinj,
I have seen your robots list, and as I mentioned in another friendly URLs rewrite thread, I really think it's a waste.
MOST of these robots, that is all apart maybe from googlebot, slurp (inktomi), fast, and altavista, won't go further than 2 links down, whether the URLs are friendly or not. Most of them won't ever bring you a single visitor.
In the meantime, having a 100+ robots list regexped for every request is a real pain for your cpu, and it makes the user experience slower for all those who have already found your site. You'll end up losing some of your visitors (or not winning new ones) because you wanted your site indexed in www.the-search-engine-nobody-ever-uses-anyway.com.

I would strongly suggest that anyone who wants to make his board SE friendly focus on a few bots (considering that 5 SEs with bots hold 99.99% of the market), if he cares at least a little about cpu usage and page generation times.
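
A minimal sketch of the "few bots" approach, assuming a short hand-picked token list. The tokens and the name is_major_crawler are illustrative assumptions, not taken from any mod in this thread; plain substring checks stand in for the 100+ entry regexp.

```php
<?php
// Illustrative short list; the exact tokens are assumptions, not from the thread.
const MAJOR_CRAWLERS = ['Googlebot', 'Slurp', 'FAST-WebCrawler', 'Scooter'];

/**
 * Returns true if the user agent contains one of a few known crawler tokens.
 * A handful of case-insensitive substring checks is cheap to run per request.
 */
function is_major_crawler(string $userAgent): bool
{
    foreach (MAJOR_CRAWLERS as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(is_major_crawler('Googlebot/2.1 (+http://www.googlebot.com/bot.html)'));
var_dump(is_major_crawler('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'));
```

The cost grows with the length of the list, so keeping it to the few crawlers that actually send visitors keeps the per-request overhead small.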

just my two cents.
Quentin
The largest message boards on the web !
Web Design Library (coming soon)
Friends sites: Heroes of might and magic - Biometric security

Post by Acid Paul »

I have modified the Googlifier mod for PHPNuke with the PHPBB2 port, found at www.orientalgate.org, to work with standalone PHPBB2. This mod turns dynamic forum URLs into static ones, so the forum gets better indexed and ranked by Google & co., and also gets picked up by search engines that don't like dynamic URLs (e.g. AllTheWeb).

The effect is like this:

> viewtopic.php?t=157 => ftopic157.html

(affects all topics, posts, forum folders and pagination; member profile, search, login and usergroup URLs stay unchanged; the main page remains index.php)

Requirements: you need mod_rewrite enabled, as well as the ability to use .htaccess or modify Apache config files.

Implementation:

Step 1. In /includes/page_header.php before

Code: Select all

//
// Generate logged in/logged out status
//
add this code (make sure there are no trailing spaces at line ends after you paste):

Code: Select all

ob_start();
function replace_for_mod_rewrite(&$s)
{
    // Dynamic URL patterns; more specific patterns come before general ones
    $urlin = array(
        "'(?<!/)viewforum.php\?f=([0-9]*)&topicdays=([0-9]*)&start=([0-9]*)'",
        "'(?<!/)viewforum.php\?f=([0-9]*)&mark=topics'",
        "'(?<!/)viewforum.php\?f=([0-9]*)'",
        "'(?<!/)viewtopic.php\?t=([0-9]*)&view=previous'",
        "'(?<!/)viewtopic.php\?t=([0-9]*)&view=next'",
        "'(?<!/)viewtopic.php\?t=([0-9]*)&postdays=([0-9]*)&postorder=([a-zA-Z]*)&start=([0-9]*)'",
        "'(?<!/)viewtopic.php\?t=([0-9]*)&start=([0-9]*)&postdays=([0-9]*)&postorder=([a-zA-Z]*)&highlight=([a-zA-Z0-9]*)'",
        "'(?<!/)viewtopic.php\?t=([0-9]*)&start=([0-9]*)'",
        "'(?<!/)viewtopic.php\?t=([0-9]*)'",
        "'(?<!/)viewtopic.php&p=([0-9]*)'",
        "'(?<!/)viewtopic.php\?p=([0-9]*)'",
    );
    // Static replacements, in the same order as $urlin
    $urlout = array(
        "viewforum\\1-\\2-\\3.html",
        "forum\\1.html",
        "forum\\1.html",
        "ptopic\\1.html",
        "ntopic\\1.html",
        "ftopic\\1-\\2-\\3-\\4.html",
        "ftopic\\1.html",
        "ftopic\\1-\\2.html",
        "ftopic\\1.html",
        "sutra\\1.html",
        "sutra\\1.html",
    );
    $s = preg_replace($urlin, $urlout, $s);
    return $s;
}
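
To illustrate why the pattern order in $urlin matters, here is a reduced, self-contained sketch. The function name rewrite_links_demo and the three-pattern subset are assumptions for illustration only; it also assumes links use a plain "&" separator, as the patterns in the post do.

```php
<?php
// Reduced version of the pattern list, for illustration only.
function rewrite_links_demo(string $html): string
{
    $in = [
        // The paginated pattern must come before the bare topic pattern.
        "'(?<!/)viewtopic\.php\?t=([0-9]+)&start=([0-9]+)'",
        "'(?<!/)viewtopic\.php\?t=([0-9]+)'",
        "'(?<!/)viewforum\.php\?f=([0-9]+)'",
    ];
    $out = [
        'ftopic\\1-\\2.html',
        'ftopic\\1.html',
        'forum\\1.html',
    ];
    // With arrays, preg_replace applies each pattern in turn to the result so far.
    return preg_replace($in, $out, $html);
}

$page = '<a href="viewtopic.php?t=157&start=15">2</a> <a href="viewforum.php?f=3">Forum</a>';
echo rewrite_links_demo($page), "\n";
// <a href="ftopic157-15.html">2</a> <a href="forum3.html">Forum</a>
```

If the bare viewtopic pattern were listed first, the paginated link would come out as ftopic157.html&start=15, which is the kind of URL complained about earlier in the thread.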
Step 2. In /includes/page_tail.php after

Code: Select all

$db->sql_close();
add this:

Code: Select all

$contents = ob_get_contents();
ob_end_clean();
echo replace_for_mod_rewrite($contents);
global $dbg_starttime;
Then, in the same file, after the existing

Code: Select all

ob_end_clean();
add this:

Code: Select all

echo replace_for_mod_rewrite($contents);
global $dbg_starttime;
Step 3. In your .htaccess file (if you don't have one, create it; it should be located in your forum root directory) paste these lines:

Code: Select all

RewriteEngine On
RewriteRule ^forums.* index.php
# Forum pages
RewriteRule ^viewforum([0-9]*)-([0-9]*)-([0-9]*).* viewforum.php?f=$1&topicdays=$2&start=$3
RewriteRule ^forum([0-9]*).* viewforum.php?f=$1
# Previous/next topic links
RewriteRule ^ptopic([0-9]*).* viewtopic.php?t=$1&view=previous
RewriteRule ^ntopic([0-9]*).* viewtopic.php?t=$1&view=next
# Topic pages, most specific pattern first
RewriteRule ^ftopic([0-9]*)-([0-9]*)-([a-zA-Z]*)-([0-9]*).* viewtopic.php?t=$1&postdays=$2&postorder=$3&start=$4
RewriteRule ^ftopic([0-9]*)-([0-9]*).* viewtopic.php?t=$1&start=$2
RewriteRule ^ftopic([0-9]*).* viewtopic.php?t=$1
# Single posts
RewriteRule ^sutra([0-9]*).* viewtopic.php?p=$1
Note: sometimes this .htaccess code won't work properly if the forum is on a subdomain (as was the case for me). If so, try this variation:

Code: Select all

RewriteEngine On
RewriteRule ^forums.* /index.php
# Forum pages
RewriteRule ^viewforum([0-9]*)-([0-9]*)-([0-9]*).* /viewforum.php?f=$1&topicdays=$2&start=$3
RewriteRule ^forum([0-9]*).* /viewforum.php?f=$1
# Previous/next topic links
RewriteRule ^ptopic([0-9]*).* /viewtopic.php?t=$1&view=previous
RewriteRule ^ntopic([0-9]*).* /viewtopic.php?t=$1&view=next
# Topic pages, most specific pattern first
RewriteRule ^ftopic([0-9]*)-([0-9]*)-([a-zA-Z]*)-([0-9]*).* /viewtopic.php?t=$1&postdays=$2&postorder=$3&start=$4
RewriteRule ^ftopic([0-9]*)-([0-9]*).* /viewtopic.php?t=$1&start=$2
RewriteRule ^ftopic([0-9]*).* /viewtopic.php?t=$1
# Single posts
RewriteRule ^sutra([0-9]*).* /viewtopic.php?p=$1
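
The rewrite rules map the static-looking names back to the real scripts. A rough PHP sketch of that reverse mapping, covering only some of the rules: the function name static_to_dynamic is hypothetical, and the patterns here are anchored with \.html$, which is stricter than the rules above.

```php
<?php
// Sketch of the reverse mapping done by mod_rewrite; not a replacement for it.
function static_to_dynamic(string $path): ?string
{
    $rules = [
        '~^ftopic([0-9]+)-([0-9]+)-([a-zA-Z]+)-([0-9]+)\.html$~'
            => 'viewtopic.php?t=$1&postdays=$2&postorder=$3&start=$4',
        '~^ftopic([0-9]+)-([0-9]+)\.html$~' => 'viewtopic.php?t=$1&start=$2',
        '~^ftopic([0-9]+)\.html$~'          => 'viewtopic.php?t=$1',
        '~^sutra([0-9]+)\.html$~'           => 'viewtopic.php?p=$1',
        '~^forum([0-9]+)\.html$~'           => 'viewforum.php?f=$1',
    ];
    foreach ($rules as $pattern => $target) {
        if (preg_match($pattern, $path)) {
            // Substitute the captured numbers into the dynamic URL.
            return preg_replace($pattern, $target, $path);
        }
    }
    return null; // not one of the rewritten URL shapes
}

echo static_to_dynamic('ftopic157-15.html'), "\n";
echo static_to_dynamic('ftopic47-0-asc-60.html'), "\n";
```

Because each pattern in this sketch is fully anchored, the order of entries does not matter. With the looser prefix patterns in the .htaccess rules, the more specific ftopic rule has to come first, since the plain ^ftopic([0-9]*).* pattern matches every ftopic URL.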
Step 4. Very important! In your robots.txt file (goes at the *site* root) add these lines:

Code: Select all

Disallow: /your-forum-folder/sutra*.html$
Disallow: /your-forum-folder/ptopic*.html$
Disallow: /your-forum-folder/ntopic*.html$
Disallow: /your-forum-folder/ftopic*asc*.html$
(This is required to avoid feeding duplicate content to Google)
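
To check which generated URLs these lines would hide, the wildcard matching can be sketched as follows. This assumes Google-style semantics ("*" matches any run of characters, a trailing "$" anchors the end of the path); other crawlers may treat these characters literally. The function name robots_pattern_matches and the /forum/ folder are placeholders.

```php
<?php
// Assumption: Google-style wildcard matching for Disallow patterns.
function robots_pattern_matches(string $pattern, string $path): bool
{
    $anchored = substr($pattern, -1) === '$';
    if ($anchored) {
        $pattern = substr($pattern, 0, -1);
    }
    // Quote everything, then turn the escaped "*" back into ".*".
    $regex = str_replace('\*', '.*', preg_quote($pattern, '~'));
    return preg_match('~^' . $regex . ($anchored ? '$' : '') . '~', $path) === 1;
}

$disallow = [
    '/forum/sutra*.html$',
    '/forum/ptopic*.html$',
    '/forum/ntopic*.html$',
    '/forum/ftopic*asc*.html$',
];

foreach (['/forum/ftopic157.html', '/forum/ftopic157-0-asc-15.html', '/forum/sutra42.html'] as $path) {
    $blocked = false;
    foreach ($disallow as $pattern) {
        $blocked = $blocked || robots_pattern_matches($pattern, $path);
    }
    echo $path, ' => ', $blocked ? 'blocked' : 'crawlable', "\n";
}
```

Under these assumptions the plain ftopic157.html page stays crawlable, while the sorted variant and the per-post sutra alias of the same content are kept out of the index.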

Step 5. Apply Google Mod #1 described at this page.

Congrats! You now have a traffic-generating mogul.
Last edited by Acid Paul on Sun Jul 06, 2003 5:28 pm, edited 3 times in total.
Ixtlan
Registered User
Posts: 26
Joined: Wed Mar 19, 2003 10:28 pm

Googlefier Mod

Post by Ixtlan »

Acid Paul wrote: The effect is like this:

> viewtopic.php?t=157 => ftopic157.html

Does this mean that if there are already existing www hyperlinks (in the '.../...viewtopic.php?t=157' format) on other webpages leading to articles in my phpbb forum (of course there are) they won't work any more after applying your Googlefier Mod as described?

Acid Paul wrote: you need mod_rewrite enabled

Sorry I don't understand this.

Acid Paul wrote: Step 5. Apply Google Mod #1 described at this page.

I've done that before. Is it not enough? What exactly are the benefits in applying your Mod?

Sorry for so many questions. Thanks in advance for the clarification.

Post by Acid Paul »

Here you go:

1. Links in the form viewtopic.php?t=157 will still work; the mod above is cosmetic and doesn't prevent the forum scripts from understanding old-type URLs. If other pages link to you in that format, you'd better ask their webmasters to change the links to the new ones. Otherwise Google may think you have duplicate content on your site. If you have really many links to your forum and already enjoy good rankings in Google, you may not need this mod.

2. mod_rewrite is a feature of the Apache server. Run phpinfo(); or ask your webhost directly whether it is enabled. If not, insist they enable it.

3. If you've already applied the no-SID Google mod, you don't need to reapply it. However, you may add Fast-Webcrawler to it, as this spider will gladly pick up the new .html-aliased pages.
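
Point 2 can also be checked from PHP directly. A minimal sketch, assuming PHP runs as an Apache module; apache_get_modules() may not be available under other setups such as CGI/FastCGI, in which case the sketch reports "unknown". The function name mod_rewrite_status is made up for this example.

```php
<?php
/**
 * Reports whether mod_rewrite is loaded.
 * Returns null when it cannot be determined from within PHP.
 * A module list can be passed in explicitly, which is handy for testing.
 */
function mod_rewrite_status(?array $modules = null): ?bool
{
    if ($modules === null) {
        if (!function_exists('apache_get_modules')) {
            return null;
        }
        $modules = apache_get_modules();
    }
    return in_array('mod_rewrite', $modules, true);
}

$status = mod_rewrite_status();
if ($status === null) {
    echo "unknown, ask the host\n";
} else {
    echo $status ? "mod_rewrite is loaded\n" : "mod_rewrite is not loaded\n";
}
```

Even when the module is loaded, the host may still forbid rewrite rules in .htaccess, so a quick test rule is the final check.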

The benefits of applying my mod are that Google will think your forums/topics are html pages => it will index you more actively. Plus, it is common knowledge that Google assigns slightly higher rankings to html pages than php ones and is more forgiving in terms of code purity (.php had better be xhtml; .html not necessarily).
R. U. Serious
Registered User
Posts: 830
Joined: Mon Feb 11, 2002 2:07 pm

Post by R. U. Serious »

Acid Paul wrote: 2. mod_rewrite is a feature of Apache server. Run phpinfo(); or directly ask your webhost whether it has been enabled. If not, insist they enable it.


*G* A lot of hosts won't, because you can thoroughly screw up your server (with infinite loops etc.). However, I agree that you should try to talk with them.

Acid Paul wrote: The benefits of applying my mod are that Google will think your forums/topics are html pages => it will index you more actively, plus it is common knowledge that Google assigns a bit higher rankings to html pages than php ones and is more forgiving in terms of code purity (.php should better be xhtml, .html - not necessarily).


Now IMHO that is complete guesswork. I don't even remotely think that it's true. From my own experience, Google makes no distinction based on "file endings" in the URL. It doesn't matter whether it's .html or .php or .xyz. Ranking is absolutely not determined by that. Just plain folklore.
I have sites (.php with parameters) that rank high, get well spidered (last I checked, 13,000 pages), are in German (!) and get more than 1000 referrals a day from Google, and PR is 4 to 5 at most (down to 2 and 3 on inner pages).

Now I agree that it may help for other crawlers / search engines. And I agree that the URLs do look cleaner. But it will not help your ranking in Google. IMO and from my experience.

Post by Acid Paul »

Well, basically, assuming that static pages are ranked higher is of course wrong. I have to agree with R.U. Serious. Code matters, not file extensions.

It was in autumn 2002 that Google started being more active with dynamic sites: where earlier 15-20% of pages indexed was good for them, now it is close to 90%, but for .html it is always close to 100% (depending on PR and interlinking structure). So Google still picks up .html more gladly than .php.

If you have just started your PHPBB board, this mod will let Google et al spider you very fast indeed, plus it gives you a smooth way into AllTheWeb. If your board has been up for half a year and is doing well in SERPs, you'd better not apply this mod.

Post by Acid Paul »

Hey, developing the idea: with the mod_rewrite mod (or hsim's path_info mod) Google will pick up threads and forum folders faster, and pages will mature in Google's database and get a slight boost in rankings (when a site matures, it does get a certain slight boost; I learned that from my own experience) compared to exactly the same pages found through URLs with "?". Although it's a slight boost, rankings are an aggregation of such slight boosts.

Post by R. U. Serious »

It certainly cannot harm to do what you proposed. And the people with the knowledge, time on their hands etc. should go ahead. But the fact is you can achieve outstanding rankings and referrals without having to go through that trouble.

I would advise people who are not so technically savvy to simply install the google-session-Mod and spend the rest of their time writing quality content and exchanging links with quality sites. Your time will be a lot better spent, and your hair won't turn grey. In the end you will get great traffic to your sites either way. (I agree that doing both & all might be a tad better, but I assume people still have an offline life... ;-) )
ROAMER
Registered User
Posts: 11
Joined: Sun Mar 23, 2003 5:45 am

Post by ROAMER »

Can we do a re-take here ..

Isn't this thread obsolete .. I seem to remember that it was said in the FIRST post that Google is happy and OK with (and WILL index) DYNAMIC URLs, but Google is not happy with SIDs.

In PhpBB 2.0.5 the SIDs have been removed, (too many problems with invalid sessions etc).

SIDs are gone,

This means there is no longer a problem since Google should now index everything.

Or am I missing something?

( I am assuming people will upgrade to 2.0.5).

Post by Acid Paul »

Hm, I thought the mod would get a warmer welcome. So nobody except me needs it? Well, speaking of Google, discussing whether dynamic is worse than static is futile, because there's no way to test two exactly identical pages with different URLs. Personally, I am of the opinion that my forum will be spidered faster and more deeply with static-looking pages (I will be moving my site to another webhost that allows mod_rewrite). Why did the folks at nukecops do the same mod for phpnuke (mine is actually only a copy of theirs)? Just to waste time? And there are a couple of other important spiders besides Googlebot that are too lazy to deep-spider dynamic sites... OK, enough of that. ... eh, then there's of course the WebMasterWorld.com forum...

Hey, Roamer, I didn't know 2.0.5 made SIDs obsolete - that's superb. Upgrading.

Right now I'm trying to optimize pagination: making all topic URLs look the same across the board when the sorting rules are default.
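
One way to get that consistency is to build every topic link through a single helper that drops default sorting parameters. A sketch under assumptions: the helper name topic_url is hypothetical, the defaults (postdays 0, postorder asc) are assumed to be the board defaults, and the URL shapes follow the ftopic patterns from the mod posted earlier in this thread.

```php
<?php
// Hypothetical helper; name, parameter order and defaults are assumptions.
function topic_url(int $topicId, int $start = 0, int $postdays = 0, string $postorder = 'asc'): string
{
    $defaultSort = ($postdays === 0 && $postorder === 'asc');
    if ($defaultSort) {
        // With default sorting, each page of a topic gets exactly one URL.
        return $start > 0 ? "ftopic{$topicId}-{$start}.html" : "ftopic{$topicId}.html";
    }
    // Non-default sorting keeps the long form: topic-postdays-postorder-start.
    return "ftopic{$topicId}-{$postdays}-{$postorder}-{$start}.html";
}

echo topic_url(47), "\n";
echo topic_url(47, 60), "\n";
echo topic_url(47, 60, 0, 'asc'), "\n";
echo topic_url(47, 0, 7, 'desc'), "\n";
```

The second and third calls return the same string, so a page linked from the forum folder and from inside the topic no longer shows up under two different names.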

P.S. As for the SE optimized template I'm currently working on: it required so much input that I'm afraid I won't share it, sorry; not for free, as it has really required a lot of creativity. I'll just share some tips on what I did.

Post by quentin »

Google actually can index dynamic URLs, but sometimes chooses not to. The first reason is that it doesn't want to enter infinite loops (some dynamic URLs can auto-generate new ones infinitely), so it limits the depth of the dynamic URLs it indexes and is less likely to index URLs with tons of parameters in the query string. The second reason is that the huge number of pages that dynamic sites can generate would make googlebot's job dangerous for these sites' servers' CPUs.

The idea of an SE optimized template is nice. Actually, I'm already working on one, but I'm already on the 2.2 version.

I'll also remind everyone that having a forum indexed may not always be a good thing. Forums are full of PR holes, with their huge number of links and signatures. Therefore, before making a board indexable, one has to ask whether the board's content is specific enough for its pages to appear in people's search results. If yes, or if the site is a forum only, it can be useful to make it indexable; but if the forum is the support area for a content-rich website, having the forum indexable may be bad for the content sections.

Just my 2 cents.

Quentin

Post by R. U. Serious »

ROAMER wrote: In PhpBB 2.0.5 the SIDs have been removed, (too many problems with invalid sessions etc).

SIDs are gone,


Nope, pal. I believe you are incorrect. There was no general problem with SIDs. The 'problem' (or feature) introduced in 2.0.4 was passing the SID as a hidden variable in forms, in addition to the cookie/URL thing, to further enhance security.
Now (in 2.0.5) this added form variable has been removed (for regular users; it's still active for mods/admins). However, that does not change the general behaviour of sessions.

@Acid Paul: I do like the idea of clean URLs, and I do think it's great work. I'm sorry if my answers came across as harsh! :)
Captain Pervert
Registered User
Posts: 48
Joined: Mon May 26, 2003 3:24 pm
Location: Netherlands

Post by Captain Pervert »

Since I have a small forum and got nothing to lose, and Acid Paul sounded convinced his mod would work, I decided to use it. At first it didn't work because I put the htaccess file in my http root instead of /forum but now it works.

I'll just sit here and see what happens.

Edit: Question

The modified code in Googlemod #1 goes like this:

Code: Select all

   global $SID, $HTTP_SERVER_VARS; 

   if ( !empty($SID) && !eregi('sid=', $url) && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'[email protected];')) 
But I want at least one other search engine added (one that is still used a lot by Dutch people, and my site is in Dutch as well). ia_archiver is the name for www.ilse.nl
Can I simply add it like this:

Code: Select all

   global $SID, $HTTP_SERVER_VARS; 

   if ( !empty($SID) && !eregi('sid=', $url) && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'Googlebot') && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'[email protected];') && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'] ,'ia_archiver'))
?
Last edited by Captain Pervert on Mon Jun 30, 2003 12:03 pm, edited 1 time in total.