Bot Question

Posted: Fri Aug 21, 2009 5:46 pm
by War Horse
With private forums (ones that require registration for users to see forums), is there any harm in having the Googlebot (or bots in general) spider the site? If there is harm, is the best way to disallow them access by "deactivating" them in the control panel?

Re: Bot Question

Posted: Fri Aug 21, 2009 5:49 pm
by Dog Cow
Do you think that Google bot will register for your site and log in to read the forums?

Re: Bot Question

Posted: Fri Aug 21, 2009 5:52 pm
by War Horse
Dog Cow wrote: Do you think that Google bot will register for your site and log in to read the forums?
:lol: No. Just don't see the reason to have the data and images within indexed anywhere.

Re: Bot Question

Posted: Fri Aug 21, 2009 5:53 pm
by Dog Cow
Make a robots.txt file and disallow the Googlebot user agent, and any other search robots if you want.

The phpBB control panel option won't stop them from coming.
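
For anyone who wants to try the robots.txt route, here is a minimal sketch of such a file, checked with Python's standard urllib.robotparser module. The /private/ path and the "SomeOtherBot" name are made up for illustration; substitute your own paths. Keep in mind that robots.txt is only a request that well-behaved crawlers honour, not an access control.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt contents. Googlebot is asked to stay away from the
# whole site; every other crawler is asked to skip a hypothetical
# /private/ path only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask the parser what a given crawler is allowed to fetch.
print(parser.can_fetch("Googlebot", "/viewtopic.php?t=1"))
print(parser.can_fetch("SomeOtherBot", "/private/page.html"))
print(parser.can_fetch("SomeOtherBot", "/index.php"))
```

With these rules the first two checks should come back False and the last one True. The file itself goes in the web root as robots.txt.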

Re: Bot Question

Posted: Fri Aug 21, 2009 5:55 pm
by Brf
If you disable the bots, they will be treated as unregistered guests, but there is no need to do that. Just remove the Forum Permissions for the Bots usergroup, and they will not see the forums.

Re: Bot Question

Posted: Tue Aug 25, 2009 7:03 am
by mals69
To my knowledge, Google will not spider any website material that is not open to the public; private forums are not spidered. Google wants their search results publicly available. :P

Re: Bot Question

Posted: Tue Aug 25, 2009 10:52 am
by Brf
That is not true.
As I said, the Bots group is separate from the Guests group. Therefore, bots can be given permissions to read forums that guests cannot, and vice versa.

Re: Bot Question

Posted: Tue Aug 25, 2009 11:21 am
by mals69
Why would Google bother spidering material that is not for public viewing and that people have to take the time to register for? It would not be much of a search engine. Like I said, to my knowledge: in 3 1/2 years of Googlebot visits, it has never spidered our private off-topic forum, but it spiders the other 13 topics that are for public viewing.

Robots.txt files are another thing altogether. Yes, you can instruct bots to go here and not go there on your site, but in my experience Google makes up its own mind not to spider non-public material.

If other sites have private forums that somehow get spidered, the private material will not be indexed in Google's search results, so the topic starter has nothing to worry about in the first place, unless they are on a tiny hosting plan and have to worry about bandwidth. :P

Re: Bot Question

Posted: Tue Aug 25, 2009 1:39 pm
by drathbun
More importantly, why would any board owner give a bot permission to read something private? Then all someone has to do is use google to read the private forums; doesn't make sense.

Re: Bot Question

Posted: Tue Aug 25, 2009 2:11 pm
by Roberdin
mals69 wrote: Why would Google bother spidering material that is not for public viewing and that people have to take the time to register for? It would not be much of a search engine. Like I said, to my knowledge: in 3 1/2 years of Googlebot visits, it has never spidered our private off-topic forum, but it spiders the other 13 topics that are for public viewing.
That's not how it works. When a bot (a member of the Bots group) visits a phpBB forum, phpBB recognises the "bot" as such, and grants the bot appropriate permissions. These permissions could be different to those of a "Guest", if you decided to configure it that way. In this case, the bot could spider topics that a guest could not see.
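
To make that concrete, here is a rough sketch of the idea in Python. This is not phpBB's actual code; the agent list, the forum names and the permission table are all invented for illustration. It only shows the principle: an anonymous visitor is sorted into "Bots" or "Guests" by its User-Agent string, and each group has its own read permissions.

```python
# Hypothetical User-Agent substrings that mark a visitor as a bot.
KNOWN_BOT_AGENTS = ("Googlebot", "Slurp")

# Hypothetical permission table: group name -> forums that group may read.
FORUM_READ_PERMISSIONS = {
    "Guests": {"General", "Support"},
    "Bots": {"General", "Support", "Members Only"},
}


def group_for(user_agent: str) -> str:
    """Sort an anonymous visitor into the Bots or Guests group."""
    agent = user_agent.lower()
    if any(bot.lower() in agent for bot in KNOWN_BOT_AGENTS):
        return "Bots"
    return "Guests"


def can_read(user_agent: str, forum: str) -> bool:
    """Check the visitor's group permissions for one forum."""
    return forum in FORUM_READ_PERMISSIONS[group_for(user_agent)]


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (Windows NT 6.0) Firefox/3.5"

print(group_for(googlebot), can_read(googlebot, "Members Only"))
print(group_for(browser), can_read(browser, "Members Only"))
```

In this made-up configuration the bot can read "Members Only" while an ordinary guest cannot, which is exactly the situation described above.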

Re: Bot Question

Posted: Tue Aug 25, 2009 8:57 pm
by mals69
I agree with drathbun. As in the last paragraph of my last post: if people's private forums are being spidered, it does not matter, because the spidering results will not show in Google's search index. So why have others in here bothered giving robots.txt instructions for a private forum? If some want to create unnecessary work for themselves, be my guest. :) 8-)

Re: Bot Question

Posted: Wed Aug 26, 2009 3:56 am
by mals69
Got it wrong on Google not spidering private forums - sorry folks, :oops:

Surely Google's system is smart enough to know not to keep spidering a private forum? It knows not to show any private forum's posted material in its search results, so why are private forums continually spidered, and more importantly, what is happening with the material from these private forums? :?

Re: Bot Question

Posted: Wed Aug 26, 2009 6:20 am
by onehundredandtwo
To check if it's still private. ;)

I've noticed Googlebot will always index every part of one of my sites; it doesn't matter if it gets a phpBB error or not.

Re: Bot Question

Posted: Wed Aug 26, 2009 6:23 am
by Eelke
All Google does is load web pages and follow links on them. They don't have any magic way to get into any sites and they are not being "smart" about indexing anything; basically they index anything they can get to. In most cases, they would not see anything different than a regular web user that does not log into sites.
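
A toy illustration of "load pages and follow links", using only the Python standard library. The four pages below are an invented in-memory site, not real URLs, and a real crawler is of course far more involved. The point is that a page nobody links to is never reached.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site: URL -> HTML body. Nothing links to /private.
PAGES = {
    "/index": '<a href="/public">Public forum</a>',
    "/public": '<a href="/index">Back</a> <a href="/topic1">A topic</a>',
    "/topic1": "<p>Hello</p>",
    "/private": "<p>Members only</p>",
}


def crawl(start: str) -> list:
    """Visit every page reachable from start by following links."""
    seen = set()
    queue = [start]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        collector = LinkCollector()
        collector.feed(PAGES[url])
        queue.extend(collector.links)
    return sorted(seen)


print(crawl("/index"))
```

Starting from /index, the crawl reaches /index, /public and /topic1 but never /private, because no page it can load contains a link to it.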

With that said, phpBB recognizes friendly bots like Google and allows you to assign them different permissions than regular guests, by assigning permissions to the bots group. That way, it's possible to allow Google to get into places where guests cannot go. Is that smart? Probably not. The specific bots permissions are best used to only shield off things that really make no sense to spider, even though you want guest users to be able to see them (which is the other way around; bot permissions are usually a bit more restrictive than guest permissions, not less).

Re: Bot Question

Posted: Wed Aug 26, 2009 11:32 am
by Brf
mals69 wrote: Surely Google's system is smart enough to know not to keep spidering a private forum? It knows not to show any private forum's posted material in its search results
As Eelke is trying to explain, Google and other bots do not "know" anything. All they do is follow links.
If a usergroup is not given permissions on a forum, then there will be no link to follow. If you give the Bots usergroup permissions on a private forum, the link will show and Googlebot will follow and index that private forum. Those indexed links could probably not be followed by a guest, but the private pages would still be indexed and cached.
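
Sketched in Python, with invented forum names, forum ids and permissions (this is not how phpBB actually builds its index page, just the principle): the board only prints links to forums the visitor's group is allowed to read, so a bot finds the private forum only if the Bots group was given access.

```python
# Hypothetical forums and their URLs (the f= ids are made up).
FORUMS = {
    "General": "/viewforum.php?f=1",
    "Members Only": "/viewforum.php?f=2",
}

# Hypothetical setup in which Bots were granted access to the private forum.
GROUP_CAN_READ = {
    "Guests": {"General"},
    "Bots": {"General", "Members Only"},
}


def render_index(group: str) -> str:
    """Build the forum index, listing only forums the group may read."""
    allowed = GROUP_CAN_READ[group]
    return "\n".join(
        f'<a href="{url}">{name}</a>'
        for name, url in FORUMS.items()
        if name in allowed
    )


print(render_index("Guests"))
print("---")
print(render_index("Bots"))
```

The guest index contains no link to the private forum at all, while the bot index does, and that link is what the spider then follows.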

As for drathbun's question on motivation to do this... A searching guest might find an interesting subject, but not be able to see it except in cache. That guest might be motivated to register to participate in that subject, or at least read it in a non-cached page.