How is google seeing pages that guests can't see?

Posted: Tue Dec 04, 2018 5:40 pm
by jefmcg
Google has stumbled on and started indexing my new domain. I was troubled to see links to and caches of pages in my members-only section, so I looked at the permissions settings, and it looked like I was allowing guests and bots in where I didn't mean them to go. I've adjusted the forum permissions, and was settling down to wait for those pages to gradually disappear, but then I noticed other pages that guests can't see but Google can. E.g.

https://webcache.googleusercontent.com/ ... clnk&gl=uk

How do I keep google (and anything else) out of the non-public parts of my site?

Thanks!

Re: How is google seeing pages that guests can't see?

Posted: Tue Dec 04, 2018 5:44 pm
by Brf
Google is not a "Guest". It is a "Bot".
You need to remove the "Bots" Group permission from your private forums.

Re: How is google seeing pages that guests can't see?

Posted: Wed Dec 05, 2018 5:48 pm
by jefmcg
Thanks for the quick reply.

Yes, I understand that, and I have fixed the permissions on the private parts of the forums. Hopefully they will disappear from google before my users notice.

But if you click on the link in my first post, you will find it's not a private forum but a list of global moderators. When you click on the direct link https://cyclingrelated.uk/memberlist.ph ... group&g=14 you go to a login page.

I can't see where to turn off that permission. I've had a look at group permissions for the bot group, and as far as I can tell they are all set to "no". Not sure where else to look.

tia

Re: How is google seeing pages that guests can't see?

Posted: Wed Dec 05, 2018 6:07 pm
by AmigoJack
Then you probably meant your board, not a forum.

As for your link: the cache looks fishy, as it has both a "logout" and a "register" link, and bots should not be able to log out at all. Groups can be seen by everyone unless they're set to "hidden" and the viewing user neither has administrative group management permission nor is a member of the group.

Which means: either your "Bots" group or the accounts in it accidentally had/inherited administrative group management permissions, or your board was modified to a point where this came up.

Re: How is google seeing pages that guests can't see?

Posted: Wed Dec 05, 2018 6:23 pm
by jefmcg
Update:

I still haven't got the forum permissions right. I just looked at the /viewonline.php and found that Google [Bot] was in a members only forum.

So I went to ACP -> Permissions -> Forum permissions -> Select a forum

I had assumed that if bots weren't under groups, then they wouldn't have permissions. I guess I was wrong. Can someone baby step me through the settings to keep google out of my private areas?

Re: How is google seeing pages that guests can't see?

Posted: Wed Dec 05, 2018 6:29 pm
by stevemaury
Make sure the Google [BOT] is in the Bots group and no other group. Then set the group forum permissions for all private forums to No access for the Bots group.
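
To illustrate why the "no other group" part matters, here is a minimal sketch in Python. It is a simplified model, not phpBB's actual code: the function name, the group names and the dictionary layout are all made up for the example. It assumes a user's access is combined across every group they belong to, with "no" meaning "not granted here" and "never" meaning "denied regardless of other groups".

```python
from enum import Enum


class Setting(Enum):
    YES = "yes"      # granted by this group
    NO = "no"        # not granted, but another group may still grant it
    NEVER = "never"  # denied, and no other group can override it


def can_read(user_groups, forum_perms):
    """Combine the settings of every group the user belongs to (hypothetical helper)."""
    # A group with no entry for this forum contributes NO.
    settings = [forum_perms.get(group, Setting.NO) for group in user_groups]
    if Setting.NEVER in settings:
        return False
    return Setting.YES in settings


# Hypothetical private forum: only registered users are granted access.
private_forum = {"Registered users": Setting.YES}

misplaced_bot = ["Registered users"]  # bot account sitting in the wrong group
fixed_bot = ["Bots"]                  # bot account in the Bots group only

print(can_read(misplaced_bot, private_forum))  # access comes from the wrong group
print(can_read(fixed_bot, private_forum))      # no group grants access
```

In this model, a bot account that sits in a group with access to a private forum gets in even though the Bots group itself is set to "no" everywhere.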

Re: How is google seeing pages that guests can't see?

Posted: Wed Dec 05, 2018 6:54 pm
by Brf
jefmcg wrote:
Wed Dec 05, 2018 6:23 pm
I just looked at the /viewonline.php and found that Google [Bot] was in a members only forum.
That does not necessarily mean they are seeing that forum. If they do not have permissions to "read" that forum, then they would be seeing a login page or an error message. But they should not be able to browse to a forum that they do not have permissions to "see", unless they are following a visible link from somewhere else.

Re: How is google seeing pages that guests can't see?

Posted: Wed Dec 05, 2018 7:45 pm
by stevemaury
phpBB uses sessions to keep track of users as they move between pages. The session information tells us who this user is. Therefore, in order to determine what a user can do on a page, we first need the session details. Once this data is available we can check whether the user is permitted to do whatever it is they are trying to do. This can result in it appearing as if a user is reading a topic in a forum they should not be able to access, or perhaps viewing private messages when they are only guests, etc. In practice the user is not doing these things; they are viewing a "You are not permitted to do this" type message. The session data has simply been updated before we were able to determine what the user could or could not do.

Of course this only applies where permissions have been set correctly!
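
The ordering described above can be sketched roughly like this in Python. This is only an illustration of "session first, permission check second"; the Session class, handle_request and the page names are hypothetical and are not taken from phpBB's source.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user: str
    page: str = "index"  # what a "who is online" listing would display


def handle_request(session, page, allowed_pages):
    """Hypothetical request handler: update the session, then check permissions."""
    # Step 1: the session records where the user is heading.
    session.page = page
    # Step 2: only now do we know who is asking, so only now can we check access.
    if page not in allowed_pages:
        return "You are not permitted to do this"
    return f"contents of {page}"


bot = Session(user="Google [Bot]")
response = handle_request(bot, "viewforum: members-only", allowed_pages={"index"})

print(bot.page)   # the listing shows the private forum
print(response)   # but the bot only received the refusal message
```

So the bot shows up "in" the members-only forum in the online list even though all it was served was the refusal message.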

Re: How is google seeing pages that guests can't see?

Posted: Sun Dec 09, 2018 2:23 pm
by jefmcg
stevemaury wrote:
Wed Dec 05, 2018 6:29 pm
Make sure the Google [BOT] is in the Bots group and no other group. Then set the group forum permissions for all private forums to No access for the Bots group.
Thanks. That was my problem. All the bots were Registered Users. Is that default behaviour or an artifact of my brief foray into SMF before migrating to phpBB?

Re: How is google seeing pages that guests can't see?

Posted: Sun Dec 09, 2018 4:54 pm
by warmweer
jefmcg wrote:
Sun Dec 09, 2018 2:23 pm
Thanks. That was my problem. All the bots were Registered Users. Is that default behaviour or an artifact of my brief foray into SMF before migrating to phpBB?
Probably a leftover from SMF.
Bots shouldn't even have a user_id.

Re: How is google seeing pages that guests can't see?

Posted: Sun Dec 09, 2018 5:16 pm
by Paul
warmweer wrote:
Sun Dec 09, 2018 4:54 pm
jefmcg wrote:
Sun Dec 09, 2018 2:23 pm
Thanks. That was my problem. All the bots were Registered Users. Is that default behaviour or an artifact of my brief foray into SMF before migrating to phpBB?
Probably a leftover from SMF.
Bots shouldn't even have a user_id.
They should have a user_id, but they are treated as a special user.

Re: How is google seeing pages that guests can't see?

Posted: Sun Dec 09, 2018 5:33 pm
by warmweer
Paul wrote:
Sun Dec 09, 2018 5:16 pm
They should have a user_id, but they are treated as a special user.
Oops :oops: My mistake. Brain-malfunction. I clicked on a bot's name in the Whoisonline and it didn't open, so I assumed something without checking the users table (which I have now done locally). Thinking about it, it was a stupid mistake: I added some bots a couple of years ago with a script and noticed that they were added to the users table (which implies they must have a user_id).

Re: How is google seeing pages that guests can't see?

Posted: Mon Dec 10, 2018 2:59 pm
by thecoalman
When a bot (or anyone) makes a request for something they do not have permission for they are still listed as viewing it in viewonline.
jefmcg wrote:
Wed Dec 05, 2018 6:23 pm

I had assumed that if bots weren't under groups, then they wouldn't have permissions. I guess I was wrong.
This is correct. When a group is not listed under manage groups and is listed under add groups, all of its permissions default to none.

Re: How is google seeing pages that guests can't see?

Posted: Mon Dec 10, 2018 3:51 pm
by jefmcg
thecoalman wrote:
Mon Dec 10, 2018 2:59 pm
This is correct. When a group is not listed under manage groups and is listed under add groups all permissions default to none.
Thanks. Yep, I've got it. Everything is explained by all the bots being in the Registered Users group. Fixed now.

Of course, I accidentally kicked a non-bot user out of that group at the same time :roll:
warmweer wrote:
Sun Dec 09, 2018 5:33 pm
Oops My mistake. Brain-malfunction. I clicked on a bot's name in the Whoisonline
Jinx! I did exactly the same thing, and also assumed you couldn't do anything with a bot user, because you couldn't search for them on the user page. I finally opened the members page for every single group to "prove" that something was fundamentally wrong with phpBB, and then found what was wrong with my installation.