Error HTTP 403 - Google fails to index forum pages

Wuppi
Registered User
Posts: 19
Joined: Mon Jun 17, 2019 9:22 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Wuppi »

Exactly the same problem here :/

But ONLY with Google and Bing. I added Yandex to the bot list -> no problem.

https://httpstatus.io/ -> my URL ... Googlebot => 403, Bingbot => 403, Yandexbot => 200
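
The same check can be reproduced locally. This is only a minimal sketch, assuming Python 3 and the standard library; the user-agent strings are the commonly published ones for these crawlers (an assumption, they may send other variants), and `build_request`, `status_for` and `check_board` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Assumed user-agent strings; real crawlers may send other variants.
USER_AGENTS = {
    "Googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Bingbot": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "YandexBot": "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
}


def build_request(url, user_agent):
    """Return a GET request that presents the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url, user_agent, timeout=10):
    """Fetch url and return the final HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still the result we want.
        return err.code


def check_board(url):
    """Map each bot name to the status code the board returns for it."""
    return {name: status_for(url, ua) for name, ua in USER_AGENTS.items()}
```

Calling `check_board()` with the board URL returns one status code per bot name. Note that `urlopen` follows redirects, so the code reported is the one for the final page.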

I updated my board yesterday from 3.0 to 3.2.7 (now on PHP 7.2).

A problem with permissions? Then why does Yandex have no problem (compatible; YandexBot/3.0; +http://yandex.com/bots, added as a spider, and the same with YandexBot/2.0)? The permissions didn't change with 3.2.7. Bots can see, read, etc.; I checked. The funny thing is that only those two are affected. Yandex shows me that the rules work properly.
Lumpy Burgertushie
Registered User
Posts: 69224
Joined: Mon May 02, 2005 3:11 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Lumpy Burgertushie »

looks like there is something wrong with your SSL setup.

do you have an SSL certificate?
is it set up on the server properly?
I get a "no phpbb detected" however it is there.

do you have a redirect from some other place?

do you have the cookie secure set to yes if you are running SSL?

robert
Premium phpBB 3.3 Styles by PlanetStyles.net

I am pleased to announce that I have completed the first item on my bucket list. I have the bucket.
Wuppi
Registered User
Posts: 19
Joined: Mon Jun 17, 2019 9:22 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Wuppi »

Lumpy Burgertushie wrote: Thu Jul 04, 2019 3:39 pm looks like there is something wrong with your SSL setup.

do you have a SSL certificate?
is it setup on the server properly?
[...]
do you have a redirect from some other place?

do you have the cookie secure set to yes if you are running SSL?

robert
Cookie settings:
Domain: .numismatikforum.de
Name: nforum
Path: /
Secure: yes

Server and domain settings:
Protocol: https
Domain: www.numismatikforum.de
Port: 443
Path: /

.htaccess has a redirect from http to https, of course, and it works fine: http://www.numismatikforum.de -> https://www.numismatikforum.de

https://www.ssllabs.com/ssltest/analyze ... Results=on
A+

Lumpy Burgertushie wrote: Thu Jul 04, 2019 3:39 pm I get a "no phpbb detected" however it is there.
Are you on the correct page?

Our board has been running since 2002; yesterday I updated it from 3.0.7 to 3.2.7. There is no problem for guests, registered users, or bots like Yandex that are in the bot/spider list, only for Google and Bing. Why ONLY these bots, and why not Yandex (compatible; YandexBot/3.0; +http://yandex.com/bots, which is in the spider/bot list)? You can check this at https://httpstatus.io/ ... Does phpBB 3 have a second bot list, one without Yandex that blocks Google and Bing? :(
Wuppi
Registered User
Posts: 19
Joined: Mon Jun 17, 2019 9:22 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Wuppi »

Something is wrong ...

ACP -> Spiders/Bots: I removed Googlebot (user agent: Googlebot) ... httpstatus.io says: 403 for Google
I added Googlebot with the user agent Googlebot/ <- httpstatus.io says: 200 for Google!
OK, to check, I removed the / from the user agent: 403 again!

Hmm, so I must set the / in the user agent in ACP -> Spiders/Bots.

Bing:
Bing IS set as bingbot/ <- WITH the / => 403
I removed Bingbot => 403
I added Bingbot (UA: bingbot/) => 403
I removed the / => 403

What's wrong?
AND: my board is open for guests (read) ... If a bot is not recognized as such, is it treated as a guest user? Then why doesn't it see the forum as a guest would? Or does the board have to treat bots differently from a TECHNICAL point of view? That is why recognition is IMPORTANT. Google and Bing are not recognized cleanly and do not get access as guests... How do I turn Bing, Google, etc. into PROPER bots again?

EDIT: I've found the problem ... The STK version for 3.2.7 is buggy (the tool that replaces the bots with the new ones from 3.2.7). There is a mismatch between the DB tables "bots" and "users" (different user_ids) ... A complete analysis of this problem, in GERMAN (sorry), is here: https://www.phpbb.de/community/viewtopi ... 5&t=243251 ...
Lumpy Burgertushie
Registered User
Posts: 69224
Joined: Mon May 02, 2005 3:11 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Lumpy Burgertushie »

as far as I know there is no official STK for phpBB 3.2.x

however, it is good to know there is a problem with it.

have you informed the creators of the STK for 3.2 ?


robert
3Di
I've Been Banned!
Posts: 17538
Joined: Mon Apr 04, 2005 11:09 pm
Location: I'm with Ukraine 🇺🇦
Name: Marco

Re: Error HTTP 403 - Google fails to index forum pages

Post by 3Di »

Two years ago I wrote a tool which restores the default bots for 3.2.x.
The tool takes care of everything the right way, like user IDs etc., since it uses the native core code's logic.

That's a stand-alone script, you will need to login as a founder or administrator:
phpBB 3.2.x - Restores default BOTs and deletes any BOT you added to it. (Make a Backup first)
https://gist.github.com/3D-I/9380d7fbc2 ... 16ef8513e8

Instructions are in the file's header.
🆓 Free support for our extensions also provided here: phpBB Studio
🚀 Looking for a specific feature or alternative option? We will rock you!
Please PM me only to request paid works. Thx. Buy me a coffee -> Image
My development's activity º PhpStorm's proud user º Extensions, Scripts, MOD porting, Update/Upgrades
Brf
Support Team Member
Posts: 53401
Joined: Tue May 10, 2005 7:47 pm
Location: {postrow.POSTER_FROM}

Re: Error HTTP 403 - Google fails to index forum pages

Post by Brf »

Simulating Googlebot, I am getting a regular HTTP-200 on your index and some of the forums. I am getting the expected http-403 on forums that Googlebot has no permissions on.
"Du hast keine Berechtigung, dieses Forum zu lesen." (English: "You do not have permission to read this forum.")
Wuppi
Registered User
Posts: 19
Joined: Mon Jun 17, 2019 9:22 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Wuppi »

Brf wrote: Fri Jul 05, 2019 3:59 pm Simulating Googlebot, I am getting a regular HTTP-200 on your index and some of the forums. I am getting the expected http-403 on forums that Googlebot has no permissions on.
"Du hast keine Berechtigung, dieses Forum zu lesen." (English: "You do not have permission to read this forum.")
Yes ...
/ => every bot gets a 200. Only the subforums returned 403. BUT I've fixed this now (about 3-4 hours ago?) => 200 for every forum :)

3Di wrote: Fri Jul 05, 2019 1:51 pm The tool takes care of everything the right way, like user IDs etc since uses native core code's logic.

That's a stand-alone script, you will need to login as a founder or administrator:
phpBB 3.2.x - Restores default BOTs and deletes any BOT you added to it. (Make a Backup first)
I found your script, but I had no access to my webspace, so I read through it, understood the logic, and fixed things manually in the DB. There were duplicates in the "bots" table, with user_ids that did not exist in "users"; this causes a 403. If you use your own template, even a 500 :=)
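
The mismatch described above can be looked for with a LEFT JOIN. This is only a minimal sketch, assuming the default `phpbb_` table prefix and the `bot_id`, `bot_name` and `user_id` columns of the bots table; it runs against an in-memory SQLite database with invented sample rows, and `find_orphaned_bots` and `demo_connection` are hypothetical helper names. On a real board the same SELECT would be run against the board's own database, after a backup.

```python
import sqlite3

# Bots whose user_id has no matching row in the users table.
ORPHANED_BOTS_SQL = """
SELECT b.bot_id, b.bot_name, b.user_id
FROM phpbb_bots AS b
LEFT JOIN phpbb_users AS u ON u.user_id = b.user_id
WHERE u.user_id IS NULL
ORDER BY b.bot_id
"""


def find_orphaned_bots(conn):
    """Return (bot_id, bot_name, user_id) for every bot without a user row."""
    return conn.execute(ORPHANED_BOTS_SQL).fetchall()


def demo_connection():
    """Build a toy database; all rows here are invented for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE phpbb_users (user_id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE phpbb_bots (bot_id INTEGER PRIMARY KEY, bot_name TEXT, user_id INTEGER);
        INSERT INTO phpbb_users VALUES (10, 'Google [Bot]'), (11, 'Yandex [Bot]');
        INSERT INTO phpbb_bots VALUES (1, 'Google [Bot]', 10);
        INSERT INTO phpbb_bots VALUES (2, 'Google [Bot]', 99);  -- duplicate, no such user
        INSERT INTO phpbb_bots VALUES (3, 'Yandex [Bot]', 11);
    """)
    return conn


print(find_orphaned_bots(demo_connection()))
```

With the sample rows, only the duplicate Google entry pointing at the missing user_id 99 is reported.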

Improvement suggestion for your bot list: there is a bot that has "Googlebot" in its user agent, so it is recognized as Googlebot. If you replace Googlebot with Googlebot/, it doesn't match anymore :)

UserAgent: SentiBot www.sentibot.eu (compatible with Googlebot) <- nice trick :)
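
The effect of the trailing slash can be shown with a plain substring test. This is only a minimal sketch, assuming the board treats the stored agent string as a case-insensitive substring of the visitor's User-Agent (phpBB's actual matching code is not reproduced here); `matches_bot` is a hypothetical helper, and the Googlebot user-agent string is the commonly published one, not taken from this thread.

```python
def matches_bot(agent_pattern, user_agent):
    """Case-insensitive substring match of a bot-list entry against a UA."""
    return agent_pattern.lower() in user_agent.lower()


SENTIBOT_UA = "SentiBot www.sentibot.eu (compatible with Googlebot)"
# Assumed UA of the real crawler.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

for pattern in ("Googlebot", "Googlebot/"):
    # Columns: pattern, matches SentiBot?, matches the real Googlebot?
    print(pattern, matches_bot(pattern, SENTIBOT_UA), matches_bot(pattern, GOOGLEBOT_UA))
```

Under that assumption the entry "Googlebot" matches both user agents, while "Googlebot/" matches only the real crawler.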

Lumpy Burgertushie wrote: Fri Jul 05, 2019 1:30 pm as far as I know the there is no official STK for phpbb 3.2.x

however, it is good to know there is a problem with it.

have you informed the creators of the STK for 3.2 ?
Correct, it is not the official one. I found it here in the forum with a link to GitHub. I haven't informed the creator yet. My English is not good enough right now to describe the error cleanly (and I'm still waiting for my test space to recreate the bug). By the way, the script also caused a permissions problem: as an admin I was no longer allowed to delete things (fixed). I now have access to the old forum and am adjusting the permissions 1:1 :( Next bug: after "Resync avatars" my avatar was gone :). I should have deleted ONLY the add-on columns from phpBB 3.0 :=)
3Di
I've Been Banned!
Posts: 17538
Joined: Mon Apr 04, 2005 11:09 pm
Location: I'm with Ukraine 🇺🇦
Name: Marco

Re: Error HTTP 403 - Google fails to index forum pages

Post by 3Di »

Wuppi wrote: Fri Jul 05, 2019 5:50 pm Improvement suggestion for your bot list: There is a bot that has "googlebot" in the UserAgent. It will then be recognized as a Googlebot. If you replace Googlebot with Googlebot/, it doesn't match anymore
The purpose of the script is pretty clear, no need for improvements.

Again: Restores default BOTs and deletes any BOT you added to it.
Forex Station
Registered User
Posts: 179
Joined: Thu Apr 06, 2017 2:26 pm
Location: Australia

Re: Error HTTP 403 - Google fails to index forum pages

Post by Forex Station »

3Di wrote: Fri Jul 05, 2019 1:51 pm 2 years ago I wrote a tool which restores default bots for 3.2.x.
The tool takes care of everything the right way, like user IDs etc since uses native core code's logic.
This is really useful mate. Thanks for posting this up.
Highly-customized PhpBB board voted as one of the most influential trading sites in the world: forex-station.com 💬
Wuppi
Registered User
Posts: 19
Joined: Mon Jun 17, 2019 9:22 am

Re: Error HTTP 403 - Google fails to index forum pages

Post by Wuppi »

3Di wrote: Fri Jul 05, 2019 5:58 pm
Wuppi wrote: Fri Jul 05, 2019 5:50 pm Improvement suggestion for your bot list: There is a bot that has "googlebot" in the UserAgent. It will then be recognized as a Googlebot. If you replace Googlebot with Googlebot/, it doesn't match anymore
The purpose of the script is pretty clear, no need of improvements.

Again: Restores default BOTs and deletes any BOT you added to it.
Misunderstanding? MY bot list is corrected. No more 403s for bots. And I have added some new ones (e.g. Yandex, the No. 1 search engine in Russia).

I meant that YOUR script contains an incorrect entry:

Code:

'Google [Bot]'				=> array('Googlebot', ''),

Googlebot/ would be better!

I had a visit from SentiBot with the following UA:

Code:

SentiBot www.sentibot.eu (compatible with Googlebot)

SentiBot would match your "Googlebot" entry. If you change it to Googlebot/, no match is possible... (Again: I don't get a match here, because I use Googlebot/.)

For Bing you set:

Code:

'Bing [Bot]'				=> array('bingbot/', ''),

"bingbot" would otherwise be enough :) Why did you put the / here but not on Google? :)
3Di
I've Been Banned!
Posts: 17538
Joined: Mon Apr 04, 2005 11:09 pm
Location: I'm with Ukraine 🇺🇦
Name: Marco

Re: Error HTTP 403 - Google fails to index forum pages

Post by 3Di »

As I said I am using native core code, therefore my script is right.
https://github.com/phpbb/phpbb/blob/9e9 ... p#L59-L105

The purpose of my script is to delete all the added bots and rebuild the array as from the first installation.
At which point you can add anything you want via ACP, you don't need to use scripts anymore.

https://www.phpbb.com/support/docs/en/3 ... m_spiders/


koraldon
Registered User
Posts: 530
Joined: Sat Jun 30, 2007 12:42 pm

Re: Error HTTP 403 - Google fails to index forum pages

Post by koraldon »

3Di wrote: Fri Jul 05, 2019 1:51 pm 2 years ago I wrote a tool which restores default bots for 3.2.x.
The tool takes care of everything the right way, like user IDs etc since uses native core code's logic.

That's a stand-alone script, you will need to login as a founder or administrator:
phpBB 3.2.x - Restores default BOTs and deletes any BOT you added to it. (Make a Backup first)
https://gist.github.com/3D-I/9380d7fbc2 ... 16ef8513e8

Instructions are in the file's header.
Thank you - Used on an older forum I maintain for a friend, and worked like a charm!
I must wonder how the *** this was allowed to happen in the first place. My friend's forum was not indexed by Google at all for the last few years due to this. He is not a techie, so he didn't catch it.
thecoalman
Community Team Member
Posts: 5876
Joined: Wed Dec 22, 2004 3:52 am
Location: Pennsylvania, U.S.A.

Re: Error HTTP 403 - Google fails to index forum pages

Post by thecoalman »

Removing an individual bot from the bots list, or adding one to it, would not stop a site from being indexed. If you remove the bot, it reverts to a guest.

When a bot is included in the bots list, the pages it sees are slightly different, e.g. there are no links to member profiles. Of course you can control access as well, but that affects all bots.
“Results! Why, man, I have gotten a lot of results! I have found several thousand things that won’t work.”

Attributed - Thomas Edison