
Soft 404's Webmaster Tools

Posted: Fri Apr 13, 2012 9:22 pm
by sk8rgui
I am trying to track down what is causing soft 404 errors to appear in my Google Webmaster Tools. Here is an example of one of the URLs throwing this error: http://www.dadsdivorce.com/father_divor ... p?p=231973

Any ideas on how to fix this? Also, I have been searching and searching for a list of url query string parameters. What does the ?p= relate to? Does the p= denote a link to a particular post in the thread? Is there a list of what each parameter type is for?

Thanks for the help.

Re: Soft 404's Webmaster Tools

Posted: Fri Apr 13, 2012 10:53 pm
by stevemaury
Works fine for me.

Re: Soft 404's Webmaster Tools

Posted: Fri Apr 13, 2012 11:25 pm
by sk8rgui
stevemaury wrote:Works fine for me.
Yes, the page loads fine, but why is it showing a soft 404 in Webmaster Tools?

Also, do you have an answer to my other question? What does the p= in the URL stand for? Is there a list of what the different variables mean? I know an f= in the URL means forum = id#, but I'm not sure about a lot of the others.

Re: Soft 404's Webmaster Tools

Posted: Fri Apr 13, 2012 11:44 pm
by stevemaury
"p" is the post_id. For your other issue, you need to ask wherever support for Webmaster Tools is provided.

Re: Soft 404's Webmaster Tools

Posted: Sat Apr 14, 2012 12:23 am
by CaNNon_
Try the URL with the "Fetch as Googlebot" tool; it might give you a clue.
You'll also find soft 404s mentioned in the help on the tools page. There's not much info, but it's worth a read. ;)

Re: Soft 404's Webmaster Tools

Posted: Sat Apr 14, 2012 12:53 am
by Pony99CA
sk8rgui wrote:Also, I have been searching and searching for a list of url query string parameters. What does the ?p= relate to? Does the p= denote a link to a particular post in the thread? Is there a list of what each parameter type is for?
To elaborate on what Steve said, the number in p=[number] is the post_id (from the phpbb_posts table). It's often followed by #p[number] (at the end of the URL) to force the browser to jump to that post on the page.

That number is independent of the topic (not "thread"). In other words, if you have a post_id of X in multiple topics, they'll be the exact same post. (I'm not sure if that can happen, but with copying or moving and leaving a shadow, it might.)
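
To make the parameter meanings concrete, here is a minimal sketch that splits a phpBB-style URL into the parameters mentioned in this thread (f, t, p, sid) plus the #p anchor. The function name describe_phpbb_url, the example.com host and the sample IDs are illustrative assumptions, not part of phpBB itself.

```php
<?php
// Hypothetical helper (not a phpBB function): split a phpBB-style URL
// into the query parameters discussed in this thread plus its fragment.
function describe_phpbb_url(string $url): array
{
    $query = parse_url($url, PHP_URL_QUERY);
    $fragment = parse_url($url, PHP_URL_FRAGMENT);

    // parse_url() returns null (or false on a malformed URL) when the part is missing.
    $params = [];
    if (is_string($query)) {
        parse_str($query, $params);
    }

    return [
        'forum_id' => isset($params['f']) ? (int) $params['f'] : null, // f = forum id
        'topic_id' => isset($params['t']) ? (int) $params['t'] : null, // t = topic id
        'post_id'  => isset($params['p']) ? (int) $params['p'] : null, // p = post_id
        'has_sid'  => isset($params['sid']),                           // sid = session id
        'anchor'   => is_string($fragment) ? $fragment : null,         // e.g. "p231973"
    ];
}

// example.com and the IDs below are placeholders for illustration only.
$info = describe_phpbb_url('http://example.com/viewtopic.php?f=26&p=231973#p231973');
```

The query string and the fragment are handled separately because the #p[number] part is only used by the browser to scroll to the post; it is not sent to the server.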

Steve

Re: Soft 404's Webmaster Tools

Posted: Sat Apr 14, 2012 2:44 am
by sk8rgui
Pony99CA wrote:
sk8rgui wrote:Also, I have been searching and searching for a list of url query string parameters. What does the ?p= relate to? Does the p= denote a link to a particular post in the thread? Is there a list of what each parameter type is for?
To elaborate on what Steve said, the number in p=[number] is the post_id (from the phpbb_posts table). It's often followed by #p[number] (at the end of the URL) to force the browser to jump to that post on the page.

That number is independent of the topic (not "thread"). In other words, if you have a post_id of X in multiple topics, they'll be the exact same post. (I'm not sure if that can happen, but with copying or moving and leaving a shadow, it might.)

Steve
So, would it be a bad idea to noindex, nofollow pages that contain a url with "p=" in it? It seems like it could raise a duplicate content issue with search engines, but I'm not 100% sure.

Re: Soft 404's Webmaster Tools

Posted: Sat Apr 14, 2012 3:12 am
by Oyabun1
Since the post_id for each post is unique, if you have 100 links to 100 different topics, they will all be unique, with no duplication.

All the major search engines seem to have no problem correctly indexing phpBB.

I would say you would be better to spend your time on creating unique content for your site rather than worrying about tweaks that are of questionable SEO value and far less benefit for members.

Re: Soft 404's Webmaster Tools

Posted: Mon Jul 02, 2012 11:22 am
by sherya6
I too have got the same message from GWT. So could somebody please tell me which parameters can be set to be ignored by Google? I know of two: p and sid.

Re: Soft 404's Webmaster Tools

Posted: Mon Jul 02, 2012 12:25 pm
by Oyabun1
As previously stated, to find out which parameters to set in a third-party tool, you need to ask on a support site for that tool.

Re: Soft 404's Webmaster Tools

Posted: Mon Jul 02, 2012 12:29 pm
by AmigoJack
sk8rgui's link gives an HTTP status 200 when visited with a Googlebot/ user agent, along with a message saying that authorization is missing. In other words, the board's Bots group has no view permissions, and this is what Google might classify as a "soft 404".

Since I see this as a bug (an HTTP status 403 should be issued), I created ticket 10961.
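
As a rough illustration of the situation described above, the sketch below flags a response as a likely soft 404 when the status is 200 but the body contains an error message. The function name looks_like_soft_404 is hypothetical, and the 'not authorised' marker is an assumption about the wording of the board's permission message; this is a guess at the idea, not how Google actually classifies pages.

```php
<?php
// Hypothetical check (not Google's actual logic): a "soft 404" is a page that
// answers 200 OK while its body really says "not found" or "not allowed".
function looks_like_soft_404(int $status, string $body, array $errorMarkers): bool
{
    if ($status !== 200) {
        return false; // a real 403 or 404 status is an honest error, not a soft one
    }

    foreach ($errorMarkers as $marker) {
        // Skip empty markers; match case-insensitively anywhere in the body.
        if ($marker !== '' && stripos($body, $marker) !== false) {
            return true;
        }
    }

    return false;
}

// The first marker appears in this thread; the second is an assumed fragment
// of the permission message and may differ on a real board.
$markers = ['The requested topic does not exist.', 'not authorised'];
```

To reproduce the check by hand, fetch the page with a Googlebot/ user agent using any HTTP tool, then pass the status code and body to the function.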

Re: Soft 404's Webmaster Tools

Posted: Mon Jul 02, 2012 2:18 pm
by CaNNon_
Nice catch AmigoJack

Re: Soft 404's Webmaster Tools

Posted: Thu Aug 01, 2013 12:40 am
by SidV
AmigoJack wrote:sk8rgui's link gives an HTTP status 200 when visited with a Googlebot/ user agent, along with a message saying that authorization is missing. In other words, the board's Bots group has no view permissions, and this is what Google might classify as a "soft 404".

Since I see this as a bug (an HTTP status 403 should be issued), I created ticket 10961.
Hello AmigoJack, I don't think this is the case.
Google doesn't report a 404 because the bot has no view permissions here.
Google reports it because it sees that status code in the server response.

Take this example from this forum:
If you (or someone else) go to:
viewtopic.php?f=26&t=2192160
you will see a real page (not a 404) that says:
The requested topic does not exist.
But if Googlebot/ goes to the same URL, it will see a 404 as the server response.
Why? Well, I don't know. If you don't believe me, you can see it for yourself (use online tools to inspect the server's response). I've attached my screenshot:
[Attachment: CS1.jpg, "How GoogleBot see a deleted topic"]
Now, about the code.
I found an interesting thread on another forum saying that the problem is in the viewforum.php file:

Code: Select all

// Make sure $start is set to the last page if it exceeds the amount
if ($start < 0 || $start > $topics_count)
{
   $start = ($start < 0) ? 0 : floor(($topics_count - 1) / $config['topics_per_page']) * $config['topics_per_page'];
}
And the proposed solution is to replace it with:

Code: Select all

// Send a 404 when $start is out of range instead of jumping to the last page
if ($start < 0 || $start > $topics_count)
{
    header("HTTP/1.0 404 Not Found");
    trigger_error('The page does not exist');
}
What do you think?
Can you test it?

Source (not spam):

Code: Select all

http://www.phpbb-seo.com/en/phpbb-seo-mods/article8723.html#p41361
Regards,
Sid

Re: Soft 404's Webmaster Tools

Posted: Thu Aug 01, 2013 2:29 am
by Oyabun1
SidV wrote:Hello AmigoJack, I don't think this is the case.
Google doesn't report a 404 because the bot has no view permissions here.
Google reports it because it sees that status code in the server response.

Take this example from this forum:
If you (or someone else) go to:
viewtopic.php?f=26&t=2192160
Your example is a different situation. If you or a bot goes to a page that doesn't exist, the board correctly returns a 404 Not Found status. The user agent could connect to the server, but the server could not find the requested item, so a 404 status is returned, which is as it should be.

In the OP's case the topic existed, but the Bots group did not have permission to see it and phpBB is, incorrectly, returning a 200 OK status, rather than a 403 Forbidden status.
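
The distinction between the two cases can be sketched as a small decision function. This is only an illustration of the status codes discussed here, under the assumption that the board already knows whether the topic exists and whether the viewer may read it; status_for_topic is a hypothetical name, not phpBB code.

```php
<?php
// Hypothetical decision helper (not part of phpBB): pick the HTTP status
// that matches what the board actually knows about the request.
function status_for_topic(bool $topicExists, bool $viewerMayRead): int
{
    if (!$topicExists) {
        return 404; // Not Found: the requested item does not exist
    }

    if (!$viewerMayRead) {
        return 403; // Forbidden: the item exists, but this viewer (e.g. the Bots group) may not see it
    }

    return 200; // OK: the item exists and may be shown
}

// Possible usage before any output is sent (PHP 5.4 or later):
// http_response_code(status_for_topic(true, false)); // would send 403
```

With a mapping like this, the OP's case (topic exists, Bots group not permitted) would yield 403 rather than 200, so a crawler would not have to guess from the page body.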

Re: Soft 404's Webmaster Tools

Posted: Thu Aug 01, 2013 4:04 am
by SidV
Thanks, Oyabun1, for your quick reply.
Now I understand the 404 response case better.

But what about the proposed solution?
If phpBB calls trigger_error('The page does not exist');
will Google fetch the page anyway?

What do you think?