Google Search Console And Indexing Issues

Pfizz
Registered User
Posts: 86
Joined: Tue Aug 10, 2021 9:39 am

Post by Pfizz »

We logged into our Google Search Console to find we have many pages on our site which aren't indexed. I guess it is impossible to fully appease Google's search engine on their page indexing requirements, but in one category I believe that it is showing we have over 133,000 pages which aren't indexed. See screenshot from our Google Search Console below.
Screen.jpg
We are most concerned about that top category "Alternate page with proper canonical tag" with the large number of pages that are seemingly being ignored by Google. From something I read it could be something related to UTM tags, but I don't believe we have any UTM tags in any of our site's posts.

Meanwhile, I am wondering if there is something about the native URL structure of phpBB sites that organically causes these types of indexing errors with Google and if there is any way to fix it?

In the past we noticed a lot more of our pages were indexed by Google and now so many of them aren't. However, we did searches on Bing for recent posts on our site and we found that Bing is far better at indexing almost everything.

We also have an extension installed called Advertising Management, which we aren't using at the moment so I just disabled it in case it might be causing any Google indexing issues.

Also, our site is running through Cloudflare, but I doubt this would have anything to do with Google Indexing issues.

Thank you.
Last edited by Mick on Mon Feb 13, 2023 9:22 am, edited 1 time in total.
Reason: Solved.
danieltj
Infrastructure Team Member
Posts: 614
Joined: Thu May 03, 2018 9:32 pm
Location: United Kingdom
Name: Daniel James

Post by danieltj »

Can you share a link to your forum and also an example of a few pages that are being labelled with this issue about canonical tags? It's possible that Google doesn't necessarily like phpBB's URL structures, as you can get different URLs to the same page.

For example topic 123 might contain post 789, you can get to this topic and post by using either of these URLs:
  • /viewtopic.php?t=123
  • /viewtopic.php?p=789
Google might be seeing these as duplicates. Without a link to your board, though, it's hard to say for certain.
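To illustrate how several URLs can point at the same page, here is a minimal Python sketch that classifies a viewtopic.php URL by the parameter that selects its content. The function name `describe_viewtopic_url` and the `f=6` variant are assumptions made for illustration; this is not phpBB code, and it cannot tell which topic a post id belongs to without asking the board.

```python
from urllib.parse import urlsplit, parse_qs

def describe_viewtopic_url(url):
    """Classify a viewtopic.php URL by the parameter that selects the page."""
    query = parse_qs(urlsplit(url).query)
    if "t" in query:
        # A topic id is given directly; f (forum id) adds nothing new here.
        return ("topic", int(query["t"][0]))
    if "p" in query:
        # Only a post id is given; the board has to look up its topic.
        return ("post", int(query["p"][0]))
    return ("unknown", None)

# Example URLs; the f=6 variant is hypothetical.
for url in ("/viewtopic.php?t=123", "/viewtopic.php?f=6&t=123", "/viewtopic.php?p=789"):
    print(url, "->", describe_viewtopic_url(url))
```

The first two URLs resolve to the same topic id, while the third only names a post, which is why a crawler may treat them as separate addresses until it reads the page itself.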
Pfizz
Registered User
Posts: 86
Joined: Tue Aug 10, 2021 9:39 am

Post by Pfizz »

Thank you. Here are 10 of the URLs being flagged by the "Alternate page with proper canonical tag" error from Google Search Console:

/viewtopic.php?f=6&p=27994
/viewtopic.php?f=6&t=3424&view=previous
/viewtopic.php?t=7552&p=13110
/viewtopic.php?f=21&t=1659
/viewtopic.php?p=30907
/viewtopic.php?f=7&t=339
/viewtopic.php?f=26&t=16703&view=next
/viewtopic.php?p=36344
/viewtopic.php?p=3611
/viewtopic.php?f=18&t=21464
Following is also a list of all the extensions we have installed and the version numbers we are running for each of those extensions. Some may not be the latest versions as we haven't updated any extensions in a while. I wonder if perhaps any of these extensions could be contributing to the issue?
Add User 1.0.4
Auto Groups 2.0.2
Board Announcements 1.1.0
Change Post Time 1.0.1
CloudFlare IP 1.0.0
Contact Admin 1.3.7
Default Avatar Extended 1.2.2
Google Analytics 1.0.5
Group Template Variables 1.1.0
Hide Birthdays 1.0.1
Hide Newest User And Statistics Permissions 1.1.1-RC2
Joined date format 1.0.0
Last Post Avatar 1.0.3
LMDI Delete Re: 1.0.7
MailboxValidator Email Validator 1.0.0
phpBB3 SEO Sitemap 1.1.1
phpBB Media Embed PlugIn 1.1.1
PM Welcome 1.0.1
Post new topic 1.0.2
Previous / Next topic 1.0.3
Read other's topics Permission 1.1.0
Recent Topics 2.2.12
Round avatars 1.0.0
SEO Metadata 1.3.0
Show Guests in viewonline 0.2.0
Sortables Captcha 2.0.1
Who Is Online Extra Details by Informed Webmaster 0.9.3-BETA
Mick
Support Team Member
Posts: 26869
Joined: Fri Aug 29, 2008 9:49 am

Post by Mick »

I’d certainly be looking at updating the two SEO extensions along with Google Analytics and any others that are out of date.
Pfizz
Registered User
Posts: 86
Joined: Tue Aug 10, 2021 9:39 am

Post by Pfizz »

OK, thanks. I have updated the following 3 extensions so far. Unless there is anything else I can do for now then I will give it some time and see if anything changes on Google's end.

SEO Metadata
phpBB3 SEO Sitemap
Google Analytics

I think to resubmit those 133,000+ un-indexed pages would require submitting them one by one. Maybe I will just see if I can get Google to do a full recrawl of the site now.

UPDATE: Actually, I was able to resubmit all those pages to Google for review in the Google Search Console with just one click to validate them. So the re-validation has started and I am now waiting to see the outcome; hopefully it will be positive. It seems all of those pages were flagged back in August of last year for some reason.
Last edited by Pfizz on Sun Feb 12, 2023 5:50 pm, edited 2 times in total.
HB
Registered User
Posts: 230
Joined: Mon May 16, 2005 9:30 pm

Post by HB »

Pfizz wrote: Sun Feb 12, 2023 6:52 am We are most concerned about that top category "Alternate page with proper canonical tag" with the large number of pages that are seemingly being ignored by Google.
Google is indicating that it found the canonical tag and is using that as "the" URL. So Google isn't ignoring these URLs; it's indicating they are duplicates of other URLs.

The phpBB code generates A LOT of duplicate URLs. For example, Google will traverse every link on a page, including viewtopic.php?p=XXX for every post, but each one shows the same content as viewtopic.php?t=YYY. The base phpBB code tells search engines that all these URL variants are the same content via the canonical link tag in overall_header.html:

<!-- IF U_CANONICAL -->
	<link rel="canonical" href="{U_CANONICAL}">
<!-- ENDIF -->
TL;DR - you can ignore Google's "Alternate page with proper canonical tag" count -- it's not an error, it's working as designed.
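To check what canonical URL a page actually declares, something like the following Python sketch could be used. It relies only on the standard library; the class name `CanonicalFinder`, the helper `find_canonical`, and the sample HTML with its forum.example.com address are assumptions for illustration, not output from a real board.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")

def find_canonical(html_text):
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical

# Hypothetical page source, as it might be served for viewtopic.php?p=789.
sample = """<html><head>
<title>Example topic</title>
<link rel="canonical" href="https://forum.example.com/viewtopic.php?t=123">
</head><body></body></html>"""

print(find_canonical(sample))
```

If the page reached through a p= link reports the t= address as its canonical, that matches the behaviour described above.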
Dan Kehn
Pfizz
Registered User
Posts: 86
Joined: Tue Aug 10, 2021 9:39 am

Post by Pfizz »

Thank you for that information. That’s very helpful and puts my mind at ease.

I also just checked in Google Search Console how many pages Google says it has indexed on our site, and that number is basically the same as the total number of topics we have. In fact, Google’s number is just slightly higher than the number of topics we have. So maybe it’s all good. :)
HB
Registered User
Posts: 230
Joined: Mon May 16, 2005 9:30 pm

Post by HB »

Pfizz wrote: Sun Feb 12, 2023 5:46 pm In fact, Google’s number is just slightly higher than the number of topics we have.
It makes sense that Google has slightly more pages indexed than the number of topics because a topic may have more than one page (e.g., viewtopic.php?t=XXX&start=20) and that's considered a "new" canonical URL.
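As a rough illustration of why the indexed count can exceed the topic count, this Python sketch lists the page URLs a single topic would produce. The function name `topic_page_urls` is hypothetical, and the value of 10 posts per page is an assumption; the real figure depends on the board's settings.

```python
def topic_page_urls(topic_id, post_count, posts_per_page=10):
    """List one URL per page of a topic, using start= offsets."""
    urls = []
    # max(..., 1) keeps at least the first page for an empty count.
    for start in range(0, max(post_count, 1), posts_per_page):
        if start == 0:
            urls.append(f"/viewtopic.php?t={topic_id}")
        else:
            urls.append(f"/viewtopic.php?t={topic_id}&start={start}")
    return urls

# A 25-post topic under the assumed page size spans three pages.
for url in topic_page_urls(123, 25):
    print(url)
```

Each of those addresses shows different posts, so each can count as its own indexed page.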
Dan Kehn
Pfizz
Registered User
Posts: 86
Joined: Tue Aug 10, 2021 9:39 am

Post by Pfizz »

HB wrote: Sun Feb 12, 2023 6:23 pm
Pfizz wrote: Sun Feb 12, 2023 5:46 pm In fact, Google’s number is just slightly higher than the number of topics we have.
It makes sense that Google has slightly more pages indexed than the number of topics because a topic may have more than one page (e.g., viewtopic.php?t=XXX&start=20) and that's considered a "new" canonical URL.
Thank you. It seems Google has indexed only slightly more pages than the total number of topics we have.

We have about twice as many posts as topics, though, so the typical topic averages about two posts. Some topics have five or six posts; some have only one.

A shame they won’t index every post though because each post has different information and isn’t repetitive just because it’s within the same topic. But I guess Google doesn’t see it that way.
SQLnovice
Registered User
Posts: 137
Joined: Thu Oct 10, 2019 5:03 am

Post by SQLnovice »

Pfizz wrote: Sun Feb 12, 2023 4:35 pm UPDATE: Actually, I was able to resubmit all those pages for review again to Google in the Google Search Console with just one click to validate them. So the re-validation has started and I am now awaiting to see the outcome, hopefully it will be positive. It seems all of those pages were flagged back in August of last year for some reason.
Don't be surprised when it comes back with exactly the same message: failed. I've been doing those validation fixes and haven't had much success until recently.

Googlebot is on our small, slow-moving site 24/7, along with a mixed group of its mobile and image crawlers, which come and go hourly.

We use SEO Sitemap too, but quickly learned not to let it just build an automatic list of all threads in all forums. Why? Because Googlebot will find them without the sitemap. Instead we use sitemap.xml to tell Googlebot just the pages we feel are of the highest importance to Web searchers, so our total list is about one third of its original size. True to form, Googlebot still wanders off into other forums, but for the most part Who is Online shows that the sitemap is keeping it focused on the important forum content in the list.

So if you want to laser-focus your Googlebot budget, have the sitemap generate only the forums that will make your site desirable to Web searchers, and give priority to those that are more content-rich. In fact, we've gone as far as adding about 30 sub-forum Removals that we DON'T want Googlebot to scan for the next 6 months. We're that passionate about getting it to target the important stuff first and foremost.
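A trimmed sitemap of the kind described here could be produced by hand along the following lines. This is only a sketch in Python using the standard sitemap protocol elements (urlset, url, loc, priority); the function name `build_sitemap`, the forum.example.com URLs, and the priority values are assumptions, and it is not how the SEO Sitemap extension works internally.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build sitemap XML from (absolute_url, priority) pairs."""
    # Use the sitemap namespace as the default so tags are not prefixed.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, priority in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical hand-picked forums, with the richer one given a higher priority.
selected = [
    ("https://forum.example.com/viewforum.php?f=6", 0.9),
    ("https://forum.example.com/viewforum.php?f=21", 0.5),
]
print(build_sitemap(selected))
```

Listing only chosen forums keeps the file short; whether a crawler honours the priority hints is up to the search engine.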

On the duplicate canonical, add HB's code below <title> in your overall_header.html file. That bit of code is gold to Googlebot, because now your sitemap and your pages will be in sync with one another, both saying the same thing: we don't care what else you think this page is called, it's called this, and you, Googlebot, should index it as this! We've been using that for about four months and we no longer end up having to do any validations. They're getting about 90% of the canonicals correct now and systematically dumping the others. It's still not as fast as a fresh start, but it's getting the job done.

We don't use the site URL Inspection the same way we once did either, because Googlebot is doing a fine job as it is. Instead, we use it to reindex important Top linked Pages as we update them, spending our budget there too on making sure our 30 most sought-after threads contain the richest content we have available.
Clemnft
Registered User
Posts: 1
Joined: Wed Feb 15, 2023 2:40 pm

Post by Clemnft »

Search Console can be such a hot mess, thanks for the info here
phpuser
Registered User
Posts: 31
Joined: Sat Feb 16, 2008 7:04 pm

Post by phpuser »

Hello, this is a huge issue.

Can this be solved via core phpBB updates, without relying on external packages?

Thank you
thecoalman
Community Team Member
Posts: 6415
Joined: Wed Dec 22, 2004 3:52 am
Location: Pennsylvania, U.S.A.

Post by thecoalman »

SQLnovice wrote: Tue Feb 14, 2023 4:46 am We use SEO Sitemap too, but quickly learned not to let it just build an automatic list of all threads, all forums.
You can point Google at the news feeds to use as a sitemap. ;)
phpuser wrote: Fri Oct 04, 2024 2:28 pm Hello, this is a huge issue.
It's not necessarily a "huge" issue. The bot follows the link and finds the canonical tag indicating the source page; this avoids the duplicate content issue and tells the bot which URL to use in the index. The only issue is that the bot is wasting your resources and its own when it follows links to duplicate content instead of indexing new content or re-indexing old content.

You can fix these problems by editing the template and hiding the links from Google; many of them are already hidden. For example, in viewtopic_body.html around line 231, find:

			<a {% if postrow.S_FIRST_UNREAD %}class="first-unread" {% endif %}href="{{ postrow.U_MINI_POST }}">{{ postrow.POST_SUBJECT }}</a>
Replace with:

			<!-- IF not S_IS_BOT -->
			<a {% if postrow.S_FIRST_UNREAD %}class="first-unread" {% endif %}href="{{ postrow.U_MINI_POST }}">{{ postrow.POST_SUBJECT }}</a>
			<!-- ELSE -->
			{{ postrow.POST_SUBJECT }}
			<!-- ENDIF -->
phpuser
Registered User
Posts: 31
Joined: Sat Feb 16, 2008 7:04 pm

Post by phpuser »

Thank you very much!

Are there plans to make these changes in a future release? :)
DzieXik
Registered User
Posts: 10
Joined: Mon Aug 01, 2022 10:14 am

Post by DzieXik »

thecoalman wrote: Fri Oct 04, 2024 11:42 pm You can fix these problems by editing the template and hiding the links from Google; many of them are already hidden. For example, in viewtopic_body.html around line 231, find:
Are there any other templates which need this fix?
