GoogleOther misbehaving? Hundreds of board guest sessions active

P_I
Community Team Member
Posts: 2387
Joined: Tue Mar 01, 2011 8:35 pm
Location: Western Canada 🇨🇦

Post by P_I »

This afternoon I have noticed a sharp increase in guests on a couple of boards that I admin.

Digging deeper, I'm seeing many requests from IP addresses in 66.249.xxx.xxx with the user-agent

Code: Select all

Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.6261.94 Mobile Safari/537.36 (compatible; GoogleOther) 
Checking Google Crawler (User Agent) Overview (Google Search Central documentation):
Google wrote:GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.
The page also says
Google wrote:Google's common crawlers are used for building Google's search indices, perform other product specific crawls, and for analysis. They always obey robots.txt rules and generally crawl from the IP ranges published in the googlebot.json object.
The IP addresses I am seeing are definitely in the googlebot.json object, so I'm a bit confused. Perhaps someone at Google has programmed something wrong and triggered this storm instead of the expected one-off crawls for internal research and development.
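For anyone who wants to repeat that check, here is a minimal Python sketch. It assumes googlebot.json is a JSON object with a "prefixes" list whose entries carry either an "ipv4Prefix" or an "ipv6Prefix" CIDR string; the sample data and the function names are only illustrative, so load the real published file before trusting the result.

```python
import ipaddress

# Illustrative sample only. It mirrors the layout assumed above; the real
# googlebot.json lists many more ranges and changes over time.
SAMPLE_GOOGLEBOT_JSON = {
    "prefixes": [
        {"ipv6Prefix": "2001:4860:4801:10::/64"},
        {"ipv4Prefix": "66.249.64.0/27"},
    ]
}


def load_networks(data):
    """Turn the parsed JSON object into a list of ip_network objects."""
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_published_google_ip(ip, networks):
    """Return True if ip falls inside any of the given ranges."""
    address = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to scan.
    return any(address in network for network in networks)


networks = load_networks(SAMPLE_GOOGLEBOT_JSON)
```

Calling `is_published_google_ip(ip, networks)` for each address taken from the guest session list shows which ones fall inside the loaded ranges.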

Added: More information about this new bot -- Google launches a new crawler named GoogleOther

Are others seeing this also on their boards?
Normal people… believe that if it ain’t broke, don’t fix it. Engineers believe that if it ain’t broke, it doesn’t have enough features yet. – Scott Adams
P_I
Community Team Member
Posts: 2387
Joined: Tue Mar 01, 2011 8:35 pm
Location: Western Canada 🇨🇦

Post by P_I »

In case other board admins are seeing this, I have confirmed via Google Search Console that it is definitely Google and that it started about a week ago on my board.
Screenshot 2024-04-17 074210.png
My server's AWStats data shows that April traffic followed a consistent pattern until April 10th, when the Pages and Hits stats took a huge jump. Note that the Number of visits doesn't jump, so it seems one visitor (or a few visitors) is responsible for the whole increase.
Screenshot 2024-04-17 064827.png
As a workaround, in ACP -> Spiders/Robots (Manage bots) I have added a bot named Google [R&D bot] that matches the GoogleOther user-agent, so that this new traffic is session-managed as a bot.

Via Google Search Console's 'Submit feedback' I have asked why there was a sudden massive jump in crawl requests.
alexitesm
Registered User
Posts: 1
Joined: Thu Apr 18, 2024 8:56 pm

Post by alexitesm »

Hi. I have the same problem.
Any response from Google regarding this?
P_I
Community Team Member
Posts: 2387
Joined: Tue Mar 01, 2011 8:35 pm
Location: Western Canada 🇨🇦

Post by P_I »

No response from Google as of yet. I have been looking through Reduce the Googlebot crawl rate and did file a special request to reduce the crawl rate with the hope that someone would notice and take action. My expectations are very low for any quick action or reply from them.

This is very irritating because on my main board Google's crawl stats show that on April 15th (the last stats currently available) there were over 1M crawl requests made by GoogleOther. The board has about 700K posts and is already well indexed by Googlebot. I am totally mystified by this new behavior from this bot.

Digging deeper and re-reading Reduce Googlebot Crawl Rate (Google Search Central documentation):
Google wrote:If you need to urgently reduce the crawl rate for short period of time (for example, a couple of hours, or 1-2 days), then return 500, 503, or 429 HTTP response status code instead of 200 to the crawl requests. Googlebot reduces your site's crawling rate when it encounters a significant number of URLs with 500, 503, or 429 HTTP response status codes
Looking up these status codes I discovered 429 Too Many Requests - HTTP | MDN
MDN wrote:The HTTP 429 Too Many Requests response status code indicates the user has sent too many requests in a given amount of time ("rate limiting").

A Retry-After header might be included to this response indicating how long to wait before making a new request.
and RFC 6585 - Additional HTTP Status Codes which gives this example code

Code: Select all

HTTP/1.1 429 Too Many Requests
Content-Type: text/html
Retry-After: 3600

<html>
   <head>
      <title>Too Many Requests</title>
   </head>
   <body>
      <h1>Too Many Requests</h1>
      <p>I only allow 50 requests per hour to this Web site per
         logged in user.  Try again soon.</p>
   </body>
</html>
That looks promising as a direction to investigate. It would involve using the server's .htaccess file to check the HTTP_USER_AGENT variable against the problematic GoogleOther user-agent string and, if they match, replying with the 429 status code.
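To make the match-then-429 logic concrete, here is a minimal sketch written as Python WSGI middleware. It is only an illustration of the idea: phpBB itself is PHP and the plan above is an .htaccess rule, and the names `is_googleother`, `throttle_googleother`, and `RETRY_AFTER_SECONDS` are made up for this example.

```python
RETRY_AFTER_SECONDS = 3600  # same value as the RFC 6585 example; an assumption


def is_googleother(user_agent):
    """Case-insensitive check for the GoogleOther token in a User-Agent."""
    return "googleother" in (user_agent or "").lower()


def throttle_googleother(app):
    """Wrap a WSGI app so GoogleOther requests get a 429 instead of a page."""
    def middleware(environ, start_response):
        if is_googleother(environ.get("HTTP_USER_AGENT")):
            body = b"Too Many Requests\n"
            start_response("429 Too Many Requests", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", str(RETRY_AFTER_SECONDS)),
            ])
            return [body]
        # Every other client is passed through untouched.
        return app(environ, start_response)
    return middleware
```

Per the Google documentation quoted above, this kind of response is meant for short periods only, so a rule like this would be something to switch on during a storm and remove afterwards.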

I'm working on testing implementation details. Unfortunately my boards run on shared hosting so I don't have access to the Apache rate limiting settings and controls.
Talk19Zehn
Registered User
Posts: 863
Joined: Tue Aug 09, 2011 1:10 pm

Post by Talk19Zehn »

Thanks to P_I, and hello. Other users and I have certainly come across these earlier reports, going back to February 2023:

Barry Schwartz on April 20, 2023 at 8:20 am
https://searchengineland.com/google-lau ... her-395827

and

Barry Schwartz on February 6, 2023 at 2:00 pm
https://searchengineland.com/google-lau ... ike-392729

Opinions will differ depending on which approach one takes.
Still, in my personal assessment these articles were already informative for software users, phpBB admins included, both at the time and for the future.
The bot's procedures, analyses, etc. are still unclear (?) ...

Best regards
phpBB3 Designs - My own works: Stylearea Ongray-Designs, Adventinducement-Calendar for phpBB
mishobehar
Registered User
Posts: 2
Joined: Tue Mar 19, 2019 4:11 pm

Post by mishobehar »

Hey,
Did anyone manage to find a solution? I have a relatively small forum and it's being hit with hundreds of requests per second.

I sent feedback through Google Search Console and I am waiting for them to act.

Otherwise I will be forced to rate limit (the hard way) the GoogleOther bot.

Anyone use Apache? Maybe mod_evasive would help, but this will block the bot entirely, so it might hurt SEO. Any thoughts on this?
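As a rough illustration of rate limiting that throttles rather than blocks outright, here is a small sliding-window limiter in Python. It is a sketch of the general technique, not how mod_evasive works internally; the class name, the limits, and the idea of keying on the bot name are all assumptions for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # the caller would answer 429 here
        recent.append(now)
        return True
```

A front end could call something like `limiter.allow("GoogleOther")` per request and answer 429 whenever it returns False, which slows the crawler down without shutting it out completely.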
Mick
Support Team Member
Posts: 26639
Joined: Fri Aug 29, 2008 9:49 am

Post by Mick »

Have a look in Who is online and click the ‘Display guests’ link and see what you get. A screen cap of one of the pages you’re seeing may help.

Note: The majority of bots that have been causing problems lately are AI bots, which are a fairly new phenomenon, so getting rid of them shouldn’t affect SEO as you’ve not had them visiting before anyway.
  • "The more connected we get the more alone we become" - Kyle Broflovski©
  • "The good news is hell is just the product of a morbid human imagination.
    The bad news is, whatever humans can imagine, they can usually create." - Harmony Cobel
SQLnovice
Registered User
Posts: 122
Joined: Thu Oct 10, 2019 5:03 am

Post by SQLnovice »

It comes in spurts about every 10 days or so, but then so do the weird guest accounts swarming the control panel, trying to register accounts, from what I gather. Those are easier to find, though, since they're mostly on our banned list, so they show up as viewing the control panel. :twisted:

I'm not at all concerned with our Registered BOTS group, of which all Google BOTS are members.
GSC_Crawl_Requests.jpg

We used to be able to ask Google to reduce its crawl rate, but now you can only report a problem with overcrawling:
https://search.google.com/search-consol ... bot-report

You can also limit BOT access to your forum with the Show First Post Only to Guest extension, which limits how many characters within each topic the BOTS group and Guest accounts can view. This is especially useful in preventing AI BOTS from harvesting (stealing) your site's publicly available information and using it for their own purposes without giving your site credit.
Mick
Support Team Member
Posts: 26639
Joined: Fri Aug 29, 2008 9:49 am

Post by Mick »

For the AI bots why not just create a robots.txt file or add the edits to .htaccess as mentioned elsewhere on the subject and be done with it?
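Before uploading a robots.txt, it can be sanity-checked offline. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt contents, the `allowed` helper, and the forum.example URL are hypothetical, and Python's parser only approximates how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out GoogleOther, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: GoogleOther
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt, agent_token, url):
    """Return True if agent_token may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the short product token (e.g. "GoogleOther"), not the full
    # "Mozilla/5.0 ..." string: the parser only looks at the text before
    # the first "/" when matching User-agent lines.
    return parser.can_fetch(agent_token, url)


TOPIC_URL = "https://forum.example/viewtopic.php?t=1"
googleother_ok = allowed(ROBOTS_TXT, "GoogleOther", TOPIC_URL)
googlebot_ok = allowed(ROBOTS_TXT, "Googlebot", TOPIC_URL)
```

With these rules `googleother_ok` comes out False while `googlebot_ok` stays True, which is the split you would want if the aim is to drop the extra crawler without touching normal search indexing.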
SQLnovice
Registered User
Posts: 122
Joined: Thu Oct 10, 2019 5:03 am

Post by SQLnovice »

Me? I use the bad_bots .htaccess code and robots.txt too, but the items I mentioned previously are on top of those measures. I don't want BOTs I don't know about, or that have yet to be invented, harvesting our forum's data, which is where Show First Post Only to Guest comes in. It limits what all unregistered users and BOTs can see.

p.s. Google's hitting our forum again with crawl requests, averaging about 10x what they typically do over the past eight days. I can see some rather dramatic percentage upticks in OK (200) pages too. So, maybe they fired the 13 y.o. programmers who were still carrying around nursery blankets and finally replaced them with educated programmers who can tie their shoelaces.
