No response from Google as of yet. I have been looking through Reduce the Googlebot crawl rate and did file a special request to reduce the crawl rate with the hope that someone would notice and take action. My expectations are very low for any quick action or reply from them.
This is very irritating because on my main board Google's crawl stats show that on April 15th (the last stats currently available) there were over 1M crawl requests made by GoogleOther. The board has about 700K posts and is already well indexed by Googlebot. I am totally mystified by this bot's new behavior.
Digging deeper and re-reading Reduce Googlebot Crawl Rate | Google Search Central | Documentation | Google for Developers:

Google wrote:
If you need to urgently reduce the crawl rate for short period of time (for example, a couple of hours, or 1-2 days), then return 500, 503, or 429 HTTP response status code instead of 200 to the crawl requests. Googlebot reduces your site's crawling rate when it encounters a significant number of URLs with 500, 503, or 429 HTTP response status codes.
Looking up these status codes I discovered 429 Too Many Requests - HTTP | MDN:

MDN wrote:
The HTTP 429 Too Many Requests response status code indicates the user has sent too many requests in a given amount of time ("rate limiting").
A Retry-After header might be included to this response indicating how long to wait before making a new request.

and RFC 6585 - Additional HTTP Status Codes, which gives this example code:
Code: Select all
HTTP/1.1 429 Too Many Requests
Content-Type: text/html
Retry-After: 3600

<html>
<head>
<title>Too Many Requests</title>
</head>
<body>
<h1>Too Many Requests</h1>
<p>I only allow 50 requests per hour to this Web site per
logged in user. Try again soon.</p>
</body>
</html>
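Relating that RFC example back to Apache: assuming the shared host permits ErrorDocument and mod_headers directives in .htaccess (which I haven't verified yet), the 429 could be given a short body and a Retry-After header along these lines:

Code: Select all
# Sketch only - values are placeholders, not tested on my host yet.

# Plain-text body Apache sends whenever this site returns a 429
ErrorDocument 429 "Too Many Requests - crawling of this site is being rate limited. Try again later."

<IfModule mod_headers.c>
    # Attach Retry-After (in seconds) when the request came from GoogleOther.
    # Note: this adds the header to every response served to GoogleOther,
    # not only the 429s - harmless, but crude.
    <If "%{HTTP_USER_AGENT} =~ /GoogleOther/">
        Header always set Retry-After "3600"
    </If>
</IfModule>
The "always" keyword matters because ordinary Header directives are dropped on non-2xx responses.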
That looks promising as a direction to investigate. It would involve using the server's .htaccess file to check the HTTP_USER_AGENT variable against the problematic GoogleOther user-agent string and, if they match, reply with the 429 status code.
I'm working on testing the implementation details. Unfortunately my boards run on shared hosting, so I don't have access to Apache's rate-limiting settings and controls.
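As a first rough sketch (assuming mod_rewrite is available in .htaccess and that the GoogleOther user-agent string literally contains "GoogleOther"), the check could look something like this:

Code: Select all
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Match the GoogleOther crawler in the User-Agent header, case-insensitively
    RewriteCond %{HTTP_USER_AGENT} GoogleOther [NC]
    # "-" means rewrite nothing; with a status code outside the 3xx range the
    # R flag ends the request immediately with that status (429 Too Many Requests)
    RewriteRule .* - [R=429,L]
</IfModule>
Regular visitors and Googlebot itself wouldn't match the condition and would be served normally; how this interacts with any rewrite rules already present in the board's .htaccess is one of the details I still need to test.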
Normal people… believe that if it ain’t broke, don’t fix it. Engineers believe that if it ain’t broke, it doesn’t have enough features yet. – Scott Adams