googleusercontent.com

foxiedog
Registered User
Posts: 11
Joined: Fri Jul 31, 2020 10:26 am

googleusercontent.com

Post by foxiedog »

I've been getting lots of visits from a guest whose address shows up as 17.254.73.34.bc.googleusercontent.com,
among many other similar IPs from googleusercontent.com, all busily downloading every image on the forum!

I've searched the web for information on googleusercontent.com, and opinion seems to be divided: either it is Google itself doing the scraping, or it is Google customers hosted by Google who are responsible.

Can anyone shed any light on whether it is actually Google itself or some other user?
And should I be taking any action to limit their visits to the forum?


Here's a list of some of the IPs that visited the forum over a timespan of an hour or so:
104.196.171.87
34.118.202.149
34.132.183.144
34.136.36.60
34.138.141.23
34.139.49.113
34.148.152.168
34.148.206.89
34.148.3.82
34.172.131.34
34.67.240.65
34.73.254.17
34.73.196.188
34.73.25.244
34.74.198.21
34.74.252.185
34.74.73.86
34.75.24.35
35.202.216.87
35.202.53.218
35.222.25.59
35.237.188.42

This one, I believe, is not actually from Google itself?
35.233.62.116 (python requests)
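
A side note on the hostname quoted at the top of this post: the four leading labels of 17.254.73.34.bc.googleusercontent.com look like the octets of an IPv4 address in reverse order, which would make it 34.73.254.17, one of the addresses in the list above. Here is a minimal Python sketch of that decoding. It assumes the hostname really follows this reversed-octet pattern, and the function name is made up for illustration.

```python
import ipaddress
from typing import Optional

# Suffix seen in the hostname quoted in the post.
SUFFIX = ".bc.googleusercontent.com"

def ip_from_reverse_hostname(hostname: str) -> Optional[str]:
    """Return the IPv4 address encoded in a reversed-octet hostname, or None."""
    host = hostname.strip().rstrip(".").lower()
    if not host.endswith(SUFFIX):
        return None
    labels = host[: -len(SUFFIX)].split(".")
    if len(labels) != 4:
        return None
    # Assumption: the labels are the IPv4 octets in reverse order.
    candidate = ".".join(reversed(labels))
    try:
        return str(ipaddress.IPv4Address(candidate))
    except ValueError:
        return None

print(ip_from_reverse_hostname("17.254.73.34.bc.googleusercontent.com"))
# prints 34.73.254.17
```

This only reads the address back out of the name. It says nothing about who is operating the machine behind it.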
[Dimetrodon]
Registered User
Posts: 442
Joined: Tue Aug 30, 2022 3:29 am
Location: Paleozoic Era

Re: googleusercontent.com

Post by [Dimetrodon] »

foxiedog wrote: Wed Sep 07, 2022 8:41 pm
can anyone shed any light on whether or not it is actually Google itself or some other user ?
It absolutely is not from Google itself, as Google would show up as the Google [Bot], Google Feedfetcher, or Google Adsense, and it would come up as a bot, not as a guest. You're being scraped by bad bots.

Edit: As a temporary measure, I would go to ACP > Users and Groups > Group Permissions > Guests > Post > Can download files and set that permission to No, to stop them from eating through your server's resources trying to download every attachment. I would probably consider doing the same for bots as well, in case the person behind this attempts to impersonate a legitimate bot.
HiFiKabin
Community Team Member
Posts: 6731
Joined: Wed May 14, 2014 9:10 am
Location: Swearing at the PC, UK
Name: James

Re: googleusercontent.com

Post by HiFiKabin »

Not totally failsafe, but I use the Bad Bot Blocking .htaccess code by Jeff Starr. It's available on his site, and I have a download with it added to the default phpBB .htaccess file to make things easier.

Source

phpBB htaccess with the bad bot code
foxiedog
Registered User
Posts: 11
Joined: Fri Jul 31, 2020 10:26 am

Re: googleusercontent.com

Post by foxiedog »

[Dimetrodon] wrote: Wed Sep 07, 2022 10:27 pm
foxiedog wrote: Wed Sep 07, 2022 8:41 pm
can anyone shed any light on whether or not it is actually Google itself or some other user ?
Edit: As a temporary measure, I would go to ACP > Users and Groups > Group Permissions > Guests > Post > Can download files and set that permission to No, to stop them from eating through your server's resources trying to download every attachment. I would probably consider doing the same for bots as well, in case the person behind this attempts to impersonate a legitimate bot.
Thanks for the reply.
I already had the guest permission for "Can download files" set to No. I've just changed it to Never, but they are still getting through, as shown here:
Guest IP: 34.121.215.91
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
Downloading file image 2.JPG


How they can still download the images with the permission set to Never has got me baffled.
foxiedog
Registered User
Posts: 11
Joined: Fri Jul 31, 2020 10:26 am

Re: googleusercontent.com

Post by foxiedog »

HiFiKabin wrote: Thu Sep 08, 2022 9:22 am
Not totally failsafe, but I use the Bad Bot Blocking .htaccess code by Jeff Starr. It's available on his site, and I have a download with it added to the default phpBB .htaccess file to make things easier.

Source

phpBB htaccess with the bad bot code
Thanks, I'll have a look at that. :)
[Dimetrodon]
Registered User
Posts: 442
Joined: Tue Aug 30, 2022 3:29 am
Location: Paleozoic Era

Re: googleusercontent.com

Post by [Dimetrodon] »

foxiedog wrote: Thu Sep 08, 2022 10:13 am
how they can still download the images if permissions are set to never has got me baffled ?
Then they cannot download the images. While they can try, and it would still come up as "downloading image," they're really looking at a "not authorized" page.

Still, HiFiKabin's .htaccess solution is an excellent idea of course.
foxiedog
Registered User
Posts: 11
Joined: Fri Jul 31, 2020 10:26 am

Re: googleusercontent.com

Post by foxiedog »

Thanks, I did wonder if that might be the case. :)
AmigoJack
Registered User
Posts: 6115
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: googleusercontent.com

Post by AmigoJack »

HiFiKabin wrote: Thu Sep 08, 2022 9:22 am
phpBB htaccess with the bad bot code
There are a few regular expressions where the dot is used as if it was meant literally, but it actually is interpreted (as "any character") instead of escaped (as "dot").
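
To illustrate the point with a made-up pattern (not taken from the actual .htaccess file): an unescaped dot matches any character, so a rule meant for one literal name can also match unrelated strings. Below is a short Python sketch. Apache uses PCRE rather than Python's re module, but the dot behaves the same way in both.

```python
import re

# Hypothetical bot name, used only for illustration.
unescaped = re.compile(r"bad.bot")   # "." matches any single character (except a newline)
escaped = re.compile(r"bad\.bot")    # "\." matches a literal dot only

for agent in ["bad.bot/1.0", "badXbot/1.0", "bad-bot/1.0"]:
    # Show whether each pattern finds a match in the user agent string.
    print(agent, bool(unescaped.search(agent)), bool(escaped.search(agent)))
```

The unescaped pattern matches all three strings, while the escaped one matches only the first.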
HiFiKabin
Community Team Member
Posts: 6731
Joined: Wed May 14, 2014 9:10 am
Location: Swearing at the PC, UK
Name: James

Re: googleusercontent.com

Post by HiFiKabin »

I will contact the code author for his comments
melvinarda
Registered User
Posts: 2
Joined: Mon Mar 27, 2023 4:31 am

Re: googleusercontent.com

Post by melvinarda »

I'm getting the same issue today. Bad Bot Blocking fixed it, thanks.
Last edited by melvinarda on Wed Mar 29, 2023 3:49 am, edited 1 time in total.
HiFiKabin
Community Team Member
Posts: 6731
Joined: Wed May 14, 2014 9:10 am
Location: Swearing at the PC, UK
Name: James

Re: googleusercontent.com

Post by HiFiKabin »

AmigoJack wrote: Sun Sep 11, 2022 6:48 pm
HiFiKabin wrote: Thu Sep 08, 2022 9:22 am
phpBB htaccess with the bad bot code
There are a few regular expressions where the dot is used as if it was meant literally, but it actually is interpreted (as "any character") instead of escaped (as "dot").
There is a new version currently in BETA which should address these issues.
JLA
Registered User
Posts: 617
Joined: Tue Nov 16, 2004 5:23 pm
Location: USA
Name: JLA FORUMS

Re: googleusercontent.com

Post by JLA »

You should block anything that makes a request with Python.
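
As a rough sketch of what such a check could look like, the snippet below flags the default User-Agent strings of two common Python HTTP clients (python-requests and Python-urllib). The function name and the choice of clients are my own assumptions. A script can send any User-Agent it likes, so this only catches the ones that did not bother to disguise themselves.

```python
import re

# Default User-Agent prefixes of the requests library and urllib.
# Assumption: matching on these two prefixes is enough for a first filter.
PYTHON_CLIENT = re.compile(r"^(python-requests|python-urllib)/", re.IGNORECASE)

def is_python_client(user_agent: str) -> bool:
    """Return True if the User-Agent looks like a default Python HTTP client."""
    return bool(PYTHON_CLIENT.search(user_agent.strip()))

print(is_python_client("python-requests/2.28.1"))
print(is_python_client("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0"))
```

Note that the guest quoted earlier in the thread presented a Firefox 72 User-Agent, so a filter like this would not have caught that one.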
JLA
Registered User
Posts: 617
Joined: Tue Nov 16, 2004 5:23 pm
Location: USA
Name: JLA FORUMS

Re: googleusercontent.com

Post by JLA »

Valid Googlebot IPs can always be found here:

https://developers.google.com/static/se ... lebot.json
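
Once that file is downloaded, checking a visitor against it could look roughly like the Python sketch below. It assumes the JSON has a top-level "prefixes" list whose entries carry either an "ipv4Prefix" or an "ipv6Prefix" key, so check the actual file before relying on it. The function name is made up, and the sample data uses documentation-only address ranges, not real Googlebot prefixes.

```python
import ipaddress

def ip_in_prefixes(ip: str, data: dict) -> bool:
    """Return True if ip falls inside any CIDR prefix listed in data["prefixes"]."""
    addr = ipaddress.ip_address(ip)
    for entry in data.get("prefixes", []):
        # Assumed key names; each entry is expected to hold one of them.
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr is None:
            continue
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False

# Placeholder data (RFC 5737 / RFC 3849 documentation ranges), NOT real Googlebot prefixes.
sample = {"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "2001:db8::/32"}]}

print(ip_in_prefixes("192.0.2.55", sample))    # inside the placeholder IPv4 range
print(ip_in_prefixes("34.73.254.17", sample))  # one of the IPs from the first post
```

In practice you would load the real file with json.load and pass the result in place of the sample dictionary.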
