How can I block "AI"?

Holger
Registered User
Posts: 1886
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

How can I block "AI"?

Post by Holger »

Hi there,
I am running several larger forums.
As I am very uncomfortable with ChatGPT and AI in general, I want to block them from using my content to "learn".
How can I do that? Any ideas?

/H
John Little
Registered User
Posts: 36
Joined: Tue Aug 23, 2022 11:07 am

Re: How can I block "AI"?

Post by John Little »

I think if you hide your sensitive forums and make them only available to members, that should exclude robots - including AI - who would be "guests". Several of my forums are members only and they aren't listed unless you are signed in.
Holger
Registered User
Posts: 1886
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: How can I block "AI"?

Post by Holger »

Well, that would block the forums' content from being listed by Google, which would prevent new users from finding the forums. That is not a good solution.
KevC
Support Team Member
Posts: 72559
Joined: Fri Jun 04, 2004 10:44 am
Location: Oxford, UK

Re: How can I block "AI"?

Post by KevC »

I doubt blocking 1 site is going to hinder them very much.
You'd need to know the IPs and/or user agent strings of the bots being used so you could block them either at the server level or with .htaccess.
"Step up to red alert. Sir, are you absolutely sure? It does mean changing the bulb"
AmigoJack
Registered User
Posts: 6120
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: How can I block "AI"?

Post by AmigoJack »

Holger wrote: Thu Jun 06, 2024 7:43 am that would block the forums content being listed by Google
So you think Google is not using what it crawled for AI purposes? It even has its own division, Google AI, founded in 2017. And now you're getting "uncomfortable"?
Holger
Registered User
Posts: 1886
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: How can I block "AI"?

Post by Holger »

I need Google for my forum to thrive. But I do not need OpenAI.
I know I am using AI already without knowing it, but if I can opt out, I will.
danieltj
Infrastructure Team Member
Posts: 515
Joined: Thu May 03, 2018 9:32 pm
Location: United Kingdom
Name: Daniel James

Re: How can I block "AI"?

Post by danieltj »

You can’t.

As long as your content is public - to allow Google to index it for search, as you say - you cannot prevent it from potentially being used by AI / ML programs.

It’s exactly the same thing as preventing right clicking to stop people saving images from your website. People can screenshot it, use a third party capture program or use a physical camera and take a photograph.

There’s always a way.
P_I
Community Team Member
Posts: 2437
Joined: Tue Mar 01, 2011 8:35 pm
Location: Western Canada 🇨🇦

Re: How can I block "AI"?

Post by P_I »

danieltj wrote: Thu Jun 06, 2024 9:10 am You can’t.
According to many of the articles found via https://www.google.com/search?q=block+ai+bots you can indeed block AI bots using robots.txt.
Normal people… believe that if it ain’t broke, don’t fix it. Engineers believe that if it ain’t broke, it doesn’t have enough features yet. – Scott Adams
EA117
Registered User
Posts: 2171
Joined: Wed Aug 15, 2018 3:23 am

Re: How can I block "AI"?

Post by EA117 »

P_I wrote: Thu Jun 06, 2024 12:08 pm According to many of the articles found via https://www.google.com/search?q=block+ai+bots you can indeed block AI bots using robots.txt.
Except that's not what those articles say. Or they are at least quick to clarify, "adherence to a robots.txt is entirely voluntary." You're not "blocking" anything.

It's the equivalent of putting your valuable data out on a public sidewalk with a robots.txt Post-It note on it which says "Please do not steal this data."

You'll get people who don't steal it, people who tell you they didn't steal it but they actually did, and people who never even looked at the note while they stole the data.
ssl
Registered User
Posts: 1979
Joined: Sat Feb 08, 2020 2:15 pm
Location: Le Lude, Pays de la Loire - France
Name: Fred Rimbert

Re: How can I block "AI"?

Post by ssl »

I blocked claudebot, anthropic-ai, ByteDance and Bytespider using the robots.txt file on my board and it was effective.
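
For anyone wanting to try the same thing, here is a minimal Python sketch of what such a robots.txt could look like, plus a quick check with the standard library's urllib.robotparser. The four user-agent tokens are the ones named in the post above; the helper name build_robots_txt and the sample URL are made up for illustration. Keep in mind that this only tells you how a compliant crawler would read the file, and compliance remains voluntary.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the post above. Whether each crawler still
# answers to its token is an assumption to verify against the crawler's docs.
BLOCKED_AGENTS = ["ClaudeBot", "anthropic-ai", "ByteDance", "Bytespider"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows everything for the listed agents."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in blocked_agents]
    # Everyone else (for example search engines) keeps full access.
    groups.append("User-agent: *\nDisallow:\n")
    return "\n".join(groups)


robots_txt = build_robots_txt(BLOCKED_AGENTS)

# Parse the generated file the way a well-behaved crawler would.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(robots_txt)
print(parser.can_fetch("Bytespider", "/viewtopic.php?t=1"))  # False
print(parser.can_fetch("Googlebot", "/viewtopic.php?t=1"))   # True
```

Save the printed text as robots.txt in the web root of the board. Crawlers that ignore the file are not affected by it at all.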
Holger
Registered User
Posts: 1886
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: How can I block "AI"?

Post by Holger »

Wouldn't it be better to block those agents in .htaccess? That is enforced by the server and not voluntary, right?
thecoalman
Community Team Member
Posts: 6286
Joined: Wed Dec 22, 2004 3:52 am
Location: Pennsylvania, U.S.A.

Re: How can I block "AI"?

Post by thecoalman »

Holger wrote: Fri Jun 07, 2024 6:16 am Better blocking those agents in the htaccess? That is controlled by the server and not voluntary?
Typically, a bot that identifies itself will behave according to robots.txt. That said, user agents are often spoofed, so blocking with an .htaccess rule or other means can stop those bots. If you're going to do that, I would still allow access to the robots.txt file.

The larger issue is bots scraping content with browser user agents. They are difficult to stop. Cloudflare has an option for this, but they have access to data from millions of sites to analyze and identify IPs for that type of traffic.
“Results! Why, man, I have gotten a lot of results! I have found several thousand things that won’t work.”

Attributed - Thomas Edison
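
The server-side idea in the post above (refuse requests by user agent, but keep robots.txt reachable) boils down to a small decision rule. The Python sketch below only illustrates that logic; it is not an .htaccess rule, and the function name should_block, the token list, and the sample user-agent strings are assumptions made up for the example.

```python
# Lower-case substrings to look for in the User-Agent header. The tokens come
# from the thread; treat the list as an example, not a complete inventory.
BLOCKED_TOKENS = ("claudebot", "anthropic-ai", "bytedance", "bytespider")


def should_block(user_agent: str, path: str) -> bool:
    """Return True when the request should be refused (e.g. with a 403)."""
    # Always serve robots.txt so that compliant bots can still read the rules.
    if path == "/robots.txt":
        return False
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


# Hypothetical user-agent strings, for illustration only.
print(should_block("Mozilla/5.0 (compatible; ClaudeBot/1.0)", "/viewtopic.php"))  # True
print(should_block("Mozilla/5.0 (compatible; ClaudeBot/1.0)", "/robots.txt"))     # False
print(should_block("Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0", "/index.php"))  # False
```

As the last call shows, a scraper that sends a browser user agent passes straight through, which is the limitation described above.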
danieltj
Infrastructure Team Member
Posts: 515
Joined: Thu May 03, 2018 9:32 pm
Location: United Kingdom
Name: Daniel James

Re: How can I block "AI"?

Post by danieltj »

EA117 wrote: Fri Jun 07, 2024 3:13 am
P_I wrote: Thu Jun 06, 2024 12:08 pm According to many of the articles found via https://www.google.com/search?q=block+ai+bots you can indeed block AI bots using robots.txt.
Except that's not what those articles say. Or they are at least quick to clarify, "adherence to a robots.txt is entirely voluntary." You're not "blocking" anything.
Exactly.

That is the point I was making further into my post about photos, where you try to prevent people from downloading them. At the end of the day, if someone wants to take the data from your website, they will find a way.
thecoalman
Community Team Member
Posts: 6286
Joined: Wed Dec 22, 2004 3:52 am
Location: Pennsylvania, U.S.A.

Re: How can I block "AI"?

Post by thecoalman »

danieltj wrote: Fri Jun 07, 2024 3:58 pm ....if someone wants to take the data from your website, they will find a way.
I'd agree, but with the bots you can mitigate a lot of these issues. It really depends on how much time and effort you want to put into it. I mentioned Cloudflare; they have a lot of tools, both automated and manual. A simple example using one of their tools: I have seen a lot of scraper traffic from OVH in the past. You can block the entire IP range of a network like OVH with one entry using their ASN.

You'll never get rid of all of them but you can eliminate the bulk of them.
danieltj wrote: Thu Jun 06, 2024 9:10 am It’s exactly the same thing as preventing right clicking to stop people saving images from your website. People can screenshot it, use a third party capture program or use a physical camera and take a photograph.
The easy way to do this is open the developer console. :D
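
Blocking a whole network, as described above, ultimately means checking the client IP against that network's address ranges. Below is a rough Python sketch of that check using the standard ipaddress module. It is not Cloudflare's tooling, and the ranges are placeholders from the reserved documentation blocks; the real prefixes announced under a given ASN would have to be looked up separately.

```python
import ipaddress

# Placeholder ranges taken from the reserved documentation blocks (RFC 5737).
# The actual prefixes of a hosting network are NOT listed here and would
# need to be looked up for the ASN in question.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked_ip(client_ip: str) -> bool:
    """Return True when client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked_ip("192.0.2.77"))   # True, inside the first placeholder range
print(is_blocked_ip("203.0.113.5"))  # False, not in any listed range
```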
Holger
Registered User
Posts: 1886
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: How can I block "AI"?

Post by Holger »
