Giant word of warning about this, it's doesn't account for everything. Specifically with custom BBcodes, extensions like the media embed extension or even basic BBcodes. For example if someone wraps the bold tag around a URL it's going to leave the bold BBcode with empty content. If you want to remove a domain that is included in media extension, disabling it in ACP and re-parsing using CLI should allow you to use this SQL.
Use the SQL I posted at the top of the page to test. If you break something don't come crying to me, you've been warned. As always backup your DB.
If you are like me you have a lot of links from domains that are no longer going to original source. The user ends up at casino site, malware or who knows what. Identifying them is hard enough let alone removing them.
The following will replace most if not all occurrences of the various links auto generated by phpBB when posted as plain text, in
[url]
tags or
[img]
tags.
SET @domain_name = 'example.com';
This is really the only thing you need to edit. You only need the domain name and TLD. The pattern will match http/https, www/non-www and subdomains. I haven't tested it but if you use a subdomain here it should target URL's only with the subdomain.
You will also need to edit all occurrences of phpbb_posts if you are using different prefix for your tables.
Code: Select all
# Set the domain
SET @domain_name = 'example.com';
# Replacement text for links
SET @link_text_replacement = '**Link removed, it is no longer valid**';
# Replacement text for images
SET @image_text_replacement = '**Image removed, it is no longer valid**';
# Replacement text where domain name is just plain text without a link and has space before and after.
SET @plain_text_replacement = 'Example';
# Should match any common URL that includes the domain.tld excluding ftp:// etc.
SET @url_pattern = CONCAT('(https?://(?:www\\.|[a-z0-9-]+\\.)?', @domain_name, '(/[^<]*)?)');
# Matches [img] bbcode nested in [url=].
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<URL url="', @url_pattern, '"><s>\\[url=', @url_pattern, '\\]</s><IMG src="', @url_pattern, '"><s>\\[img\\]</s>', @url_pattern, '<e>\\[/img\\]</e></IMG><e>\\[/url\\]</e></URL>'),
@image_text_replacement
);
# Matches [img] bbcode with long URL with shortened text, must go before link matching first because pattern below can match inner text.
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<IMG src="', @url_pattern, '"><s>\\[img\\]</s><URL url="', @url_pattern, '"><LINK_TEXT text="', @url_pattern, '">', @url_pattern, '</LINK_TEXT></URL><e>\\[/img\\]</e></IMG>'),
@image_text_replacement
);
# Matches [img] bbcode with URL that is not shortened.
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<IMG src="', @url_pattern, '"><s>\\[img\\]</s><URL url="', @url_pattern, '">', @url_pattern, '</URL><e>\\[/img\\]</e></IMG>'),
@image_text_replacement
);
# Matches [url] bbcode with either the URL or text used between tags.
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<URL url="', @url_pattern, '"><s>\\[url\\]</s>(.*?)<e>\\[/url\\]</e></URL>'),
@link_text_replacement
);
# Matches [url=] bbcode with either the URL or text used between tags.
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<URL url="', @url_pattern, '"><s>\\[url=', @url_pattern, '\\]</s>(.*?)<e>\\[/url\\]</e></URL>'),
@link_text_replacement
);
# Matches URL pasted as plain text that has been shortened
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<URL url="', @url_pattern, '"><LINK_TEXT text="', @url_pattern, '">', @url_pattern, '</LINK_TEXT></URL>'),
@link_text_replacement
);
# Matches URL pasted as plain text that has not been shortened
UPDATE phpbb_posts SET post_text = REGEXP_REPLACE (
post_text,
CONCAT('<URL url="', @url_pattern, '">', @url_pattern, '</URL>'),
@link_text_replacement
);
# Matches domain.tld where it should be plain text at this point. If there is any links that have been missed it will at least break the link.
UPDATE phpbb_posts SET post_text = replace(post_text, @domain_name, @plain_text_replacement);
“Results! Why, man, I have gotten a lot of results! I have found several thousand things that won’t work.”
Attributed - Thomas Edison