Images Hijacked by Photobucket?

johnnytype2
Registered User
Posts: 55
Joined: Sun Jan 08, 2017 8:16 pm

Re: Images Hijacked by Photobucket?

Post by johnnytype2 »

Zepticon wrote: Fri Aug 25, 2017 10:12 pm
johnnytype2 wrote: Fri Aug 25, 2017 10:09 pm All was going well, but there's one address, http://muchos.co.uk, which works as http://www.muchos.co.uk, and it stops at every one.

There are 1000s of instances in the DB.

Is there an easy script I can run to replace all http://muchos.co.uk with http://www.muchos.co.uk ?


My PHP is crap, sorry.
Do a search and replace in the database, and replace all of them with something else :)

Done that, but the script seems to stop at every address that isn't Photobucket. I've been at this for hours and I'm only at 400 images, having to manually change addresses in the DB to get it running each time.
:::EARLYBAY::: for fans of low light vans, 1967-1972
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

I have committed a new version of the script with an optional url filter that currently defaults to photobucket.com, so all other hosts will be ignored.
johnnytype2
Registered User
Posts: 55
Joined: Sun Jan 08, 2017 8:16 pm

Re: Images Hijacked by Photobucket?

Post by johnnytype2 »

Thanks.

The skip is working well for me. I'm getting around 200 images each script run.
:::EARLYBAY::: for fans of low light vans, 1967-1972
johnnytype2
Registered User
Posts: 55
Joined: Sun Jan 08, 2017 8:16 pm

Re: Images Hijacked by Photobucket?

Post by johnnytype2 »

Do I need to run the extract external links script more than once? I can't see some PB images linked on the site that I was using to test whether the script had worked; they are not in the external link database. I have 12K in the links and only 1K in the images. I feel there should be about 10 times as many images, as we have over 600K posts.
:::EARLYBAY::: for fans of low light vans, 1967-1972
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

johnnytype2 wrote: Sat Aug 26, 2017 7:06 pm Do I need to run the extract external links script more than once?
Probably. When the script terminates, it tells you whether it needs to be run again.
johnnytype2 wrote: Sat Aug 26, 2017 7:06 pm I can't see some PB images linked on the site that I was using to test whether the script had worked; they are not in the external link database. I have 12K in the links and only 1K in the images. I feel there should be about 10 times as many images, as we have over 600K posts.
If the image links are not in the database, then either the script needs to be run again, or there is a bug in the script (although that script seems to have worked properly for a number of users).

It does not hurt to run either script extra times, but once you start, you should re-run it until it says that it is all done.
johnnytype2
Registered User
Posts: 55
Joined: Sun Jan 08, 2017 8:16 pm

Re: Images Hijacked by Photobucket?

Post by johnnytype2 »

I've been running the download script for several hours now and I'm at about 10K images, so I'm definitely doing something wrong.

I'll run the extract links script again. I was worried it would overwrite the current DB.
:::EARLYBAY::: for fans of low light vans, 1967-1972
webcomics4life
Registered User
Posts: 2
Joined: Mon Aug 28, 2017 2:49 am

Re: Images Hijacked by Photobucket?

Post by webcomics4life »

Figures. Just finished my own P-bucket script in Python and someone else has already done it sooner and better. Such is life.

Question 1: Are the scripts good enough to run in production, or should I set up a development forum and test them a few times first?

Question2: Can it be limited to only one domain? How about a list of domains?
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

webcomics4life wrote: Mon Aug 28, 2017 2:59 am
Question1: Are the scripts good enough to run in production or should I setup a development forum and test it a few times first?
It is safe to run on a production system. If you look at the scripts, you will see that they never attempt to write to any existing phpBB database tables, nor to any standard phpBB directory, so a system can easily be restored (but that has not been necessary yet).
webcomics4life wrote: Mon Aug 28, 2017 2:59 am
Question2: Can it be limited to only one domain? How about a list of domains?
The latest version of the download_external_images.php script has a hard-coded url filter that can be easily edited. You could edit the filter for abc.com, run the script, then change the filter for xyz.com and run it again, or you could do something more sophisticated.
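A minimal sketch of the "something more sophisticated" option, assuming the filter is a plain substring test on the URL. The helper checks a URL against a list of filter strings, so several hosts can be handled in one run. The function name url_matches_any_filter and the '.abc.com/' and '.xyz.com/' entries are illustrative only and are not part of download_external_images.php.

```php
<?php
// Hypothetical helper, not taken from download_external_images.php.
// Returns true when $url contains at least one of the filter strings.
function url_matches_any_filter(string $url, array $filters): bool
{
    foreach ($filters as $filter) {
        // Skip empty entries; stripos() does a case-insensitive substring search.
        if ($filter !== '' && stripos($url, $filter) !== false) {
            return true;
        }
    }
    return false;
}

// Assumed list of hosts to accept in a single run.
$filters = ['.abc.com/', '.xyz.com/'];

foreach (['http://img.abc.com/a.jpg', 'http://other.example/b.jpg'] as $url) {
    echo $url, ' => ', url_matches_any_filter($url, $filters) ? 'download' : 'skip', "\n";
}
```

Where the script currently compares the URL against its single filter string, a call like this could take its place. That spot would need to be located in the script by hand.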
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

I have checked in (yet another) version of download_external_images.php. It adds a check of the file MIME type and only allows 'image/' files. This prevents a non-image file from being used as an attack vector.

If you have previously used this script, you can re-run it with the new version, and it will delete any existing downloaded files with a bad MIME type. (Set the URL filter to 'http' unless you previously used another filter.)
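For anyone curious what such a check can look like, here is a rough sketch using PHP's fileinfo extension. It is not the code from download_external_images.php; the function name is_image_file is made up for illustration, and it assumes the fileinfo extension is enabled.

```php
<?php
// Hypothetical helper, not taken from the script.
// Returns true only when the detected MIME type of the file starts with 'image/'.
function is_image_file(string $path): bool
{
    if (!is_file($path)) {
        return false;
    }
    // FILEINFO_MIME_TYPE asks for the bare type, e.g. 'image/png'.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($path); // string on success, false on failure
    return is_string($mime) && strncmp($mime, 'image/', 6) === 0;
}

// Example: check a file path given on the command line, if any.
if (isset($argv[1])) {
    echo $argv[1], is_image_file($argv[1]) ? ' looks like an image' : ' is not an image', "\n";
}
```

The type is detected from the file contents rather than the file extension, so a renamed non-image file is still rejected.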
johnnytype2
Registered User
Posts: 55
Joined: Sun Jan 08, 2017 8:16 pm

Re: Images Hijacked by Photobucket?

Post by johnnytype2 »

I have a different number of external links (28000) to images (25000), so I'm assuming there's something wrong?

Both scripts have been run dozens of times and it's stopped at these quotes. Am I correct in saying the image number should equal the links number?

I still have lots of Photobucket links that have not been replaced...
:::EARLYBAY::: for fans of low light vans, 1967-1972
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

There will normally be more links than image files, as image links in quoted posts appear as duplicate links to the same file.

To troubleshoot image files that seem to be missing, first find a post with an in-line link to a missing image, open the post editor, copy the original image URL, paste that URL into a browser, and check that the image is actually accessible.

If the image is accessible, you should check to see if the link URL is in the phpbb_external_images table in the database.

If the url is in the database, you should find the associated local filename and download status. Then you should find a file of that name in your images/ext directory.

Knowing which step fails should give a good clue to your problem.
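The database and file checks above can be sketched roughly as follows. This is only an illustration: $row stands for one record looked up by hand in phpbb_external_images (or null when the URL is not there), and the key 'local_filename' is an assumed name, not the table's confirmed column name.

```php
<?php
// Hypothetical sketch of the checks described above.
// $row: one record from phpbb_external_images as an array, or null if the URL was not found.
// 'local_filename' is an ASSUMED key; use whatever the table actually calls it.
function diagnose_missing_image(?array $row, string $extDir): string
{
    if ($row === null) {
        return 'not_in_table';       // the link was never extracted
    }
    $file = $row['local_filename'] ?? '';
    if ($file === '') {
        return 'no_local_filename';  // extracted, but no local file recorded
    }
    $path = rtrim($extDir, '/') . '/' . $file;
    if (!is_file($path)) {
        return 'file_missing';       // recorded, but not present on disk
    }
    return 'ok';
}

// Example with a URL that was not found in the table.
echo diagnose_missing_image(null, 'images/ext'), "\n";
```

Each return value corresponds to one of the steps, so the first value other than 'ok' points at the step that fails.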
johnnytype2
Registered User
Posts: 55
Joined: Sun Jan 08, 2017 8:16 pm

Re: Images Hijacked by Photobucket?

Post by johnnytype2 »

thanks

Seems like PB are hiding the original image. Even when I browse to it through the PB site, it still shows as the thumbnail but not as the main image, which is strange. The main image is the PB 404 thumbnail.
:::EARLYBAY::: for fans of low light vans, 1967-1972
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

Can you give an example of a bad link?
deanmoke
Registered User
Posts: 24
Joined: Tue May 20, 2008 9:05 am

Re: Images Hijacked by Photobucket?

Post by deanmoke »

Great script!
Is there any way of excluding downloading from a particular domain?
Cheers,
Dean
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

deanmoke wrote: Fri Sep 01, 2017 7:55 am Is there any way of excluding downloading from a particular domain?
In download_external_images, you will find that there is a simple URL filter defined:

Code: Select all

// only images with a url containing this string will be downloaded
define('URL_FILTER', 				'http');				// any host
//define('URL_FILTER', 				'.photobucket.com/');	// only photobucket.com
You can use that to restrict to a particular domain. If you want something more sophisticated, you may have to code it yourself.
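If you do want to code an exclusion yourself, a minimal sketch could look like this. It assumes the existing filter is a plain substring test, as the comment in the script suggests. The function name url_passes_filter and the '.example.com/' entry are made up for illustration and are not part of the script.

```php
<?php
// Hypothetical helper, not taken from download_external_images.php.
// A URL passes when it contains $include and none of the strings in $excludes.
function url_passes_filter(string $url, string $include, array $excludes): bool
{
    // An empty include string means "no restriction".
    if ($include !== '' && strpos($url, $include) === false) {
        return false;
    }
    foreach ($excludes as $exclude) {
        if ($exclude !== '' && strpos($url, $exclude) !== false) {
            return false; // URL belongs to an excluded host
        }
    }
    return true;
}

// Assumed exclusion list: skip one domain, accept any other host.
$excludes = ['.example.com/'];

foreach (['http://i12.photobucket.com/a.jpg', 'http://img.example.com/b.jpg'] as $url) {
    echo $url, ' => ', url_passes_filter($url, 'http', $excludes) ? 'download' : 'skip', "\n";
}
```

Keeping the include string and adding a separate exclusion list means the existing 'http' and '.photobucket.com/' settings keep working as before.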