Images Hijacked by Photobucket?

v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

This should help you identify the local file name from the original Photobucket URL. You can then point your browser at the file in your images/ext directory to see what it looks like.
I hope you have a backup of the previously downloaded images.
chanlon1 wrote: Sun Sep 03, 2017 6:16 pm
v12mike wrote: Sun Sep 03, 2017 5:01 pm
chanlon1 wrote: Sun Sep 03, 2017 4:00 pm One question... if I have a Photobucket URL in one of my posts, how do I work out which image in the images/ext folder it is trying to get?

I'm not sure how to translate from the URL to the image filename.
The clue is in the line of code:

Code:

$local_file_name = md5("$url");
Any program that computes the MD5 hash of a string should produce the same result (or you can call PHP's md5() function from the command line).
Thanks. Was able to put the URL into a md5 generator and get the string and then tie it to the relevant file.

Appreciate the response.
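
For anyone who would rather not paste URLs into an online generator, here is a minimal sketch of the same lookup in Python. It assumes the URL is hashed exactly as it appears in the post (same scheme, case and query string) and that the bytes are UTF-8. The function name and the sample URL are made up for illustration, and the thread doesn't show whether the stored file carries an extension, so treat the result as the base name only.

```python
import hashlib


def local_file_name(url: str) -> str:
    """Return the MD5 hex digest of the URL string, like PHP's md5($url)."""
    # PHP's md5() hashes the raw bytes of the string; UTF-8 is assumed here.
    return hashlib.md5(url.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    print(local_file_name("http://example.com/albums/a1/user/photo.jpg"))
```

The digest is always 32 lowercase hex characters, so it can be matched directly against the names in images/ext.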
KYPREO
Registered User
Posts: 392
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO »

Firstly, thank you immensely for sharing this script. I've been desperately looking for something like this. I haven't run it yet, as my forum is quite large, with a 2 million post database and a lot of externally hosted images.

In the meantime, I have been thinking about how to recover other images that are no longer available on the Internet. For example, the Photobucket experience was not the first time this has happened. Imageshack was a very large and popular free hosting service that got taken over and decided to delete everyone's images without any notice whatsoever. I have probably tens of thousands of dead links to imageshack images on my board.

Previously, for very important topics on my board, I have looked up archived pages on the Wayback Machine (archive.org) to find whether the images were cached there. Very often they are, which has enabled me to edit old posts to point to the URL of the cached archive.org image. Voila - topic saved!

This is obviously a ridiculously laborious task to do manually so I did some more digging.

I found that archive.org has a number of APIs including a JSON API for querying whether a particular URL is available. If so, archive.org will then return a URL to the cached version. Now if this query was run for the URL to an image in a phpBB forum post, then the process of finding archived images and downloading the cached version from archive.org can be automated.

E.g., if you query http://imageshack.com/img001.jpg, archive.org will return http://web.archive.org/web/201309190446 ... img001.jpg

Details of the APIs are here: https://archive.org/help/wayback_api.php
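
As a rough sketch of what such a query could look like in Python: the endpoint and the "archived_snapshots" / "closest" response fields below follow my reading of the API page linked above, so check them against that page before relying on them. The function names are made up for illustration, and only find_archived_copy actually touches the network.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Availability endpoint as described on the Wayback API help page (assumption).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(image_url: str) -> str:
    """Build the availability query for one image URL."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": image_url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Pull the snapshot URL out of a decoded JSON reply, or None if absent."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(image_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask archive.org whether it holds a copy of image_url."""
    with urlopen(build_query(image_url), timeout=timeout) as reply:
        return closest_snapshot(json.load(reply))
```

On a board with tens of thousands of dead links it would be sensible to pause between requests and cache the answers, so that a re-run doesn't ask archive.org the same question twice.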

Perhaps this could be done as a separate script or incorporated into the main one? For example, if the script did not find the image hosted on the Internet, the archive.org query could be run to see whether it could be downloaded from the archive.org URL instead.

I would love to do this, but I am a humble forum administrator with very little coding experience. But if someone was to do this, I think it would help resurrect important information on countless boards around the world.
Holger
Registered User
Posts: 1883
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: Images Hijacked by Photobucket?

Post by Holger »

You still have the problem that archive.org is an external resource.
Maybe sometime some gov will decide that it is not allowed for archive.org to archive everything. I mean, how are they allowed to copy all images and texts? What about existing copyrights?
So, you might end up with dead links to archive.org later. Like for imageshack and photobucket.
KYPREO
Registered User
Posts: 392
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO »

Holger wrote: Thu Feb 08, 2018 8:13 am You still have the problem that archive.org is an external resource.
Maybe sometime some gov will decide that it is not allowed for archive.org to archive everything. I mean, how are they allowed to copy all images and texts? What about existing copyrights?
So, you might end up with dead links to archive.org later. Like for imageshack and photobucket.
Well that's where v12mike's script comes in. It grabs the external resource and hosts it locally. As it is now though, if the image is not hosted at the URL in the post text it won't have anything to download. What I'm proposing is another step where, if the image isn't available at its original location, a query is run against archive.org and if it's available there, the server will download the image from the archive.org address.

This definitely needs to be done because, as your post correctly identifies, the image might not stay on archive.org forever.
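
To make the proposed extra step concrete, here is a minimal sketch of the control flow in Python. It is not v12mike's script: harvest_image, fetch and lookup_archive are hypothetical names, and the two callables stand in for "download this URL" and "ask archive.org for a cached copy" respectively.

```python
from typing import Callable, Optional


def harvest_image(
    url: str,
    fetch: Callable[[str], Optional[bytes]],
    lookup_archive: Callable[[str], Optional[str]],
) -> Optional[bytes]:
    """Try the original host first, then fall back to an archived copy."""
    data = fetch(url)
    if data is not None:
        # The original host still serves the image.
        return data
    # Original is gone: ask the archive whether it holds a snapshot.
    archived_url = lookup_archive(url)
    if archived_url is None:
        return None
    # Download from the archive so the copy can be stored locally.
    return fetch(archived_url)
```

In practice fetch would also need to check that what comes back really is the image, since a dead host may answer with a placeholder picture or an HTML page instead of an error.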

The IP question is an interesting one, particularly as I am an IP lawyer. Being an Australian lawyer, I can't speak authoritatively on the US position, but the US has a very liberal fair use doctrine and specific exemptions for libraries and archiving. I also thought the issue was settled by the Authors Guild v. Google decision, which examined whether Google infringed copyright through its Google Books program. Google also caches webpages and images (probably) billions of times a day. I'm fairly certain this has all been tested in the US courts and found to constitute fair use.

As for my position as a website owner and administrator, 99% of the images I'm talking about are photos my users have taken themselves. The taker of a photograph is the author and owner of copyright under all the international copyright conventions. For about 16 years, my terms of use have specified that users agree to grant me an irrevocable, royalty-free licence to reproduce and publish any copyright work they own. If they don't own copyright, I have an indemnity in respect of any authorised use. If they uploaded the photos to imageshack or photobucket or whatever, archive.org then cached them, and I then download them and host them myself, I am doing something that falls within my licence. ;)
Holger
Registered User
Posts: 1883
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: Images Hijacked by Photobucket?

Post by Holger »

Very interesting, and thank you for sharing these insights!

I would be very interested in the text you use in your terms.
KYPREO
Registered User
Posts: 392
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO »

Holger wrote: Thu Feb 08, 2018 9:28 am Very interesting, and thank you for sharing these insights!

I would be very interested in the text you use in your terms.
Something like this....

By placing any information or other material on the Website (including posting messages, uploading files, inputting data, hyperlinked files stored on external servers that were created for the purpose of being displayed on the Website or engaging in any other form of communication), you grant to the Administrators a perpetual, royalty-free, non-exclusive, irrevocable, unrestricted, worldwide licence to do the following in respect of the information or material:

1. use, copy, sublicense, redistribute, adapt, transmit, publish and/or broadcast, publicly perform or display, and
2. sublicense to any third parties the unrestricted right to exercise any of the foregoing rights granted.

The foregoing grant includes the right to exploit all proprietary rights in any such information or other material, including but not limited to rights under copyright, trademark, service mark or patent laws under any jurisdiction worldwide. You expressly waive in favour of the Administrators and any other party authorised by Administrators all moral rights and any similar rights in any jurisdiction which you may have or hereafter acquire in respect of any relevant communication or other material. At the request and expense of the Administrators, you will execute and deliver to the Administrators such instruments and take such other actions as may be required to carry out this grant of licence and waiver.


If you are looking for other examples, you will find similar boilerplate clauses on all commercial websites that allow users to contribute content.
Holger
Registered User
Posts: 1883
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: Images Hijacked by Photobucket?

Post by Holger »

Thanks! I will check this!
AmigoJack
Registered User
Posts: 6108
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: Images Hijacked by Photobucket?

Post by AmigoJack »

KYPREO wrote: Thu Feb 08, 2018 9:35 am ... hyperlinked files stored on external servers that were created for ... engaging in any other form of communication ...
Which file would not be for communicating when it is accessible on the internet? That's the very point of the internet. Additionally, I always have doubts about terms using names (capitalization) instead of nouns.
KYPREO
Registered User
Posts: 392
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO »

AmigoJack wrote: Thu Feb 08, 2018 11:54 am
KYPREO wrote: Thu Feb 08, 2018 9:35 am ... hyperlinked files stored on external servers that were created for ... engaging in any other form of communication ...
Which file would not be for communicating when it is accessible on the internet? That's the very point of the internet. Additionally, I always have doubts about terms using names (capitalization) instead of nouns.
This is a very old document and there are plenty of things that could be drafted differently. Also, the capitalised terms are defined elsewhere in the document. But thanks for the feedback.

But my post wasn't about whether I'm legally allowed to pull images from archive.org; it's more about preserving the content of my board without being at the mercy of external hosts.
AmigoJack
Registered User
Posts: 6108
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: Images Hijacked by Photobucket?

Post by AmigoJack »

KYPREO wrote: Thu Feb 08, 2018 10:11 pm it's more about preserving the content of my board without being at the mercy of external hosts
Externally hosted files were never part of the content of your board to begin with, they were just referenced/linked. I understand your point, but "your" content is primarily only that on your own server(s). That's the very point of external hosters: they cannot have only advantages and no disadvantages.
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

For those who have used the scripts and are using the Image Redirect extension to serve the harvested images, I have made v2.0.0-b1 of the extension for phpBB v3.2.x, which is significantly optimised. Try it on a test board first.

See: viewtopic.php?f=456&t=2487136&p=15095001#p15095001
KYPREO
Registered User
Posts: 392
Joined: Fri Feb 02, 2018 9:56 am
Contact:

Re: Images Hijacked by Photobucket?

Post by KYPREO »

Photobucket announced a while back that it is no longer disabling direct hyperlinking of images, meaning browser fixes are no longer required. However, quite recently they have started placing large watermarks on images saying "hosted by photobucket". When using one of the browser extensions to fix the auto-redirect issue, it looks like the raw image and not the watermarked version is displayed, which is good news.

I haven't tried with this script, but hopefully it pulls the non-watermarked version. I thought I would share here to bring to people's attention. v12mike: do you know if the method your script uses to bypass the redirect will ensure the non-watermarked version is pulled?
Jayson
Registered User
Posts: 19
Joined: Thu Feb 21, 2019 11:05 am
Name: Jayson

Re: Images Hijacked by Photobucket?

Post by Jayson »

Classic example of why not to use a third party service. I'm not sure what can be done if they have their own Terms of Service.
v12mike
Registered User
Posts: 584
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike »

KYPREO wrote: Mon Dec 03, 2018 12:01 am Photobucket announced a while back that it is no longer disabling direct hyperlinking of images, meaning browser fixes are no longer required. However, quite recently they have started placing large watermarks on images saying "hosted by photobucket". When using one of the browser extensions to fix the auto-redirect issue, it looks like the raw image and not the watermarked version is displayed, which is good news.

I haven't tried with this script, but hopefully it pulls the non-watermarked version. I thought I would share here to bring to people's attention. v12mike: do you know if the method your script uses to bypass the redirect will ensure the non-watermarked version is pulled?
Sorry, I missed replying to this earlier (I was on holiday at the time and forgot about it).

It seems that it is now not possible to extract photobucket images without the watermark, so be thankful if you downloaded them earlier.
2600
I've Been Banned!
Posts: 2567
Joined: Fri Nov 14, 2014 5:14 pm
Location: Area-51

Re: Images Hijacked by Photobucket?

Post by 2600 »

I haven't been reading the whole topic, but I just use an extension called AWS S3 which offloads attachments to AWS S3 storage. It costs me no more than 70 cents a month and that's having a bunch of other data there as well. BUT! You need to know how to use it and how to secure it. Once you do it's like having the greatest cloud-based segregation ever. So attachments will all load from a different URL rather than your host's server.

Once upon a time I used Photobucket, which I liked to call Photo-F-It. Not anymore. If I'm sharing images on the Internet beyond my own forum I'll use imgur or imagesGuru. In fact, I think there was a hacker tool called Photo-F-It. I think it cracked user accounts. This was years ago though, and I'm sure that vulnerability has since been patched.

Note that the current extension version of AWS S3 doesn't work with PHP 7.2. I had someone recode it for me. I can share it if someone wants it.