Images Hijacked by Photobucket?

v12mike
Registered User
Posts: 168
Joined: Thu Jul 09, 2015 5:03 pm

Re: Images Hijacked by Photobucket?

Post by v12mike » Sun Feb 04, 2018 7:20 pm

This should help you identify the local file name from the original Photobucket URL. You can then point your browser at the file in your images/ext directory to see what it looks like.
I hope you have a backup of the previously downloaded images.
chanlon1 wrote:
Sun Sep 03, 2017 6:16 pm
v12mike wrote:
Sun Sep 03, 2017 5:01 pm
chanlon1 wrote:
Sun Sep 03, 2017 4:00 pm
One question: if I have a Photobucket URL in one of my posts, how do I work out which image in the images/ext folder it is trying to get?

I'm not sure how to translate from the URL to the image filename.
The clue is in the line of code:

Code:

$local_file_name = md5("$url");
Any program that will find the MD5 hash of a string should produce the same result (or run the PHP md5 function from the command line).
Thanks. I was able to put the URL into an MD5 generator, get the string, and then tie it to the relevant file.

Appreciate the response.
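
For anyone who would rather not paste URLs into an online generator, here is a minimal Python sketch of the same lookup. It assumes the script stores each image under the MD5 hex digest of the URL exactly as it appears in the post, which is what the line of code quoted above suggests. Whether an extension is appended is not shown in this thread, so the helper matches on the hash as a filename prefix. The function names and the example URL are made up for illustration.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def local_file_name(url: str) -> str:
    """Return the MD5 hex digest of the URL, like PHP's md5("$url")."""
    # PHP's md5() hashes the raw bytes of the string; UTF-8 is assumed here.
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def find_local_copies(url: str, image_dir: str = "images/ext") -> list[Path]:
    """List files in image_dir whose name starts with the hash of url.

    Matching on a prefix is an assumption: it covers both a bare hash
    and a hash followed by an extension such as ".jpg".
    """
    directory = Path(image_dir)
    if not directory.is_dir():
        return []
    return sorted(directory.glob(local_file_name(url) + "*"))


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    example = "http://example.com/albums/photo.jpg"
    print(local_file_name(example))
    print(find_local_copies(example))
```

The URL has to match what the script hashed character for character; a trailing space or a differently encoded ampersand gives a completely different hash.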

KYPREO
Registered User
Posts: 50
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO » Thu Feb 08, 2018 1:20 am

Firstly, thank you immensely for sharing this script. I've been desperately looking for something like this. I haven't run it yet, as my forum is quite large, with a 2 million post database and a lot of externally hosted images.

In the meantime, I have been thinking about how to recover other images that are no longer available on the Internet. For example, the Photobucket experience was not the first time this has happened. Imageshack was a very large and popular free hosting service that got taken over and decided to delete everyone's images without any notice whatsoever. I have probably tens of thousands of dead links to imageshack images on my board.

Previously, for very important topics on my board, I have looked up archived pages on the Wayback Machine (archive.org) to find whether the images were cached there. Very often they are, which has enabled me to edit old posts to point to the URL of the cached archive.org image. Voila - topic saved!

This is obviously a ridiculously laborious task to do manually so I did some more digging.

I found that archive.org has a number of APIs including a JSON API for querying whether a particular URL is available. If so, archive.org will then return a URL to the cached version. Now if this query was run for the URL to an image in a phpBB forum post, then the process of finding archived images and downloading the cached version from archive.org can be automated.

E.g., if you query http://imageshack.com/img001.jpg, archive.org will return http://web.archive.org/web/201309190446 ... img001.jpg

Details of the APIs are here: https://archive.org/help/wayback_api.php
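
To make the idea concrete, here is a rough Python sketch of the availability query. The endpoint and the JSON field names (archived_snapshots, closest, available, url) are as I understand them from the documentation linked above, so treat them as assumptions and check them against that page. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Endpoint as described in the Wayback availability API documentation (assumed).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query_url(image_url: str) -> str:
    """Build the query URL that asks whether image_url has been archived."""
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode({"url": image_url})


def closest_snapshot_url(payload: dict) -> Optional[str]:
    """Extract the archived URL from a decoded JSON response, if there is one."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup_snapshot(image_url: str, timeout: float = 10.0) -> Optional[str]:
    """Query archive.org over the network and return the snapshot URL or None."""
    query = availability_query_url(image_url)
    with urllib.request.urlopen(query, timeout=timeout) as response:
        payload = json.load(response)
    return closest_snapshot_url(payload)
```

Calling lookup_snapshot with the URL from a post would return either the address of the archived copy or None. On a board with tens of thousands of dead links it would be polite to pause between queries.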

Perhaps this could be done as a separate script or incorporated into the main one? For example, if the script did not find the image hosted on the Internet, the archive.org query could be run to see whether it could be downloaded from the archive.org URL instead.

I would love to do this, but I am a humble forum administrator with very little coding experience. But if someone were to do this, I think it would help resurrect important information on countless boards around the world.

Holger
Registered User
Posts: 1603
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: Images Hijacked by Photobucket?

Post by Holger » Thu Feb 08, 2018 8:13 am

You still have the problem that archive.org is an external resource.
Maybe sometime some gov will decide that it is not allowed for archive.org to archive everything. I mean, how are they allowed to copy all images and texts? What about existing copyrights?
So, you might end up with dead links to archive.org later. Like for imageshack and photobucket.

KYPREO
Registered User
Posts: 50
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO » Thu Feb 08, 2018 9:20 am

Holger wrote:
Thu Feb 08, 2018 8:13 am
You still have the problem that archive.org is an external resource.
Maybe sometime some gov will decide that it is not allowed for archive.org to archive everything. I mean, how are they allowed to copy all images and texts? What about existing copyrights?
So, you might end up with dead links to archive.org later. Like for imageshack and photobucket.
Well that's where v12mike's script comes in. It grabs the external resource and hosts it locally. As it is now though, if the image is not hosted at the URL in the post text it won't have anything to download. What I'm proposing is another step where, if the image isn't available at its original location, a query is run against archive.org and if it's available there, the server will download the image from the archive.org address.

This definitely needs to be done because, as your post correctly identifies, the image might not stay on archive.org forever.
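
The extra step could be sketched roughly as follows. This is not v12mike's script, just an outline of the control flow in Python. Both callables are stand-ins I have assumed: fetch(url) returns the image bytes, or None when nothing usable is at that address, and lookup_snapshot(url) returns an archive.org address or None.

```python
from typing import Callable, Optional, Tuple

# Hypothetical callable types: how the real script downloads files is not shown here.
Fetcher = Callable[[str], Optional[bytes]]
SnapshotLookup = Callable[[str], Optional[str]]


def fetch_with_fallback(
    url: str, fetch: Fetcher, lookup_snapshot: SnapshotLookup
) -> Optional[Tuple[bytes, str]]:
    """Try the original URL first, then an archived copy.

    Returns (image bytes, URL actually used), or None if neither source works.
    """
    data = fetch(url)
    if data is not None:
        return data, url

    # The original location is dead: ask the archive whether it has a copy.
    snapshot_url = lookup_snapshot(url)
    if snapshot_url is None:
        return None

    data = fetch(snapshot_url)
    if data is None:
        return None
    return data, snapshot_url


if __name__ == "__main__":
    # Stand-in data instead of real network calls, for illustration only.
    archive = {"http://web.archive.org/web/2013/http://dead.example/a.jpg": b"image-bytes"}
    result = fetch_with_fallback(
        "http://dead.example/a.jpg",
        fetch=archive.get,
        lookup_snapshot=lambda u: "http://web.archive.org/web/2013/" + u,
    )
    print(result)
```

Whatever comes back would presumably still be saved under the hash of the original URL, so existing posts keep pointing at the right local file. Deciding that an image is "not available" is the tricky part, since a host may answer with a placeholder picture rather than an error.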

The IP question is an interesting one, particularly as I am an IP lawyer. Being an Australian lawyer, I can't speak authoritatively on the US position, but the US has a very liberal fair use doctrine and specific exemptions for libraries and archiving. I also thought the issue was settled by the Authors Guild v. Google decision, which examined whether Google infringed copyright through its Google Books program. Google also caches webpages and images, probably billions of times a day. I'm fairly certain this has all been tested in the US courts and found to constitute fair use.

As for my position as a website owner and administrator, 99% of the images I'm talking about are photos my users have taken themselves. The taker of a photograph is the author and owner of copyright under all the international copyright conventions. For about 16 years, my terms of use have specified that users agree to grant me an irrevocable, royalty-free licence to reproduce and publish any copyright work they own. If they don't own copyright, I have an indemnity in respect of any authorised use. If they uploaded the photos to imageshack or photobucket or whatever, archive.org then cached them, and I then download them and host them myself, I am doing something that falls within my licence. ;)

Holger
Registered User
Posts: 1603
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: Images Hijacked by Photobucket?

Post by Holger » Thu Feb 08, 2018 9:28 am

Very interesting, and thank you for sharing these insights!

I would be very interested in the text you use in your terms.

KYPREO
Registered User
Posts: 50
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO » Thu Feb 08, 2018 9:35 am

Holger wrote:
Thu Feb 08, 2018 9:28 am
Very interesting, and thank you for sharing these insights!

I would be very interested in the text you use in your terms.
Something like this....

By placing any information or other material on the Website (including posting messages, uploading files, inputting data, hyperlinked files stored on external servers that were created for the purpose of being displayed on the Website or engaging in any other form of communication), you grant to the Administrators a perpetual, royalty-free, non-exclusive, irrevocable, unrestricted, worldwide licence to do the following in respect of the information or material:

1. use, copy, sublicense, redistribute, adapt, transmit, publish and/or broadcast, publicly perform or display, and
2. sublicense to any third parties the unrestricted right to exercise any of the foregoing rights granted.

The foregoing grant includes the right to exploit all proprietary rights in any such information or other material, including but not limited to rights under copyright, trademark, service mark or patent laws under any jurisdiction worldwide. You expressly waive in favour of the Administrators and any other party authorised by Administrators all moral rights and any similar rights in any jurisdiction which you may have or hereafter acquire in respect of any relevant communication or other material. At the request and expense of the Administrators, you will execute and deliver to the Administrators such instruments and take such other actions as may be required to carry out this grant of licence and waiver.


If you are looking for other examples, you will find similar boilerplate clauses on all commercial websites that allow users to contribute content.

Holger
Registered User
Posts: 1603
Joined: Tue Mar 12, 2002 3:54 pm
Location: Hannover

Re: Images Hijacked by Photobucket?

Post by Holger » Thu Feb 08, 2018 9:49 am

Thanks! I will check this!

AmigoJack
Registered User
Posts: 5153
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: Images Hijacked by Photobucket?

Post by AmigoJack » Thu Feb 08, 2018 11:54 am

KYPREO wrote:
Thu Feb 08, 2018 9:35 am
... hyperlinked files stored on external servers that were created for ... engaging in any other form of communication ...
Which file should not be for communicating when accessible on the internet? That's the very point of the internet. Additionally, I always have doubts about terms using names (capitalization) instead of nouns.

KYPREO
Registered User
Posts: 50
Joined: Fri Feb 02, 2018 9:56 am

Re: Images Hijacked by Photobucket?

Post by KYPREO » Thu Feb 08, 2018 10:11 pm

AmigoJack wrote:
Thu Feb 08, 2018 11:54 am
KYPREO wrote:
Thu Feb 08, 2018 9:35 am
... hyperlinked files stored on external servers that were created for ... engaging in any other form of communication ...
Which file should not be for communicating when accessible on the internet? That's the very point of the internet. Additionally, I always have doubts about terms using names (capitalization) instead of nouns.
This is a very old document and there are plenty of things that could be drafted differently. Also, the capitalised terms are defined elsewhere in the document. But thanks for the feedback.

But my post wasn't about whether I'm legally allowed to pull images from archive.org, it's more about preserving the content of my board without being at the mercy of external hosts.

AmigoJack
Registered User
Posts: 5153
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: Images Hijacked by Photobucket?

Post by AmigoJack » Fri Feb 09, 2018 8:11 am

KYPREO wrote:
Thu Feb 08, 2018 10:11 pm
it's more about preserving the content of my board without being at the mercy of external hosts
Externally hosted files were never part of the content of your board to begin with, they were just referenced/linked. I understand your point, but "your" content is primarily only that on your own server(s). That's the very point of external hosters: they cannot have only advantages and no disadvantages.
