[CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Converting from other board software? Good decision! Need help? Have a question about a convertor? Wish to offer a convertor package? Post here.
PPV
Registered User
Posts: 57
Joined: Tue Sep 30, 2008 3:08 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by PPV »

Hey there nneonneo!

You helped me 2 years ago with moving a huge IF forum. Thanks again for that!

But now I would like to back up a single thread from another forum. The thread is a big one (300 000 replies) and the forum is a public one. But the kicker is, the forum is vBulletin.

Is there any way one of your scripts could be modified to do that? I'd be willing to get my hands dirty and code that thing, I'd only need a bit of guidance.

Thanks for the help!

nneonneo
Registered User
Posts: 549
Joined: Sun Apr 30, 2006 1:42 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by nneonneo »

The converters as written are probably overkill for backing up a single thread. The basic idea for one topic is pretty simple:

1) download the first topic page, and determine the number of pages in the thread from it (this requires a regular expression which can find the last page number in the HTML)
2) download the remaining topic pages, parsing all of the posts out of each page into a simpler form with a regular expression (or some other method, but regular expressions tend to be the simplest)
3) convert the resulting simplified data into the desired output format (for the converters, the input is a series of records delimited by <^> and </^> tags, with fields separated by <|>, and the output is SQL)

Basically, you need to write two regular expressions: one which can find out the number of pages (or, since you are converting a single thread, you could hardcode the number of pages), and a second expression to pull out the post data and convert it into a simpler form.
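
As a rough illustration of the first expression, here is a minimal sketch. It assumes the pagination links carry a "page=N" query parameter, like the showthread.php URL used for this thread; the pattern and the count_pages helper are hypothetical and would need adjusting to the real vBulletin markup.

```python
import re

# Hypothetical pattern: assumes pagination links look like "...&page=N".
# The character class also accepts ";" so that "&amp;page=N" in raw HTML matches.
re_pagelink = re.compile(r'[?&;]page=(\d+)')

def count_pages(html):
    """Return the highest page number linked from the page, or 1 if none is found."""
    numbers = [int(n) for n in re_pagelink.findall(html)]
    return max(numbers) if numbers else 1

# Made-up sample markup, only to exercise the function.
sample = ('<a href="showthread.php?t=1&amp;page=2">2</a> '
          '<a href="showthread.php?t=1&amp;page=12000">Last</a>')
print(count_pages(sample))
```

Taking the maximum over every page link avoids having to identify the "Last" link specifically, which tends to vary between skins.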

You can then worry about smaller issues like BBCode conversion, etc. Since vBulletin is entirely different from any other platform I've converted so far, no existing converter will have the right regular expressions, so you'll have to write them.

Here's a vastly simplified skeleton for this converter:

Code: Select all

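from common import *
import re

COOKIEDATA = 'cookie data goes here'
NUMPAGES = 12000 # hardcoded

# fill this in: a placeholder pattern; the real one depends on the forum's HTML.
# The closing </p> anchors the last lazy group, which would otherwise match only one character.
re_posts = re.compile("<p>Author: (.+?)<br/>Date: (.+?)<br/>Post: (.+?)</p>", re.DOTALL)

posts = []
for page in xrange(1,NUMPAGES+1):
  statusline = "Page %i ... "%page
  progressline = statusline+"Downloading - "
  # download_page comes from the converters' common module
  data = download_page("http://domain.com/showthread.php?...&page=%i"%page, progressline, COOKIEDATA)
  # each match is an (author, date, post) tuple
  posts.extend(re_posts.findall(data))

At the end of this, posts is a list of all the posts in the thread, ready to be further processed.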
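
For step 3, a minimal sketch of turning that list into the record format described above. Only the <^>, </^> and <|> delimiters come from the description; the to_records helper, the field order and the one-record-per-line layout are assumptions made for illustration.

```python
def to_records(posts):
    """Join each (author, date, body) tuple into one <^>...</^> record."""
    lines = []
    for fields in posts:
        # Fields are separated by <|>; the whole record is wrapped in <^> ... </^>.
        lines.append('<^>' + '<|>'.join(fields) + '</^>')
    return '\n'.join(lines)

# Made-up sample data, shaped like the tuples re.findall returns.
sample_posts = [('author1', 'date1', 'First post'),
                ('author2', 'date2', 'A reply')]
print(to_records(sample_posts))
```

If a post body could itself contain one of the delimiters, it would need escaping first; this sketch does not handle that.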

Hope that helps!
Need a conversion from another forum, but they won't give you the database? Try a crawler converter. If your converter isn't listed, feel free to post in that thread to ask for one.

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

Another disgruntled InvisionPlus board owner here :D

Since I'm unable to get a SQL backup from them, I'd like to explore the possibility of using a crawler to convert to my existing phpBB3 forum. Would this be possible? InvisionPlus runs IPB 1.3.1 as I understand.

Has a crawler conversion been attempted from InvisionPlus? If so, will I need to install phpBB 2.x to convert? My board is relatively small (6800 posts) and I only want to retain posts and their forums (no attachments, avatars, smilies, users etc). The InvisionFree boards look to be similar, so perhaps this could be tweaked to work?

Any advice very welcome :)

nneonneo
Registered User
Posts: 549
Joined: Sun Apr 30, 2006 1:42 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by nneonneo »

The set of changes seems to be quite minor. Do you usually access your forum via a URL like "http://www.invisionplus.net/forums/inde ... howforum=2" or a URL like "http://etcforum.invisionplus.net/index.php?showforum=2"?

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

nneonneo wrote: The set of changes seems to be quite minor. Do you usually access your forum via a URL like "http://www.invisionplus.net/forums/inde ... howforum=2" or a URL like "http://etcforum.invisionplus.net/index.php?showforum=2"?

Thanks nneonneo. It's the first example currently :) Used to be the second until the recent issues.

nneonneo
Registered User
Posts: 549
Joined: Sun Apr 30, 2006 1:42 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by nneonneo »

Yeah, I see they are having some issues right now.

Anyway, I've attached a modified InvisionFree converter, which can also handle InvisionPlus. Use the URL

Code: Select all

URL='http://www.invisionplus.net/forums/index.php?mforum=<boardname>&'
(note trailing &) to convert an InvisionPlus forum. Aside from that, you follow the instructions as usual, as if it were an InvisionFree forum (the two are basically the same with very minor changes).
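
To illustrate why the trailing & matters, here is a small sketch. It assumes the converter builds page addresses by appending query parameters such as showforum=2 directly to URL; the forum_url helper is hypothetical and not taken from the attached converter.

```python
URL = 'http://www.invisionplus.net/forums/index.php?mforum=<boardname>&'

def forum_url(base, forum_id):
    """Append a showforum parameter to the base URL (hypothetical helper)."""
    return base + 'showforum=%d' % forum_id

print(forum_url(URL, 2))
# With the trailing "&" the two parameters stay separate:
#   ...index.php?mforum=<boardname>&showforum=2
# Without it they would run together as mforum=<boardname>showforum=2.
```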
Attachments
InvisionPlus.zip
InvisionPlus converter
(27.6 KiB) Downloaded 18 times

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

Thank you, I really appreciate it. I'm away for a few days, but look forward to testing it out as soon as I get back and will report my findings.

My board is currently up and running on the latest phpBB 3.0.7-PL1. Am I correct in understanding that I will need to create a phpBB 2.x board and then reconvert to my existing board afterwards?

The only other slight issue is that due to the issues, we're unable to access the ACP to change the settings listed at the start of the instructions - is this a showstopper?

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

Edited ^

nneonneo
Registered User
Posts: 549
Joined: Sun Apr 30, 2006 1:42 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by nneonneo »

The settings you need to change are in your "My Controls" page, NOT in the Admin CP, so the unavailability of the Admin CP has no effect.

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

Great stuff, thanks. How about board versions?

nneonneo
Registered User
Posts: 549
Joined: Sun Apr 30, 2006 1:42 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by nneonneo »

Right. You need phpBB 2 to do the conversion; you can upgrade/convert it to phpBB 3 later.

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

Looking good so far 8-) Although, all topics/posts have gone in as forum_id = 0 . . . . is this normal behaviour?

nneonneo
Registered User
Posts: 549
Joined: Sun Apr 30, 2006 1:42 am

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by nneonneo »

Sorry about that; I seem to have missed that during testing. A fixed converter is attached.
Attachments
InvisionPlus.zip
InvisionPlus converter [Jun 3 2010]
(27.51 KiB) Downloaded 18 times

DJDunk
Registered User
Posts: 37
Joined: Tue May 04, 2010 5:25 pm

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by DJDunk »

Not a problem, I did it manually in the end, but do have another forum to convert so this will come in handy.

I've converted all topics and posts, then merged them into my existing instance with no major issues.

The only niggles are:

1) Image tags - any way to fix this? They've converted like this . . .

(IMG:[url=http://www.images.com/image.jpg]http://www.images.com/image.jpg[/url])

I considered running SQL on the 'phpbb_posts' table but in the 'post_text' field it shows:

(IMG:[url=http&#58;//www&#46;images&#46;com/image&#46;jpg:g7211nrb]http://www.images.com/image.jpg[/url:g7211nrb])

. . . so I have no idea how to fix them with a script.

2) Smilies - although the BBCode is correct, existing smilies in posts don't show until the post has been edited and resubmitted.

If these cannot be fixed, then it's no major issue.

I can't thank you enough for this, you're a life saver 8-)

bonelifer
Community Team Member
Posts: 3500
Joined: Wed Oct 27, 2004 11:35 pm
Name: William

Re: [CONVERT] Crawler Converters (Forumer, ZetaBoards, etc.)

Post by bonelifer »

You can reparse the BBCode via the Support Toolkit.

It can be found here:
http://www.phpbb.com/support/stk/
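
For the image tags specifically, a rough sketch of a rewrite that could be run over each post_text value before reparsing. It assumes the stored text always has the shape quoted above, with the same uid suffix on the opening and closing url tags; the fix_images helper is hypothetical. The output is a plain [img] tag, so I'd expect the posts to still need a reparse afterwards, and the table should be backed up first.

```python
import re

# Matches "(IMG:[url=<encoded url>:<uid>]<plain url>[/url:<uid>])" as stored in post_text.
# Group 1 is the uid suffix, group 2 is the readable URL between the tags.
re_img = re.compile(r'\(IMG:\[url=[^\]]*?:(\w+)\](.*?)\[/url:\1\]\)')

def fix_images(post_text):
    """Replace each broken (IMG:[url=...]...[/url]) wrapper with a plain [img] tag."""
    return re_img.sub(r'[img]\2[/img]', post_text)

stored = ('(IMG:[url=http&#58;//www&#46;images&#46;com/image&#46;jpg:g7211nrb]'
          'http://www.images.com/image.jpg[/url:g7211nrb])')
print(fix_images(stored))
# [img]http://www.images.com/image.jpg[/img]
```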
