Textreparser documentation?

Ger
Recognised Extension Developer
Posts: 1887
Joined: Wed Jan 02, 2008 7:35 pm
Location: 192.168.1.100
Contact:

Textreparser documentation?

Post by Ger » Wed May 30, 2018 9:48 am

I'm writing an extension (probably available in the extension dev forum later today or tomorrow) that lets users search for posts in which they have been quoted. This works, but to index old posts I need to be sure that every post has been reparsed with the text reparser. If not, I want to point people to solid documentation about this.

I could of course write this myself since I've worked with it a couple of times, but I'm not the author, nor am I a native English speaker/writer. So it would help if I could just link to some existing documentation that an average board admin can comprehend. :)
I've searched in several places, but most of what I found seems to be intended for (extension) developers. Does somebody know of anything better?

-edit-
This topic has resulted in this documentation in the knowledge base
Last edited by Ger on Wed Jul 04, 2018 12:50 pm, edited 1 time in total.
My extensions:
Simple CMS, Feed post bot, Avatar Resize, Modbreak, Magic OGP, Live topic update, Modern Quote, Quoted Where (GDPR) and Autoresponder.
Newest: FAQ manager for 3.2

Like my work? Buy me a coffee to keep it coming. :ugeek:
-Available for custom work-

david63
Registered User
Posts: 16688
Joined: Thu Dec 19, 2002 8:08 am
Location: Lancashire, UK
Name: David Wood
Contact:

Re: Textreparser documentation?

Post by david63 » Wed May 30, 2018 10:03 am

Ger wrote:
Wed May 30, 2018 9:48 am
Does somebody know of anything better?
I have never been able to find "simple" instructions for anything in that area - and I have looked.
David
Remember: You only know what you know and - you don't know what you don't know!
My CDB Contributions | How to install an extension
I will not be accepting translations for any of my extensions in Github - please post any translations in the appropriate topic.
No support requests via PM or email as they will be ignored


Re: Textreparser documentation?

Post by Ger » Wed May 30, 2018 1:41 pm

I was afraid of that. Well then, guess I have to write my own. Be prepared for grammar and spelling flaws :P


Re: Textreparser documentation?

Post by david63 » Wed May 30, 2018 2:18 pm

Ger wrote:
Wed May 30, 2018 1:41 pm
I was afraid of that. Well then, guess I have to write my own. Be prepared for grammar and spelling flaws :P
I'll "proof read" it for you if you want


Re: Textreparser documentation?

Post by Ger » Wed May 30, 2018 2:40 pm

Thanks :)


Re: Textreparser documentation?

Post by Ger » Fri Jun 01, 2018 1:18 pm

Is there a way to determine if the reparser is finished or even started at all? I'm trying to wrap my head around this, but I actually find it a bit hard to comprehend.
@JoshyPHP: Sorry for the poke, but you know this better than anyone.

JoshyPHP
Code Contributor
Posts: 1046
Joined: Mon Jul 11, 2011 12:28 am

Re: Textreparser documentation?

Post by JoshyPHP » Fri Jun 01, 2018 1:31 pm

There isn't a single textreparser service, just a collection of services under the textreparser tag that were originally created for the CLI. The relevant methods are phpbb\textreparser\reparser_interface::get_max_id() and phpbb\textreparser\reparser_interface::reparse_range(). Each service can be called to reparse a range of records, e.g. all posts between 0 and 100. That's all they do; their functionality is meant to be limited. http://area51.phpbb.com/phpBB/viewtopic ... 26&t=47666
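
A minimal sketch of how those two methods could be combined to walk a reparser in batches. The Reparser interface, the FakeReparser class and the reparse_all() function are hypothetical stand-ins so the example runs outside phpBB. Only the method names get_max_id() and reparse_range() come from the description above, and the ($min_id, $max_id) parameters are an assumption.

```php
<?php
// Hypothetical stand-in for phpbb\textreparser\reparser_interface.
interface Reparser
{
    public function get_max_id();
    public function reparse_range($min_id, $max_id);
}

// Fake reparser that only records which ranges it was asked to process.
class FakeReparser implements Reparser
{
    public $ranges = array();
    private $max_id;

    public function __construct($max_id)
    {
        $this->max_id = $max_id;
    }

    public function get_max_id()
    {
        return $this->max_id;
    }

    public function reparse_range($min_id, $max_id)
    {
        $this->ranges[] = array($min_id, $max_id);
    }
}

// Walk from the highest ID down to 1 in batches of $size records.
function reparse_all(Reparser $reparser, $size = 100)
{
    $current = $reparser->get_max_id();
    while ($current >= 1)
    {
        $start = max(1, $current - $size + 1);
        $reparser->reparse_range($start, $current);
        $current = $start - 1;
    }
}

$fake = new FakeReparser(250);
reparse_all($fake, 100);
print_r($fake->ranges); // three ranges: 151-250, 51-150, 1-50
```

Working from the highest ID downwards is a choice made for this sketch. It happens to match the resume data shown later in this topic, where range-max shrinks as batches complete.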

There's a phpbb\textreparser\manager class that schedules reparsing by cron but I don't know how it works. I can't remember who wrote it but you can use git blame if you want to track the original author.

I subscribed to this topic so I'll follow the discussion but feel free to quote me if you need anything else.
I wrote the thing that does BBCodes in 3.2.


Re: Textreparser documentation?

Post by Ger » Fri Jun 01, 2018 2:17 pm

Thanks, that helps me a bit I think.

I'm now in the process of reparsing a large test forum, and saw 2 different values in the config_text table for reparser_resume:

Directly after starting:

Code: Select all

array (
  'text_reparser.contact_admin_info' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.forum_description' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.forum_rules' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.group_description' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.pm_text' => 
  array (
    'range-min' => 1,
    'range-max' => 47994,
    'range-size' => 100,
  ),
  'text_reparser.poll_option' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.poll_title' => 
  array (
    'range-min' => 1,
    'range-max' => 201761,
    'range-size' => 100,
  ),
  'text_reparser.post_text' => 
  array (
    'range-min' => 1,
    'range-max' => 1074936,
    'range-size' => 100,
  ),
  'text_reparser.user_signature' => 
  array (
    'range-min' => 1,
    'range-max' => 78054,
    'range-size' => 100,
  ),
)
Now partly done:

Code: Select all

array (
  'text_reparser.contact_admin_info' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.forum_description' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.forum_rules' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.group_description' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.pm_text' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.poll_option' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.poll_title' => 
  array (
    'range-min' => 1,
    'range-max' => 0,
    'range-size' => 100,
  ),
  'text_reparser.post_text' => 
  array (
    'range-min' => 1,
    'range-max' => 817036,
    'range-size' => 100,
  ),
  'text_reparser.user_signature' => 
  array (
    'range-min' => 1,
    'range-max' => 78054,
    'range-size' => 100,
  ),
)
(those were serialized, but I unpacked them for readability)

The reparser is now busy with post_text. As far as I can tell, range-max decreases with every batch of 100 (or whatever range-size is set to). So I think this means that I can do something like this:

Code: Select all

        $resume_data = $this->config_text->get('reparser_resume');
        if (empty($resume_data))
        {
            // No resume data stored, e.g. the board started on 3.2
            return true;
        }
        $state = unserialize($resume_data);
        if ($state)
        {
            foreach ($state as $part => $data)
            {
                // The stored key is 'range-max' (hyphenated), as in the dumps above
                if ($data['range-max'] > 0)
                {
                    // Some unfinished business
                    return false;
                }
            }
        }
        // All done
        return true;
Of course for my extension I only need the post text to be reparsed but I think this should cover it.
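
To check only a single reparser, such as post_text, the same idea can be narrowed down. This is a rough standalone sketch, not code from the extension: the function name reparser_is_finished() is made up for illustration, and the sample data simply mirrors two entries of the partly done dump above, using the hyphenated keys shown there.

```php
<?php
// Returns true when the named reparser has no records left to process.
// $serialized is the raw 'reparser_resume' value; $name is a key such as
// 'text_reparser.post_text'. The function name is hypothetical.
function reparser_is_finished($serialized, $name)
{
    if (empty($serialized))
    {
        // No resume data stored: assume nothing is pending
        return true;
    }

    $state = unserialize($serialized);
    if (!is_array($state) || !isset($state[$name]['range-max']))
    {
        // Unreadable data or unknown reparser: nothing known to be pending
        return true;
    }

    // range-max counts down as batches complete; 0 means nothing is left
    return (int) $state[$name]['range-max'] <= 0;
}

// Sample data mirroring two entries of the partly done dump
$sample = serialize(array(
    'text_reparser.pm_text'   => array('range-min' => 1, 'range-max' => 0, 'range-size' => 100),
    'text_reparser.post_text' => array('range-min' => 1, 'range-max' => 817036, 'range-size' => 100),
));

var_dump(reparser_is_finished($sample, 'text_reparser.pm_text'));   // bool(true)
var_dump(reparser_is_finished($sample, 'text_reparser.post_text')); // bool(false)
```

Treating missing or unreadable data as finished is an assumption. An extension that must be strict could return false in those cases instead.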


Re: Textreparser documentation?

Post by JoshyPHP » Fri Jun 01, 2018 2:28 pm

I think you can just use the manager to schedule the post reparser to be run via cron, e.g.

Code: Select all

$phpbb_container->get('text_reparser.manager')->schedule('post_text', 60);

3Di
Former Team Member
Posts: 14368
Joined: Mon Apr 04, 2005 11:09 pm
Location: Milan (IT) Frankfurt (DE)
Name: Marco
Contact:

Re: Textreparser documentation?

Post by 3Di » Fri Jun 01, 2018 2:32 pm

phpBB pages extension has a text reparser cron task if I am not wrong

phpbb.pages.text_reparser.page_text
Please PM me only to request paid works. Thx.
Want to compensate me for my interest? Donate
My development's activity º PhpStorm's proud user
Extensions, Scripts, MOD porting, Update/Upgrades
👨‍🏫 | Take a tour to | The Studio | 👨‍🏫


Re: Textreparser documentation?

Post by Ger » Fri Jun 01, 2018 2:45 pm

JoshyPHP wrote:
Fri Jun 01, 2018 2:28 pm
I think you can just use the manager to schedule the post reparser to be run via cron, e.g.
That is very true, but it's not what I'm looking for.

I need this for my Quoted Where extension. Since it depends on posts being stored with your textformatter, I want 2 things there:
  1. If possible, determine if all posts are stored as parsed by your textformatter
  2. If not possible to determine or if not all posts are parsed as such, point admins to some clear documentation about how they should reparse their posts


Re: Textreparser documentation?

Post by 3Di » Fri Jun 01, 2018 2:53 pm

Have a look at the migration '\phpbb\db\migration\data\v320\text_reparser'

There are interesting bits in there, like which configs are used by the reparser cron, etc.


Re: Textreparser documentation?

Post by Ger » Fri Jun 08, 2018 2:48 pm

Right, I now have this:
https://github.com/GerB/quotedwhere/blo ... hp#L50-L55

Code: Select all

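        $range_max = false; // stays false when no resume data is stored
        $reparser_state = $config_text->get('reparser_resume');
        if (!empty($reparser_state))
        {
            $reparser_state = unserialize($reparser_state);
            // The stored key is 'range-max' (hyphenated), as in the dumps earlier in this topic
            $range_max = isset($reparser_state['text_reparser.post_text']['range-max']) ? $reparser_state['text_reparser.post_text']['range-max'] : false;
        }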
If range-max is greater than 0, I show https://github.com/GerB/quotedwhere/blo ... re.php#L29

I think the correct further documentation for the admin would have to be something like this:

#################################
phpBB 3.2+ Text Reparser

Compared to earlier versions of phpBB, phpBB 3.2 has a new Text Formatter, which was needed to address various issues with the old way BBCode was handled. One major difference, and the one that this topic focuses on, is the way messages* are stored. While the old engine stored messages as partly parsed BBCode and HTML, the new Text Formatter stores them as XML. You can read more about this here.

If you have started your board with version phpBB 3.2.0 or later, you can stop here because this does not relate to your board. If you started your board with an earlier phpBB version or if you have converted your board from other board software to phpBB, you should read on.

Why Reparse?
phpBB 3.2+ assumes your posts are stored as XML created by the Text Formatter. It can still render old-style posts, so do not worry. phpBB also ships with a reparser that converts all your old messages in small batches through a phpBB Cron task ("scheduled task"). Most of the time you do not have to do anything: the more users visit your board, the more quickly the messages are reparsed, and everything is converted without you even noticing. In most cases this Cron task approach is perfectly fine. But be honest: would you have read this far if you were satisfied with that?

Sometimes you may want to speed things up by forcing the reparsing. Some extensions already rely on the new way messages are stored, and their number will probably grow with time. You can either sit back and wait until everything is reparsed automatically, or force it. For the latter we need to use the CLI.

CLI? What's that?
CLI stands for Command Line Interface. It is a small but very powerful way to manage some tasks on your board. Read more about the CLI here. You use it from a command prompt or a terminal window. There are essentially two ways to approach this: directly on the server your board runs on, or remotely from your own computer, connecting through SSH for example.

There are numerous ways to achieve this. There is no single best answer, as it depends on how much control you have over your hosting package, the software your host uses, etc. The best thing I can do to help is to link to some documentation. However, if you do not know much about this, you will probably need to ask your host.

From this point on, I will assume you have Command Line access to your board as it is a requirement to do anything else, so if you do not have Command Line access there is no point in reading on. I'll write each CLI command within code-tags, thus:

Code: Select all

Write this in your console
Now, let us reparse
We are almost there! Once you have logged in to your server using the command line, you will need to navigate to your board folder. Usually it is located in /home/yourusername/public_html/ or /var/www/html/
So, let us go there:

Code: Select all

cd /var/www/html/
Now we can run commands. The easiest approach would be to simply reparse all the rich text:

Code: Select all

php bin/phpbbcli.php reparser:reparse --ansi
You will see some reporting on all the rich text fields in phpBB: a progress bar, some ranges, etc. In essence, that is all there is to it. Just let it run until it is finished. On a large board this might take some time, so be patient.

There are also some other options available:

You can tell the reparser to only reparse post text and leave the rest to the cron:

Code: Select all

php bin/phpbbcli.php reparser:reparse post_text --ansi
Or just reparse the PMs:

Code: Select all

php bin/phpbbcli.php reparser:reparse pm_text --ansi
Usually you will want to reparse everything.

You can also run in safe-mode, ignoring any extensions you might have and ignoring the cache:

Code: Select all

php bin/phpbbcli.php --safe-mode reparser:reparse --ansi
Or even combine that:

Code: Select all

php bin/phpbbcli.php --safe-mode reparser:reparse post_text --ansi
Once the process is finished, you will get a nice success message. Job done, hooray! :D

On a sidenote: you can see I add a parameter --ansi after each CLI command. That is just to format the output in a more readable way.

There is a lock
You may get a message saying that the reparser is still running. That probably will not be the case, but there is a lock in place, which could be due to an unfinished cron job. You can reset that through your database using this query:

Code: Select all

UPDATE phpbb_config SET config_value = '0' WHERE config_name = 'reparse_lock';
(change the phpbb_ to your table prefix)
Keep in mind that it's always recommended to back up your database before running a SQL query manually.

You can then re-run your reparsing command through the CLI.

Now, that is about it. Quite easy in the end once you have reached the CLI, isn't it?


* The term messages, in this context, relates to any BBCode enriched text such as signatures, PMs, forum descriptions etc.


#################################

Any comments?

Changes:
  • Language improvements by David63
  • Corrected lock issue
  • Suggestions from 3Di (twice)
  • Suggestions from the team
Last edited by Ger on Tue Jun 26, 2018 12:19 pm, edited 4 times in total.


Re: Textreparser documentation?

Post by Ger » Sun Jun 10, 2018 12:59 pm

JoshyPHP wrote:
Fri Jun 01, 2018 1:31 pm
I subscribed to this topic so I'll follow the discussion but feel free to quote me if you need anything else.
david63 wrote:
Wed May 30, 2018 2:18 pm
I'll "proof read" it for you if you want
Does either of you have any corrections to the above post?


Re: Textreparser documentation?

Post by JoshyPHP » Tue Jun 12, 2018 3:23 am

I don't know much about things I haven't personally written so if there's something wrong with it I won't notice it. I don't know what --safe-mode is supposed to do and I don't use it. There's an option called --dry-run that does not save the reparsed text. I don't know whether the CLI uses a lock and I hope not. I don't know why a lock exists.
