Encoding issues

Post by Ger » Wed Nov 15, 2017 9:25 am

Some people point out they have encoding issues with my Magic OGP extension. All the non-latin characters end up garbled.

I've tried my best to wrap my head around this, but somehow I don't have any of these problems on my test board. I've even cooked up a script to isolate the issue: https://gist.github.com/GerB/e9eccba5e1 ... 9d94515ee9
I've tried many things: reading the encoding from the URL headers, auto-detecting the encoding, enforcing the detection order, using iconv() and mb_convert_encoding(), even including utf_tools.php and using the utf8_recode() function. The first URL in particular keeps bugging me: https://rockoverdose.gr/iron-maiden-athens-2018/

It identifies itself as UTF-8 and as English (even though the content is Greek), and no matter what I do, the text still ends up garbled.

Furthermore: while the second Greek URL and the Spanish URL (3rd on the list) work perfectly for me, they come out garbled for others. As far as I know, I have no way of testing this myself.

Does anyone have an idea how to proceed?

Post by JoshyPHP » Wed Nov 15, 2017 1:19 pm

Prepare to be both relieved and pissed off:

--- a/test.php
+++ b/test.php
@@ -136,7 +136,7 @@ class ogpParser
         // Prevent errors on bad documents, load HTML and reset error handling
         $old_libxml_error = libxml_use_internal_errors(true);
         $doc = new \DOMDocument('1.0', 'utf-8');
-        $doc->loadHTML($html);
+        $doc->loadHTML('<?xml encoding="utf-8"?>' . $html);
         libxml_use_internal_errors($old_libxml_error);
 
         $tags = $doc->getElementsByTagName('meta');
It's a known issue as I recall.
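
For anyone who wants to see the effect in isolation, here is a minimal sketch. It assumes the usual explanation: when DOMDocument::loadHTML() finds no charset it recognises, libxml falls back to a single-byte encoding and the UTF-8 bytes get mangled. The sample markup and the ogTitle() helper are made up for illustration and are not taken from the extension.

```php
<?php
// Hypothetical sample: UTF-8 bytes, but no charset declaration in the markup.
$html = '<html><head><meta property="og:title" content="Συναυλία στην Αθήνα"></head><body></body></html>';

// Illustrative helper (not part of Magic OGP): returns the og:title content, or null.
function ogTitle(string $html): ?string
{
    // Prevent warnings on bad documents, then restore the previous setting.
    $old = libxml_use_internal_errors(true);
    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($old);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if ($meta->getAttribute('property') === 'og:title') {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Without a hint the result depends on the libxml build, but it is typically garbled.
$plain = ogTitle($html);

// With the XML declaration prepended, the parser is told to read the bytes as UTF-8.
$hinted = ogTitle('<?xml encoding="utf-8"?>' . $html);

var_dump($plain === $hinted);
echo $hinted, PHP_EOL;
```

If the two values differ on your machine, you are looking at the same problem the patch above works around. That the result depends on the libxml build might also explain why some URLs behave differently for different users, though that part is a guess.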

Post by Ger » Wed Nov 15, 2017 1:28 pm

I'll be darned.

The one that kept bugging me is working for me now. I'll go check with the other users.

Thanks!!