Encoding issues

Discussion forum for Extension Writers regarding Extension Development.
Ger
Registered User
Posts: 1151
Joined: Wed Jan 02, 2008 7:35 pm
Location: 192.168.1.100

Encoding issues

Post by Ger » Wed Nov 15, 2017 9:25 am

Some users report encoding issues with my Magic OGP extension: all non-Latin characters end up garbled.

I've tried my best to wrap my head around this, but somehow I don't have any of these problems on my test board. I've even cooked up a script to isolate the issue: https://gist.github.com/GerB/e9eccba5e1 ... 9d94515ee9
I've tried many things: reading the encoding from the URL's HTTP headers, auto-detecting the encoding, enforcing the detection order, using iconv() and mb_convert_encoding(), even including utf_tools.php and using the utf8_recode() function. The first URL in particular keeps bugging me: https://rockoverdose.gr/iron-maiden-athens-2018/

It identifies itself as UTF-8 and as English (while it's Greek...), and no matter what I do, it somehow ends up garbled.

Furthermore, while the second Greek URL and the Spanish URL (3rd on the list) work perfectly for me, they end up garbled for others. As far as I know, I have no way of testing this.
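
For reference, the header-based detection and conversion described above can be sketched roughly as follows. This is only an illustration, not the code from the gist: the helper names charsetFromContentType() and toUtf8() are hypothetical, the ISO-8859-1 fallback is an assumption, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: extract the charset from a Content-Type header value.
function charsetFromContentType(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._:-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: normalise a fetched body to UTF-8.
function toUtf8(string $body, ?string $declared): string
{
    // If the body already validates as UTF-8, leave it untouched.
    if (mb_check_encoding($body, 'UTF-8')) {
        return $body;
    }
    // Assumption: fall back to ISO-8859-1 when nothing was declared.
    // An encoding name unknown to mbstring would make this call fail.
    $from = $declared ?? 'ISO-8859-1';
    return mb_convert_encoding($body, 'UTF-8', $from);
}

$charset = charsetFromContentType('text/html; charset=utf-8');
echo toUtf8('Συναυλία', $charset), PHP_EOL;
```

Note that a string can be perfectly valid UTF-8 at this stage and still come out garbled later if whatever parses it assumes a different encoding.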

Does anyone have an idea how to proceed?
Checkout my extensions: Simple CMS, Feed post bot, Modbreak, Magic OGP links and Live topic update

Like my work? Buy me a coffee to keep it coming. :ugeek:

JoshyPHP
Code Contributor
Posts: 753
Joined: Mon Jul 11, 2011 12:28 am

Re: Encoding issues

Post by JoshyPHP » Wed Nov 15, 2017 1:19 pm

Prepare to be both relieved and pissed off:


--- a/test.php
+++ b/test.php
@@ -136,7 +136,7 @@ class ogpParser
         // Prevent errors on bad documents, load HTML and reset error handling
         $old_libxml_error = libxml_use_internal_errors(true);
         $doc = new \DOMDocument('1.0', 'utf-8');
-        $doc->loadHTML($html);
+        $doc->loadHTML('<?xml encoding="utf-8"?>' . $html);
         libxml_use_internal_errors($old_libxml_error);
 
         $tags = $doc->getElementsByTagName('meta');

It's a known issue as I recall.
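
A minimal standalone sketch of the same trick, for anyone who wants to try it outside the extension. The helper extractOgTitle() and the sample markup are made up for illustration. As far as I know, the underlying reason is that loadHTML() does not assume UTF-8 when the markup carries no encoding declaration it recognises, and the exact behaviour without the hint can differ between libxml2 versions.

```php
<?php
// Made-up sample: UTF-8 markup without any charset declaration.
$html = '<html><head><meta property="og:title" content="Συναυλία"></head><body></body></html>';

// Hypothetical helper: return the og:title of a document, optionally
// prefixing the XML encoding hint before handing it to loadHTML().
function extractOgTitle(string $html, bool $forceUtf8): ?string
{
    // Prevent warnings on bad documents, then restore error handling.
    $old = libxml_use_internal_errors(true);
    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->loadHTML(($forceUtf8 ? '<?xml encoding="utf-8"?>' : '') . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($old);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if ($meta->getAttribute('property') === 'og:title') {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Without the hint the result may come back garbled, depending on the
// libxml2 version; with the hint it should be read as UTF-8.
echo extractOgTitle($html, false), PHP_EOL;
echo extractOgTitle($html, true), PHP_EOL;
```

Comparing the two lines of output on an affected board is a quick way to check whether this is the step where the text gets mangled.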
I wrote the thing that does the BBCodes in 3.2. Unless it broke yours, in which case it was somebody else with a similar name.


Re: Encoding issues

Post by Ger » Wed Nov 15, 2017 1:28 pm

I'll be darned.

The one that kept bugging me is working for me now; I'll go check with the other users.

Thanks!!
