Tear my hair out time.

I have a csv file that contains text strings that I wish to display in a  
web page.

The csv file is UTF-8, and the text strings include the British pound
symbol (£), encoded as the two bytes 0xc2 0xa3.

I'm using

setlocale(LC_CTYPE|LC_COLLATE, "en_GB.UTF8");

before reading the csv file, which I hope means that the csv file is read  
as utf-8.
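For what it's worth, PHP strings are plain byte strings, so reading the file does not convert anything; the PHP manual only notes that fgetcsv() takes the locale into account while parsing. A minimal sketch for checking that the fields really are well-formed UTF-8 after reading, assuming a comma-delimited file. The helper names is_valid_utf8() and read_utf8_csv() and the temporary demo file are made up for illustration, not taken from the original script:

```php
<?php
// Hypothetical helper (not from the thread): true if $s is well-formed UTF-8.
// preg_match() with the /u modifier fails (returns false) on invalid UTF-8.
function is_valid_utf8($s)
{
    return preg_match('//u', $s) === 1;
}

// Hypothetical helper: read every row of a CSV file, warn about any field
// that is not well-formed UTF-8, and return the rows unchanged.
function read_utf8_csv($path)
{
    $rows = array();
    $fh = fopen($path, 'r');
    if ($fh === false) {
        return $rows;
    }
    while (($row = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        foreach ($row as $field) {
            // A blank line yields array(null), so skip null fields.
            if ($field !== null && !is_valid_utf8($field)) {
                trigger_error('Field is not valid UTF-8: ' . bin2hex($field), E_USER_WARNING);
            }
        }
        $rows[] = $row;
    }
    fclose($fh);
    return $rows;
}

// Demo with a temporary file standing in for the real CSV.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "item,price\nTea,\xc2\xa3" . "2.50\n");
$rows = read_utf8_csv($path);
unlink($path);
echo bin2hex($rows[1][1]), "\n";   // c2a3322e3530
```

If no warning is raised, the bytes coming out of the CSV are fine and the problem is further down the line.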

Then I feed the string through htmlentities() before adding it to the web
page.

However, the web page that arrives at the client has &Acirc;&pound; (which
displays as Â£) instead of just &pound; (£).

I'm not sure where it's going wrong, partly because right now I may be  
too tired to work out where and how I can inspect the string without  
character encodings getting in the way.

If I print_r the data that has been read in to the web page, that shows  
ok, but at that point it's still utf-8, not an html entity.
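One way to look at a string without any character encoding getting in the way is to dump its bytes as hex. A small sketch, where hex_dump_string() is a made-up helper name and the ISO-8859-1 argument is spelled out only to show what an ISO-8859-1 conversion does to the two UTF-8 bytes:

```php
<?php
// Hypothetical helper: show the raw bytes of a string as space-separated
// hex pairs, so neither the browser nor the terminal can reinterpret them.
function hex_dump_string($s)
{
    return implode(' ', str_split(bin2hex($s), 2));
}

$pound = "\xc2\xa3";

// The raw UTF-8 bytes of the pound sign.
echo hex_dump_string($pound), "\n";   // c2 a3

// Treating those two bytes as ISO-8859-1 turns each byte into its own entity.
echo htmlentities($pound, ENT_COMPAT, 'ISO-8859-1'), "\n";   // &Acirc;&pound;
```

Sending the output as text/plain, or viewing the page source, keeps the browser from decoding the entities again.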

The following is at http://www.sined.co.uk/tmp/pound.php and seems to  
demonstrate the issue:

<?php
setlocale(LC_CTYPE|LC_COLLATE, "en_GB.UTF8");
$str1 = "\xc2\xa3";
$str2 = htmlentities( "$str1" );
echo <<< EOT
<!doctype html>
<html lang="en">
    <meta charset="utf-8">
    <title>Broken Pound</title>
    <p>$str2</p>
</html>
EOT;

I'm not sure how to fix this. Ideas anyone?

Denis McMahon, denismfmcmahon@gmail.com

Re: encoding

On Thu, 26 Jun 2014 05:50:06 +0000, Denis McMahon wrote:

Fix was:  

htmlentities( $string, ENT_COMPAT, "UTF-8" );

Not sure if I actually need the setlocale or not. Seems to work without it.

ENT_HTML5 isn't supported in my server distro's current php (5.3) ...  
mutter mutter
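To avoid forgetting the third argument somewhere, the call can be wrapped once. A minimal sketch, assuming all input strings are UTF-8; the function name h() is made up for illustration, and only ENT_COMPAT is used so it should also run on PHP 5.3:

```php
<?php
// Hypothetical wrapper (not from the thread): always pass the charset
// explicitly so the result does not depend on the PHP version's default.
function h($string)
{
    return htmlentities($string, ENT_COMPAT, 'UTF-8');
}

echo h("\xc2\xa3"), "\n";                        // &pound;
echo h("Price: \xc2\xa3" . "5 & up"), "\n";      // Price: &pound;5 &amp; up
```

Every string that goes into the page then passes through h() instead of a bare htmlentities() call.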

Denis McMahon, denismfmcmahon@gmail.com

Re: encoding

Denis McMahon wrote:

In PHP < 5.4 the default of the third parameter (the character set) is
'ISO-8859-1', so setting this parameter appropriately is important when
$string may contain non-ASCII characters.  For instance:

  htmlentities("\xC3\xA4", ENT_COMPAT, 'ISO-8859-1');
  // => '&Atilde;&curren;'

  htmlentities("\xC3\xA4", ENT_COMPAT, 'UTF-8');
  // => '&auml;'

Christoph M. Becker
