PCRE inconsistency

I'm trying to figure out why two servers give a different response
when preg_replacing a UTF-8 string.

Both machines: php 5.2.9, mbstring on...

My string is, in this example, a UTF-8 encoded pound sign (£).

So: $str = "£" (literally "\xc2\xa3")

Machine one  :
preg_replace('/foo/', $str, $original)  becomes "\xc2\xa3" or, in
UTF-8, a pound sign.

Machine two :
preg_replace('/foo/', $str, $original)  becomes "\xe2\xa3" or garbage

  - note the first byte of the second example is \xe2 not \xc2 as I'm

As far as I can tell both machines are set up exactly the same, same
php.ini, same PCRE (7.8 I believe), same php (5.2.9)...   the only
difference is that the one that works is 32bit and the one that
doesn't is 64bit.

In both cases, if I output the string using:

print htmlentities( $sFromRegex, ENT_COMPAT, 'UTF-8' );

... then I get £, as I guess you'd expect.

I don't know if this is the best place to post this but any advice (or
advice on where else I could ask the question) would be greatly
appreciated.


Re: PCRE inconsistency

Compare the output of the systems' "locale" command on each.
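
A rough sketch of how that comparison might be done from inside PHP itself: run the same script on both machines and diff the output. The helper name hex_bytes() and the sample value of $original are made up for illustration (the post never shows $original); everything else is a stock PHP function or constant. One thing possibly worth keeping in mind while comparing: in ISO-8859-1, byte 0xC2 is "Â" and 0xE2 is "â", so a \xc2 turning into \xe2 looks like what a locale-dependent, single-byte lowercase step would produce. That is only a guess, not something the post confirms.

```php
<?php
// Hypothetical helper (not from the original post): show a string's raw bytes as hex.
function hex_bytes($s)
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Environment details worth comparing between the two machines.
echo 'PHP:       ', PHP_VERSION, "\n";
echo 'PCRE:      ', PCRE_VERSION, "\n";
echo 'int size:  ', PHP_INT_SIZE, " bytes\n";        // 4 on 32-bit builds, 8 on 64-bit
echo 'LC_CTYPE:  ', setlocale(LC_CTYPE, '0'), "\n";  // "0" only queries the current locale

$str      = "\xc2\xa3";      // UTF-8 encoded pound sign
$original = 'price: foo5';   // placeholder subject, assumed for this sketch
$result   = preg_replace('/foo/', $str, $original);

// Compare raw bytes rather than rendered characters.
echo 'replacement: ', hex_bytes($str), "\n";
echo 'result:      ', hex_bytes($result), "\n";
```

If the "result" bytes already differ between the two machines at this point, the difference is in preg_replace or the environment; if they match, whatever changes \xc2 to \xe2 is probably happening in a later step of the real code.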