setlocale and regular expressions



Hi everybody.

I've got a problem combining a locale setting with regular expressions.

I can set the locale without any trouble, using the LC_ALL category.

In Perl, setting the LC_CTYPE locale also affects regexes, so I was trying
to apply the simple pattern [a-z] to Polish characters, like this:

setlocale(LC_ALL, 'pl_PL');

// testing purposes
//$locale_info = localeconv();
//echo strftime("%A %e %B %Y", mktime(0, 0, 0, 16, 6, 2005));

if (preg_match('/^[a-z]$/', 'aezztest'))
    echo 'works ;-)';

Maybe I'm doing something wrong?
Has anyone succeeded in using a locale with regexes?
'locale -a' also lists pl_PL.iso88592, but when I use it I get the
same result as with pl_PL.

thanks in advance for any help
best regards

Re: setlocale and regular expressions

if (preg_match('/^[a-z]$/', 'aezztest'))

Wow, Google did some magic here ;-) Polish characters were meant to be
there, in place of 'aezztest'.

Whatever ;-) The problem is that this regex isn't working...

Re: setlocale and regular expressions


Just in case you didn't know: Google's G2 thingy is extremely borken.


That's because [a-z] is equivalent to [\x61-\x7a].

If you want to match lowercase extended characters in iso8859-2
encoding, you need to generate a character class like:
(and lots more I guess).
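
A minimal sketch of such a class, covering only the nine Polish lowercase letters and assuming the subject string is ISO-8859-2 encoded (one byte per character). The constant name PL_LOWER_CLASS and the helper is_polish_lowercase() are made up for illustration, and the type hints assume PHP 7 or later:

```php
<?php
// Assumption: input is ISO-8859-2, so every letter is exactly one byte.
// Byte values of the Polish lowercase letters in ISO-8859-2:
//   a-ogonek=B1  c-acute=E6  e-ogonek=EA  l-stroke=B3  n-acute=F1
//   o-acute=F3   s-acute=B6  z-acute=BC   z-dot=BF
// The single quotes keep the \x escapes literal so that PCRE, not PHP,
// interprets them inside the character class.
const PL_LOWER_CLASS = '[a-z\xb1\xe6\xea\xb3\xf1\xf3\xb6\xbc\xbf]';

// Hypothetical helper: true if $s contains only Latin or Polish
// lowercase letters (one or more of them).
function is_polish_lowercase(string $s): bool
{
    // D makes $ match only at the very end of the subject.
    return preg_match('/^' . PL_LOWER_CLASS . '+$/D', $s) === 1;
}

// "zazolc" with Polish diacritics, spelled as ISO-8859-2 bytes so this
// file can stay plain ASCII.
$word = "za\xbf\xf3\xb3\xe6";

var_dump(is_polish_lowercase($word));   // bool(true)
var_dump(is_polish_lowercase('Test'));  // bool(false)
```

Note the + quantifier: without it, a pattern such as /^[a-z]$/ can only ever match a single character, whatever the locale.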

Re: setlocale and regular expressions

so the conclusion is:

Perl's locale != PHP's locale

In Perl, if a Polish locale is set, [a-z] matches all the characters,
both Latin and Polish.

Since PHP uses PCRE, maybe it should also honor the locale when matching.
It would be extremely useful.
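
One way to sidestep the locale question, sketched here under the assumption that the text can be handled as UTF-8 rather than ISO-8859-2 (a different technique from the byte class suggested earlier in the thread), is PCRE's Unicode mode with the \p{Ll} property. The helper name is_lowercase_word() is made up, and the "\u{...}" string escapes assume PHP 7 or later:

```php
<?php
// Hypothetical helper: true if $s is one or more lowercase letters
// in any script, independent of the current locale.
function is_lowercase_word(string $s): bool
{
    // u: treat pattern and subject as UTF-8.
    // \p{Ll}: any Unicode lowercase letter.
    // D: $ matches only at the very end of the subject.
    // preg_match() returns false for invalid UTF-8, hence the === 1 check.
    return preg_match('/^\p{Ll}+$/uD', $s) === 1;
}

// "zazolc" with Polish diacritics, written with code point escapes:
// U+017C z-dot, U+00F3 o-acute, U+0142 l-stroke, U+0107 c-acute.
$word = "za\u{017C}\u{00F3}\u{0142}\u{0107}";

var_dump(is_lowercase_word($word));   // bool(true)
var_dump(is_lowercase_word('Test'));  // bool(false)
```

Unlike a hand-written byte class, this accepts lowercase letters from every script, which may be broader than wanted.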

