Splitting merged words but not www addresses (regexp)

Is there any way to split all merged words except www and e-mail addresses?

I have this regexp:

preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com any,merged.words mymail@domain.com")

It gives me an incorrect result:
www. google. com any, merged. words mymail@domain. com

I need this result:
www.google.com any, merged. words mymail@domain.com

In my case, all web addresses have www. or http:// at the beginning of the
string, and e-mail addresses of course have @ inside the string.

Is it possible to write a regexp like this?

Re: Splitting merged words but not www addresses (regexp)


No. You would normally use a lookbehind assertion in cases like this, but
in PCRE the assertion has to be fixed length. Since a domain name can have
any number of characters, a lookbehind can't cover it.
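
To make the limitation concrete, here is a small sketch (the test string is shortened from the question, and the variable names are only for illustration). A fixed-length lookbehind compiles but protects only the dot right after "www"; a lookbehind that tries to cover the whole domain is rejected by PCRE:

```php
<?php
// Fixed-length lookbehind: compiles, but only sees the three characters before the dot.
$fixed = preg_replace('/(?<!www)([,.])(\w)/', '$1 $2', 'www.google.com any,merged.words');
echo $fixed, "\n"; // www.google. com any, merged. words

// Unbounded lookbehind: PCRE refuses to compile the pattern, so preg_replace()
// returns null (the warning is silenced with @ for this demo only).
$variable = @preg_replace('/(?<!www\.[\w.]+)([,.])(\w)/', '$1 $2', 'www.google.com any,merged.words');
var_dump($variable); // NULL
```

The second dot in the address is still split in the first case, which is why the lookbehind route does not get you the result you want.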

What you can do is first search for domain names and email addresses,
replacing them with some placeholders, fix the merged words, then replace
the placeholders again. Example:

// Wrap a matched address in a placeholder that contains no commas or periods.
function encode($m) { return "###" . base64_encode($m[0]) . "###"; }
// Turn a placeholder back into the original address.
function decode($m) { return base64_decode($m[1]); }

$s = "www.google.com any,merged.words mymail@domain.com";
$s = preg_replace_callback('/\bwww\.[\w\.]+/', 'encode', $s);     // hide web addresses
$s = preg_replace_callback('/\b[\w\.]+@[\w\.]+/', 'encode', $s);  // hide e-mail addresses
$s = preg_replace('/([,.])(\w)/', '$1 $2', $s);                   // split merged words
$s = preg_replace_callback('/###(.*?)###/', 'decode', $s);        // restore addresses

echo $s;
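
Since the question mentions that addresses can also start with http://, the same placeholder idea could be wrapped in one function along these lines. This is only a sketch: the function name split_merged_words and the combined pattern are my own assumptions, not tested against real-world text, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: hides addresses, splits merged words, restores addresses.
function split_merged_words(string $text): string
{
    // Assumed pattern: http(s):// URLs, www. addresses, and simple e-mail addresses.
    $protected = '~\bhttps?://[^\s"<>]+|\bwww\.[\w.-]+|\b[\w.-]+@[\w.-]+~i';

    // Step 1: replace each address with a base64 placeholder (no commas or periods).
    $hidden = preg_replace_callback(
        $protected,
        function (array $m): string { return '###' . base64_encode($m[0]) . '###'; },
        $text
    );

    // Step 2: add a space after a comma or period that is directly followed by a word character.
    $spaced = preg_replace('/([,.])(\w)/', '$1 $2', $hidden);

    // Step 3: decode the placeholders back into the original addresses.
    return preg_replace_callback(
        '/###(.*?)###/',
        function (array $m): string { return base64_decode($m[1]); },
        $spaced
    );
}

echo split_merged_words('http://www.google.com www.google.com any,merged.words mymail@domain.com'), "\n";
// http://www.google.com www.google.com any, merged. words mymail@domain.com
```

Two caveats with this sketch: an address followed directly by a period and more text (for example "www.google.com.Next") is swallowed whole by the address pattern, and input that already contains "###" would confuse the restore step.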

Re: Splitting merged words but not www addresses (regexp)

On Tue, 28 Sep 2004 23:37:13 -0400, Chung Leong wrote:


Thanks a lot! It is a great solution. I had been searching for a long time!
