The situation is basically what I wrote in the quoted text, just with
windows-1252 (Windows Latin 1) as the encoding. The encoding is unable to
represent any Arabic letters.
The encoding is specified in a <meta> tag, and HTTP headers are silent about
encoding, so it would be almost trivial to change the encoding to utf-8, by
modifying the <meta> tag and by replacing all non-ASCII characters (such as
the copyright sign) by entity or character references (such as &copy;).
ASCII data constitutes utf-8 data too.
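
As a rough sketch of that conversion (in Python, purely for illustration;
the function name and the sample text are made up here, not taken from your
page), the page bytes can be decoded as windows-1252 and every non-ASCII
character rewritten as a numeric character reference. The <meta> tag would
still have to be edited separately.

```python
def to_ascii_with_char_refs(page_bytes: bytes) -> bytes:
    """Re-encode a windows-1252 page as ASCII plus &#number; references."""
    text = page_bytes.decode("cp1252")
    # "xmlcharrefreplace" writes each non-ASCII character as &#NNN;
    return text.encode("ascii", errors="xmlcharrefreplace")


sample = "Copyright \u00a9 Example".encode("cp1252")
converted = to_ascii_with_char_refs(sample)
print(converted)  # b'Copyright &#169; Example'
# Pure ASCII output decodes identically as utf-8.
print(converted.decode("utf-8"))
```

Since the result contains only ASCII bytes, it reads the same whether it is
labeled as windows-1252 or as utf-8.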
But there's probably much more to be done on the server side, in the form
handler (confirmation.php). It would need to be modified so that it can read
utf-8 data and process it meaningfully.
The bad news is that PHP does not support utf-8 yet, except in fairly
1) Let the page be windows-1252 encoded, and just be prepared to get
stuff like &#1576;. If you pass such references into an HTML document, _without_
encoding the "&" in any way, they will appear as the characters they denote
by HTML rules. (This is actually the way people have built, probably by
accident, a poor man's Unicode support into one of the most popular web-based
discussion forums in Finland, suomi24.fi.) There is no guarantee that this
will work, but it happens to work in most situations.
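
A minimal sketch of alternative 1, written in Python only for illustration
(your handler is PHP; the function name and the sample input are
hypothetical): escape the submitted text for HTML as usual, but put back the
"&" of decimal character references so that they still display as the
characters they denote.

```python
import html
import re

# Matches a decimal reference whose "&" html.escape() has turned into "&amp;".
ESCAPED_NUMERIC_REF = re.compile(r"&amp;#([0-9]+);")


def escape_keeping_numeric_refs(value: str) -> str:
    """Escape markup-significant characters but keep &#number; references."""
    escaped = html.escape(value, quote=True)
    return ESCAPED_NUMERIC_REF.sub(r"&#\1;", escaped)


submitted = "<b>&#1576;</b> & more"
print(escape_keeping_numeric_refs(submitted))
# &lt;b&gt;&#1576;&lt;/b&gt; &amp; more
```

Note that a user who literally types &#1576; cannot be told apart from a
browser that generated the reference, which is one reason why the approach
comes without guarantees.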
2) Make the Arabic page windows-1256 (Windows Arabic) or iso-8859-6 (ISO
Latin/Arabic) encoded. Your form handler will then get Arabic letters in the
specified 8-bit encoding. This in principle restricts input to characters
representable in the chosen encoding, but in practice you usually get
&#number; stuff for other characters.
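
For alternative 2, the handler would first interpret the submitted bytes in
the page's encoding and then, if desired, resolve any &#number; references
that browsers send for characters outside it. The following is a sketch in
Python under those assumptions (the function name is hypothetical, and PHP
code would look different):

```python
import re

NUMERIC_REF = re.compile(r"&#([0-9]+);")


def decode_form_value(raw: bytes, encoding: str = "cp1256") -> str:
    """Decode 8-bit form data and resolve decimal character references."""
    text = raw.decode(encoding)

    def resolve(match):
        code_point = int(match.group(1))
        # Leave out-of-range numbers untouched rather than raising.
        return chr(code_point) if code_point <= 0x10FFFF else match.group(0)

    return NUMERIC_REF.sub(resolve, text)


# An Arabic letter sent as a windows-1256 byte, plus a reference for a
# character that windows-1256 cannot represent.
raw = "\u0628".encode("cp1256") + b" &#20013;"
print(decode_form_value(raw) == "\u0628 \u4e2d")  # True
```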
P.S. Your form has a single-line input field for "Address", which is
probably for a postal address, since you also have "E-mail". Normally you
should reserve a textarea of six lines for input of a postal address, but in
this case, _if_ you include the postal address input (why?), then I think
you should have two textareas, one for the address in Latin letters and one
for the possible address in the local writing system. According to the
Universal Postal Union, a letter sent e.g. to an Arabic-speaking country
from abroad should have the recipient address in two ways, in Latin letters
and in Arabic letters.
Jukka K. Korpela ("Yucca")