Posted by Ben C on January 20, 2008, 4:11 pm
> Scripsit Ben C:
[...]
>> Directionality should just work-- the characters are stored from
>> "start" to "end" and it's up to the browser to lay them out
>> right-to-left or left-to-right where appropriate.
>
> It's not quite that simple. Characters have inherent directionality, and
> browsers are supposed to observe it, but they sometimes fail, and then
> there's the problem that directionality is more than that. It also
> involves things like table column layout direction, default alignment
> (left vs. right), placement of vertical scroll bar, etc. Thus, anyone
> authoring in a right to left language should use <html dir="rtl">, and
> any texts with the opposite direction should have their own dir
> attribute.
I'm not saying browsers are perfect! They mostly don't implement the
rules for rtl quite correctly; for example, they don't all adjust
margin-left rather than margin-right when the width properties are
over-constrained.
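To make the margin rule concrete, here is a rough Python sketch of how I read CSS 2.1 section 10.3.3 for a block box where none of the values is auto. It assumes zero borders and padding, and the function name is made up for the example; it is not taken from any browser.

```python
def resolve_overconstrained(containing_width, margin_left, width,
                            margin_right, direction="ltr"):
    """Return the used (margin_left, margin_right) for a block box.

    Simplifying assumptions: no value is 'auto', and borders and
    padding are zero, so the constraint is
    margin_left + width + margin_right == containing_width.
    """
    if direction == "rtl":
        # In rtl the specified margin-left is ignored and recomputed.
        margin_left = containing_width - width - margin_right
    else:
        # In ltr the specified margin-right is ignored and recomputed.
        margin_right = containing_width - width - margin_left
    return margin_left, margin_right


print(resolve_overconstrained(500, 50, 300, 50, "ltr"))
print(resolve_overconstrained(500, 50, 300, 50, "rtl"))
```

A browser that always recomputes margin-right, whatever the direction, gets the second case wrong.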
The point is just that I would expect the character order to be the same
in the Word document and in the HTML, and to correspond to the order in
which the author typed the characters. So it shouldn't present any
new or peculiar problems.
The OP might have been worried that Word would have rearranged the actual
character order to produce right-to-left text. That isn't how it works,
but it isn't always so obvious unless you have more experience with these
things.
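As a rough illustration of the stored-order point, here is a small Python sketch (the sample string is invented for the example). It lists each character in the order it is stored, together with the bidirectional class Unicode assigns to it. Nothing gets rearranged in memory; laying the Arabic run out right-to-left is the renderer's job.

```python
import unicodedata

# Hypothetical sample: Latin text followed by four Arabic letters,
# in the order an author would type them (logical order).
sample = "abc \u0627\u0628\u062a\u062c"


def bidi_classes(text):
    """Return (character, bidi class) pairs in stored order."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
```

The Latin letters come out as class L and the Arabic letters as class AL, which is the "inherent directionality" mentioned above.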
> I just made a simple text using Word 2002. I entered some Arabic
> letters, then asked Word to save it in HTML format, as filtered. Here's
> a key part of the output:
>
><meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> ...
><p class=MsoNormal><span lang=AR-SA
> dir=RTL>ابتج</span></p>
>
> So Word decided, on my behalf, that the text is in Arabic as used in
> Saudi Arabia, and it inserted both a lang attribute and a dir attribute.
> Since it's using iso-8859-1 (due to its defaults), it converted the
> Arabic letters to character references. This is not that bad.
It could be worse.
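For what it's worth, that kind of conversion is easy to reproduce in Python. This is only a sketch of the general mechanism (numeric character references for anything the target encoding can't hold), not a claim about how Word does it internally; the function name is made up.

```python
def to_latin1_html(text):
    """Encode text as ISO-8859-1, replacing characters the encoding
    cannot represent with decimal numeric character references."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


arabic = "\u0627\u0628\u062a\u062c"  # the four Arabic letters from the test
print(to_latin1_html(arabic))  # b'&#1575;&#1576;&#1578;&#1580;'
```

The references appear in the same order as the original letters, so the logical order survives the change of encoding.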
>> An interesting question though is whether your authors have used
>> special characters like RLO and RLE, and whether if they have Word
>> will save them out as the Unicode characters.
>
> That might be a problem... but in my test, RLO doesn't seem to work even
> in Word, so why would an author use it?
Then all is well and the OP need not worry about these characters.
They do work (at least to some extent; I have not done extensive tests)
in Firefox and Opera. But on the WWW it's perhaps better to use the CSS
direction and unicode-bidi properties instead: CSS 2.1 says they are
supposed to work, but as far as I know there's nothing that says that
HTML UAs have to support the RLO etc. characters.
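If such characters did turn up, swapping them for markup could be scripted. Below is a minimal Python sketch using the correspondence CSS 2.1 gives between the embedding/override characters and unicode-bidi. It assumes the input is plain text holding a single paragraph (not existing HTML), and the function name is invented for illustration.

```python
import html

# Opening controls mapped to the CSS 2.1 declarations that correspond to them.
CSS_FOR_CONTROL = {
    "\u202a": "direction: ltr; unicode-bidi: embed",          # LRE
    "\u202b": "direction: rtl; unicode-bidi: embed",          # RLE
    "\u202d": "direction: ltr; unicode-bidi: bidi-override",  # LRO
    "\u202e": "direction: rtl; unicode-bidi: bidi-override",  # RLO
}
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING, closes the innermost run


def controls_to_spans(text):
    """Replace explicit bidi controls in plain text with styled spans."""
    out = []
    depth = 0
    for ch in text:
        if ch in CSS_FOR_CONTROL:
            out.append(f'<span style="{CSS_FOR_CONTROL[ch]}">')
            depth += 1
        elif ch == PDF:
            if depth > 0:  # ignore a stray PDF with nothing to close
                out.append("</span>")
                depth -= 1
        else:
            out.append(html.escape(ch))
    out.append("</span>" * depth)  # close runs left open at end of text
    return "".join(out)


print(controls_to_spans("abc \u202eDEF\u202c ghi"))
```

Whether the result renders correctly still depends on the browser's CSS bidi support, of course.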