XML encoding



I've written a PHP class which can output a UK position as XML.

The code snippet is as follows:

$crlf = "\r\n";

$xml = "";
$xml = $xml."<?xml version=\"1.0\" encoding=\"UTF-8\"?>".$crlf;
$xml = $xml."<UK_Location xmlns:xsi=\"www.w3.org\">".$crlf;
$xml = $xml."<OS_X>".$this->osX."</OS_X>".$crlf;
// Convert the XML from ISO-8859-1 to the UTF-8 text encoding scheme
$xml = utf8_encode($xml);

The output from Firefox is as expected and as I want:

<?xml version="1.0" encoding="UTF-8"?>
<UK_Location xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance ">
<Postcode>W1D 5BT</Postcode>

But from IE, I get a '-' at the start of the 2nd line.

   <?xml version="1.0" encoding="UTF-8" ?>
- <UK_Location xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance ">
   <Postcode>W1D 5BT</Postcode>

This makes the XML invalid. Why do I get this, and what can I do about it?
Please note that the '-' is there even if I leave the utf8_encode() line out;
my initial supposition was that utf8_encode() was causing the problem.


Re: XML encoding

However, I just discovered that if I save the file from IE, the '-' disappears!
This PC only has IE 6.0 on it; might it just be an XML rendering problem in an
older version of IE?

Re: XML encoding


No, it is IE's way of telling you that you can fold (collapse) that region.
The '-' belongs to IE's tree view, not to your XML. Just click the '-'.

Have a nice weekend,
Willem Bogaerts

Application smith
Kratz B.V.

Re: XML encoding

In reply to Willem Bogaerts:

Oops. Spot the person who almost never uses IE!!

Thanks for that.

Re: XML encoding



Just some additional "style hints":

1) Is there any specific need for a CRLF line break? Usually an LF ("\n")
is more than enough. You could also define your line break as a
constant, which is what I usually prefer in such cases:

define('CRLF', "\r\n");

2) There are many ways to write an XML string with embedded variables.
Personally I find the one you use the least readable because of all the
concatenations and the escaping. Even with syntax highlighting it still
looks ugly to me. SGML/XML also allow single quotes around attribute
values, so some possible alternatives would be for example:

$xml .= '<?xml version="1.0" encoding="UTF-8"?>'.CRLF;

or simply

$xml .= "<?xml version='1.0' encoding='UTF-8'?>\n";

sprintf() can also be very helpful for more complex strings with many
embedded variables or expressions.

3) You could also have a look at the DOM extension to create your XML.
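
The sprintf() suggestion from hint 2 could look something like the sketch below. The helper name xmlElement() is hypothetical and not from the thread; the htmlspecialchars() call is an added assumption so that values containing & or < do not break the markup.

```php
<?php
// Hypothetical helper (not from the thread): builds one XML element with sprintf().
function xmlElement(string $name, string $value): string
{
    // ENT_XML1 makes htmlspecialchars() escape &, <, > and quotes for XML output.
    $escaped = htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return sprintf("<%s>%s</%s>\n", $name, $escaped, $name);
}

$xml  = "<?xml version='1.0' encoding='UTF-8'?>\n";
$xml .= "<UK_Location>\n";
$xml .= xmlElement('Postcode', 'W1D 5BT');
$xml .= "</UK_Location>\n";

echo $xml;
```

The format string keeps the markup in one place, so no quote escaping or long concatenation chains are needed.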

Just my 3 cents.
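
For hint 3, a minimal sketch with PHP's DOM extension might look like this. It assumes the extension is installed; the element names and the sample postcode are taken from the output shown earlier in the thread, and the namespace attribute is left out for brevity.

```php
<?php
// Sketch only: build the document with the DOM extension instead of strings.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true; // add line breaks and indentation for readability

$root = $doc->createElement('UK_Location');
$doc->appendChild($root);

// Text nodes are escaped automatically when the document is serialized.
$postcode = $doc->createElement('Postcode');
$postcode->appendChild($doc->createTextNode('W1D 5BT'));
$root->appendChild($postcode);

$xml = $doc->saveXML();
echo $xml;
```

With this approach the XML declaration, escaping and closing tags are handled by the library, so the result is always well-formed.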


Re: XML encoding


If I recall correctly, end-of-line characters are always line feeds in
XML. It is in the XML standard.

Best regards

Re: XML encoding

In reply to Dikkie Dik:

Yes, that's how an XML processor is supposed to handle line breaks: they
are all normalized to a single LF, even if they're written as CRLF.

See section 2.11, "End-of-Line Handling", of the XML specification.

But of course you could also write all your XML without any line breaks
on a single line.
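
The end-of-line handling described above can be checked with a short sketch like the following. It assumes PHP's DOM extension is available; the <Note> element and its text are made up for illustration.

```php
<?php
// Illustrative input: every line break is written as CRLF.
$crlfXml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n"
         . "<Note>line one\r\nline two</Note>\r\n";

$doc = new DOMDocument();
$doc->loadXML($crlfXml);

// Text content as the parser hands it to the application.
$text = $doc->documentElement->textContent;

// Report whether any carriage return survived parsing.
var_dump(strpos($text, "\r") !== false);
```

A conforming parser should report that no carriage return is left in the parsed text, which is why the choice between CRLF and LF in the source makes no difference to the consumer.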


Re: XML encoding

In reply to Michael Fesser:

But not very human readable. Hence my line breaks.

Cheers again.

Re: XML encoding

In reply to Dikkie Dik:

Thanks Dikkie.

Re: XML encoding

Thanks for the info and hints. I'm new to PHP so your comments are helpful.

On defining the line break as a constant: I've been using something similar
to define '<br />', but didn't bother making a define for one function's
benefit.

On the alternative ways of writing the string: both are much clearer than
mine. I'll use the second and modify my code. I didn't know there was a .=
operator; useful.

On sprintf(): I remember that from my C days, many moons ago.

On the DOM extension: will do.

Thanks a lot.
