simplexml and CDATA

Can someone please enlighten me on how to preserve <![CDATA[]]>
sections when parsing an XML file or string with SimpleXML?  I'm using
libxml 2.6.16 and PHP 5.1.4.

I tried a few variations, I found at

  $xml = simplexml_load_string($newsMLString, 'SimpleXMLElement',
      LIBXML_NOCDATA);
  echo $xml->asXML();

  $xml = simplexml_load_string($newsMLString);
  echo $xml->asXML();

  $xml = simplexml_load_file('include/newsMLSpecs.xml');
  echo $xml->asXML();

  $xml = simplexml_load_file('include/newsMLSpecs.xml',
      'SimpleXMLElement', LIBXML_NOCDATA);
  echo $xml->asXML();

In all cases the CDATA section is stripped out.  I don't understand why
an XML parser would do that by default.  It could be my setup but not

I'm using the NewsML spec found here:

Any ideas?



Re: simplexml and CDATA

As far as I've been able to find out, this is not possible. However, I
don't think there is a *functional* difference between preserving the
CDATA and what simplexml does. In a CDATA section you can include, e.g.,
<i>some</i> HTML elements as-is. When the document is parsed, they do not
become separate nodes in the resulting DOM; they remain plain character
data. The XML that $xml->asXML() generates escapes the reserved
characters (<, >, &) instead. As far as I can tell, this results in
functionally the same XML.
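
A minimal sketch of that equivalence, for anyone who wants to check it. The two sample documents and the <body> element name are made up for illustration; simplexml_load_string(), the LIBXML_NOCDATA option and the string cast are standard PHP. LIBXML_NOCDATA is passed on purpose here: it merges the CDATA section into ordinary text, so both documents end up with the same character data.

```php
<?php
// Two hypothetical documents carrying the same text: one wraps it in a
// CDATA section, the other escapes the reserved characters by hand.
$withCdata   = '<item><body><![CDATA[<i>some</i> HTML & more]]></body></item>';
$withEscapes = '<item><body>&lt;i&gt;some&lt;/i&gt; HTML &amp; more</body></item>';

// LIBXML_NOCDATA merges CDATA sections into plain text nodes.
$a = simplexml_load_string($withCdata, 'SimpleXMLElement', LIBXML_NOCDATA);
$b = simplexml_load_string($withEscapes);

// Casting an element to string yields its unescaped text content.
$textA = (string) $a->body;
$textB = (string) $b->body;

// Compare the character data a consumer of either document would see.
var_dump($textA === $textB);
```

Note that this only shows the parsed text is the same; it says nothing about whether the CDATA markup itself survives a round trip through asXML().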

Of course, it would be nice if simplexml would remember that a given
element originally held CDATA contents and output it as such. Perhaps
you can file an enhancement request for this.


Re: simplexml and CDATA

In reply to Gertjan Klein:

I think the answer is to use DOM if your needs are not simple. ;)
It should be available wherever SimpleXML is.
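
A rough sketch of the DOM route, assuming a made-up sample document (the <item>, <body> and <note> names are illustrative, not taken from the NewsML spec). It relies only on the standard DOMDocument methods loadXML(), saveXML() and createCDATASection(). As far as I know, DOM keeps a CDATA section as its own node type when LIBXML_NOCDATA is not passed, so it is written back out as CDATA.

```php
<?php
// Hypothetical sample document; the element names are illustrative only.
$source = '<item><body><![CDATA[<i>some</i> HTML]]></body></item>';

$doc = new DOMDocument();
// No LIBXML_NOCDATA here, so CDATA sections stay as their own node type.
$doc->loadXML($source);

$body = $doc->getElementsByTagName('body')->item(0);

// The first child of <body> is a CDATA section node, not a plain text node.
$isCdata = $body->firstChild->nodeType === XML_CDATA_SECTION_NODE;

// Serialising only the root element leaves out the XML declaration.
$output = $doc->saveXML($doc->documentElement);

// New CDATA sections can also be created explicitly.
$note = $doc->createElement('note');
$note->appendChild($doc->createCDATASection('a < b & c'));
$doc->documentElement->appendChild($note);
$extended = $doc->saveXML($doc->documentElement);

echo $output, "\n";
echo $extended, "\n";
```

If most of the code already uses SimpleXML, dom_import_simplexml() can hand a single element over to DOM for the parts that need CDATA handling.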

Thanks Gertjan.
