- Posted on
- Joshua Beall
January 27, 2005, 9:30 am
I have been using the SAX library in PHP to parse XHTML documents, and one
thing I have noticed is that the <!DOCTYPE> line is ignored.
Is there any way to get the <!DOCTYPE> using the SAX
functions in PHP? I am looking over the manual, but nothing is jumping out
at me.
I have thought about loading it using DOM, but I'd rather not consume the
memory if possible. Another option would be to just use simple string
parsing methods to pull it out of the original document, but again I am
hoping that I would be able to do it somehow using the SAX functions. Any
chance of this?
Re: Getting the <!DOCTYPE> from an XHTML document that was parsed using SAX - possible?
The default handler (set with xml_set_default_handler()) handles "the XML
declaration, document type declaration, entities or other data for which no
other handler exists".
Sometimes reading the function descriptions helps, y'know ;)
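For anyone wanting to try this, here is a minimal sketch of registering a default handler and collecting whatever the parser hands to it. The helper name collect_default_data() is made up for illustration, and the code assumes a reasonably modern PHP (7 or later) for the closure and type hints. Whether the <!DOCTYPE> actually shows up in the collected chunks seems to depend on the PHP version and the underlying XML library, so treat this as something to experiment with, not a guaranteed answer.

```php
<?php
// Hypothetical helper: gathers every chunk of data that the SAX parser
// routes to the default handler (data no other handler claimed).
function collect_default_data(string $xml): array
{
    $chunks = [];
    $parser = xml_parser_create();

    // Keep names as written instead of folding them to upper case.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    // The default handler receives the parser and a string of raw data.
    xml_set_default_handler($parser, function ($parser, $data) use (&$chunks) {
        $chunks[] = $data;
    });

    // Parse the whole document in one call (third argument marks the final chunk).
    xml_parse($parser, $xml, true);

    return $chunks;
}
```

Dumping the result with var_dump(collect_default_data($xml)) for a small XHTML document should show which pieces (comments, the XML declaration, possibly the DOCTYPE) your build passes to the default handler.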
Re: Getting the <!DOCTYPE> from an XHTML document that was parsed using SAX - possible?
Unfortunately this function still ignores the <!DOCTYPE> declaration, despite
claims to the contrary in the manual. Works fine for comments and entities,
though. Might be a PHP bug; I'm running 5.0.3.
I've resorted to some simple string manipulation to get the pieces that SAX
won't let me at. When I have the time, though, I'll work up a short code
example and post a bug report to php.net.
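In case it helps anyone else, the string-manipulation fallback could look roughly like the sketch below. The function name extract_doctype() is hypothetical, the code assumes PHP 7.1 or later for the nullable return type, and the pattern is deliberately simple: it assumes there is no '>' inside the quoted identifiers and no ']' inside an internal subset, which holds for the standard XHTML doctypes but not for every XML document.

```php
<?php
// Hypothetical helper: returns the DOCTYPE declaration found in raw markup,
// or null when the markup does not contain one.
function extract_doctype(string $markup): ?string
{
    // Matches "<!DOCTYPE ...>" with an optional internal subset in [ ].
    // Assumes no '>' in quoted identifiers and no ']' inside the subset.
    $pattern = '/<!DOCTYPE\s+[^>\[]*(?:\[[^\]]*\]\s*)?>/i';

    return preg_match($pattern, $markup, $m) === 1 ? $m[0] : null;
}
```

Since the DOCTYPE has to appear before the root element, it is probably enough to pass in only the first few kilobytes of the file (for example via the offset and length arguments of file_get_contents()), which keeps the memory cost low compared with loading the whole document into DOM.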