domxml - parsing an xml file once per session rather than every time

I am working on an embedded system.  The entire configuration for the  
system is stored in an XML file, which is pretty long.

It takes about 3 seconds to open the file using domxml_open_file.

Breaking the file into smaller files is not possible; a single XML file  
is a part of the design requirement.

Right now we're opening and freeing the file every time any data is  
requested from the file, which is quite often.  This means that the user  
ends up waiting about 5 seconds total before the page is generated.

It would be nice if we could open the file once per session, and keep  
the file open throughout the session, rereading only when the file on  
disk changes.

Is this possible?



Re: domxml - parsing an xml file once per session rather than every time

CptDondo wrote:

Hi Yan,

Yes, this is possible, but I seriously doubt it will increase performance.
Say you have a 1 MB file.
You store it into the session like $_SESSION["hugestructure"] = <yourfile>.

When the script ends, the session is serialized and written to a session file
in the session directory.

Next time that session is needed, the whole file must be read back into
memory in $_SESSION["hugestructure"], even if you don't use it.

I expect the overhead of serializing the file and saving it to disk takes
even longer than just opening it when needed.
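
One way to check that expectation rather than guess is to time both paths on the target box. The following is only a rough sketch: the helper name average_seconds, the temporary files, and the synthetic XML-like content (a bit under 300 KB) are all stand-ins, not anything from the original system. It also leaves out the XML parse itself, which is likely the expensive step, so it only compares a plain disk read against a session-style serialize/write/read/unserialize round trip.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $fn.
function average_seconds(callable $fn, int $runs = 5): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

// Stand-in for the real configuration file (assumption: synthetic content).
$path = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($path, str_repeat("<item name=\"x\">value</item>\n", 10000));

// Path 1: read the file from disk on every request.
$readTime = average_seconds(function () use ($path) {
    return file_get_contents($path);
});

// Path 2: roughly what a session does with the same data on every request:
// serialize it, write it out, read it back, unserialize it.
$sessionFile = tempnam(sys_get_temp_dir(), 'ses');
$data = array('hugestructure' => file_get_contents($path));
$sessionTime = average_seconds(function () use ($data, $sessionFile) {
    file_put_contents($sessionFile, serialize($data));
    return unserialize(file_get_contents($sessionFile));
});

printf("plain read:         %.6f s\n", $readTime);
printf("session round trip: %.6f s\n", $sessionTime);

// Clean up the temporary files.
unlink($path);
unlink($sessionFile);
```

To include the parse cost, the first closure could call whatever parser the application really uses instead of file_get_contents.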

Personally I would rethink the design of the application. Do you really need  
such a huge file so often? Which information is used? Can you translate the  
file to a few tables in a database and just query what you need when you  
need it?

If this is no option for you, you might try a 'shared memory' approach in  
Here is more info:

I never did such a thing, so my advice ends here. :-)

Best of luck.
Erwin Moller

Re: domxml - parsing an xml file once per session rather than every time

Erwin Moller wrote:

Well, I was hoping for some sort of magical server-side caching where I  
could stash the $dom.

Passing it back and forth is not practical; we have an 11MBps network
that serves 600 nodes, so bandwidth is a *huge* concern.  We're already
using a compressed protocol to communicate and gateways and relays to  
minimize the impact of broadcasts.

The XML file itself is 300K; not really huge by modern standards, and on  
a normal server it would not be an issue. Alas, I am working with a  
200MHz embedded box with 32 MB RAM, so we're trying to squeeze as much  
as we can out of it.

We open the file once per page load to read configuration and data that  
pre-fills a form; and then possibly save any changes that the user has made.

A single human readable file with all of the information is a *huge*  
benefit for our customers; something we're not likely to give up.  I  
think all in all I'd rather have this particular customer wait a bit.

I may follow up on that in V2.  :-)  It looks interesting; I don't know  
if we can shove a $dom in there and retrieve it.
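
As far as I know, a parsed DOM object cannot usefully be stored and retrieved between requests, but the plain values pulled out of it can be. A minimal sketch of the "reread only when the file on disk changes" idea follows, with loud assumptions: cached_config and parse_settings are hypothetical names, the <setting name="..."> layout is invented for illustration, and the parser uses the PHP 5+ DOM extension rather than the PHP 4 domxml functions discussed in this thread. The cache is a serialized array in a side file, keyed on the XML file's modification time and size; the XML file stays the single source of truth.

```php
<?php
// Hypothetical helper: return the configuration as a plain array, calling
// $parse only when the XML file's mtime or size differs from the cached copy.
function cached_config(string $xmlPath, string $cachePath, callable $parse): array
{
    clearstatcache();
    $mtime = filemtime($xmlPath);
    $size = filesize($xmlPath);

    if (is_file($cachePath)) {
        $cached = @unserialize(file_get_contents($cachePath));
        if (is_array($cached)
            && isset($cached['mtime'], $cached['size'], $cached['data'])
            && $cached['mtime'] === $mtime
            && $cached['size'] === $size
        ) {
            return $cached['data']; // cache hit: no XML parsing at all
        }
    }

    // Cache miss or stale cache: parse once, then store the result.
    $data = $parse($xmlPath);
    $entry = array('mtime' => $mtime, 'size' => $size, 'data' => $data);

    // Write to a temporary name and rename it into place, so a concurrent
    // reader never sees a half-written cache file.
    $tmp = $cachePath . '.' . getmypid() . '.tmp';
    file_put_contents($tmp, serialize($entry));
    rename($tmp, $cachePath);

    return $data;
}

// Example parser (assumption: settings look like <setting name="x">v</setting>).
function parse_settings(string $xmlPath): array
{
    $doc = new DOMDocument();
    $doc->load($xmlPath);
    $settings = array();
    foreach ($doc->getElementsByTagName('setting') as $node) {
        $settings[$node->getAttribute('name')] = $node->textContent;
    }
    return $settings;
}
```

A page would then call something like cached_config($xmlPath, $cachePath, 'parse_settings') and pay the parse cost only after the XML file has been edited. Whether unserializing the array is fast enough on a 200MHz box would still need measuring.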


Re: domxml - parsing an xml file once per session rather than every time

On Wed, 06 Dec 2006 09:09:02 -0800, CptDondo wrote:

If you're feeling particularly adventurous, another potential approach
would be to write a PHP extension (in C, that is) that loads and parses
the XML file on module initialisation (effectively when the Web server
starts, assuming I'm right in remembering that you were using FastCGI),
stashing the parsed DOM somewhere, then creating a function or class to
access it from within each script.

What I don't know is exactly how you'd go about the XML parsing, since
I've never developed an extension that has to do that. I suspect you'd
have to use libxml2 directly for at least the initial parsing, and you'd
probably have to persist a libxml2 xmlDoc between requests rather than a
PHP object. The hard part's likely to be writing the function for
script-level access, since you're going to need to hook into ext/domxml to
create the DomDocument object for the PHP script to use.

It sounds like overkill to me, honestly, but given the power limitations
you're dealing with, it may be an option to look at if you need to save
every last cycle.


Adam Harvey

To e-mail: don't make an example out of me!
