parse huge XML file

Hi all,

I need help parsing a huge XML file under some ISP limits (I can't change
ISP, so that isn't a solution).

The ISP limits scripts to 16 MB of memory and 10 seconds of execution time.
Those parameters can't be changed (no set function and no config option is
available).
My file contains an export from different customers. Those exports can be
quite large (20 MB) and contain the full article descriptions and the
attached pictures.

For now I parse the XML files, record the values in my DB, and save the
pictures on a server.
Obviously, due to the size of the XML and the limits of the ISP, I often
run into a time-exceeded error.

There are many articles in the XML export, each with 0 to 10 pictures. My idea
is to "refresh" the page after each article has been created in the DB.
My questions are:
- how can I manage the refresh after each article?
- how can I read a block from the file that is big enough to avoid getting
only part of an article description?
I'm not very comfortable with XML parsing. Is there any PHP code that will
put everything in an array so I can manage this using structures or
whatever?

Thanks for help.


Re: parse huge XML file


Regarding the time limitations, you could try a driver script that
calls another script to perform an action on each file. Something like:

foreach (glob(dirname(__FILE__) . '/*.xml') as $file) {
    $url = 'http://' . $_SERVER['HTTP_HOST'] .
        '/admin/loadArticle.php?file=' . urlencode($file);
    readfile($url);   // hand the file off to the worker script over HTTP
}

The hope in this case is that the time spent waiting in readfile()
won't actually count toward the script time limit.
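On the other end, loadArticle.php (a hypothetical name taken from the URL
above) would just pick up the file name and do the actual work, so each file
gets its own 10-second / 16 MB budget. A rough sketch:

<?php
// Hypothetical /admin/loadArticle.php, called once per file by the driver
// above. Each HTTP request runs with a fresh 10-second / 16 MB budget.
$file = isset($_GET['file']) ? $_GET['file'] : '';
if ($file == '' || !is_file($file)) {
    exit('missing or unknown file');
}
// ... parse $file, insert the articles into the DB, save the pictures ...
echo 'done: ' . htmlspecialchars($file);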

As for the memory limits, you might try the expat-based XML parser
functions, which are event-driven. There's also the DOM extension, which
parses XML into a DOMDocument structure, but that loads the whole document
into memory, so it may hit the 16 MB limit on a 20 MB export.
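If it helps, here is a rough sketch of how the expat-based parse could look
for an export like yours. The <article> element name and the
one-field-per-element layout are just assumptions, since I haven't seen your
file; adjust them to whatever the real export uses.

<?php
$buffer  = '';        // character data of the element currently being read
$article = array();   // fields collected for the current article

function startElement($parser, $name, $attrs)
{
    global $buffer;
    $buffer = '';                      // new element: reset the buffer
}

function endElement($parser, $name)
{
    global $buffer, $article;
    if ($name == 'ARTICLE') {          // expat upper-cases names by default
        // a complete article has been read: insert $article into the DB
        // and save its pictures here
        $article = array();
    } else {
        $article[$name] = $buffer;     // e.g. DESCRIPTION, PICTURE, ...
    }
}

function characterData($parser, $data)
{
    global $buffer;
    $buffer .= $data;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_set_character_data_handler($parser, 'characterData');

// Read the file in small chunks so the whole 20 MB export never sits in memory.
$fp = fopen('export.xml', 'r');
while (!feof($fp)) {
    xml_parse($parser, fread($fp, 8192), feof($fp));
}
xml_parser_free($parser);
fclose($fp);

Because the handlers fire as the file is read, a complete article is always
available in the end-element handler, which answers the "how do I avoid
getting only part of an article" question without loading the whole file.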

