Reducing memory consumption


    I'm using PHP to run a CLI application. It's a script run by cron that
parses some HTML files (with DOM XML), and I ended up using PHP to integrate with
the rest of the code that already runs the website.

    The problem is: it's eating more memory than a black hole. It eats the
current limit of 256MB set in php.ini, in an application that would hardly
consume 4MB if written in C. I don't care if this application takes much longer
to run than it would in C, but eating that much memory is not acceptable.

    So, my question is, how do I find out what is eating that much memory?
I'm suspicious of memory leaks, or very stupid garbage collection. Any help?

Bruno Barberi Gnecco <>
This was the most unkindest cut of all.
        -- William Shakespeare, "Julius Caesar"

Re: Reducing memory consumption

Different sort of problem, but I struggled with a long-running script that
leaked a bit of memory on each loop, and after a few days/weeks was using
too much memory. What I had to do was "instrument" the PHP script,
inserting calls to report current memory usage at frequent intervals. There
are two ways to do this. memory_get_usage() might be available (depends
on how PHP was built), but I think it only reports memory used by PHP
allocators. (Didn't help me, because the leak turned out to be in a loaded
extension.) The other way (on Linux, for example) is to look at
/proc/meminfo for total memory usage. Do this often enough in your script,
and you should be able to narrow down where the memory is being lost.
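A minimal sketch of that kind of instrumentation, for illustration only. The helper name mem_checkpoint() is made up, and reading /proc/self/status (per-process resident size) instead of /proc/meminfo (system-wide totals) is an assumption about what is most useful on Linux. On other systems the RSS figure is simply reported as n/a.

```php
<?php
// Hypothetical helper: report memory use at a named checkpoint.
function mem_checkpoint($label)
{
    // Memory handed out by PHP's own allocator. This can miss memory
    // allocated directly by a loaded extension or a C library.
    $php = memory_get_usage();

    // Resident set size of the whole process as the kernel sees it (Linux only).
    $rss = 'n/a';
    $status = @file_get_contents('/proc/self/status');
    if ($status !== false && preg_match('/^VmRSS:\s+(\d+)\s+kB/m', $status, $m)) {
        $rss = (string) ((int) $m[1] * 1024);
    }

    $line = sprintf('%-20s php=%d rss=%s', $label, $php, $rss);
    echo $line, "\n";
    return $line;
}

mem_checkpoint('start');
$data = str_repeat('x', 1024 * 1024);   // stand-in for the real work
mem_checkpoint('after allocation');
unset($data);
mem_checkpoint('after unset');
```

If the php= figure stays flat while rss= keeps climbing, the growth is probably happening outside PHP's allocator, which is the situation described above.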

Re: Reducing memory consumption

ljb wrote:

    Thanks. I did this, and the memory is apparently being lost by
the INSERT query. prepare/execute leak *a lot* of memory, while a simple
query() still leaks, but much less.

    Thinking it might be related to a reported bug (memory leak when doing
loads of INSERTs and there are duplicate key errors), I added a SELECT to avoid
the errors. This had an extraordinary effect: memory consumption started
to decrease, and soon I was using negative memory. When the program ended,
memory_get_usage() was returning '-9415900'.

    Since this universe doesn't allow negative memory usage, I
wonder WTF is going on. I'm using MDB2 as DB frontend, BTW, which
may be the culprit here. What frees more memory than it allocated is
a call to query('SELECT ...'). Removing this call leads back to the
endless growing memory problem.

    Any ideas to find out what is causing this: mysql, php,

Bruno Barberi Gnecco <>
Cropp's Law:
    The amount of work done varies inversely
    with the time spent in the office.

Re: Reducing memory consumption


This effect is due to the original "black hole".
Your code has passed through a worm-hole into a parallel process which
assigns memory inversely :)

Re: Reducing memory consumption

Vince Morgan wrote:

    Of course! You just gave me the breakthrough I was looking for
to get my Nobel :) Or perhaps I can create my own 'infinite storage'
service, only the data will be stored in a write-only location ;)

Bruno Barberi Gnecco <>

Re: Reducing memory consumption


The infinite storage service is true genius Bruno, can't see how I didn't
see it myself.  Perhaps it may be possible to make this service read/write
with some considerable effort.  However, you would have to manage the
downloads carefully.  Allowing too many would quickly exhaust the available
If you should manage to achieve this I would expect a share of the financial
proceeds of course.  However, you are welcome to the glory.  I always get a
large volcanic zit in the center of my forehead an hour or so before
receiving prestigious international awards ;)

Re: Reducing memory consumption

Bruno Barberi Gnecco wrote:

Without knowing what your application does, it's impossible to tell.

But I know I've handled some very large files (i.e. log files, XML,  
etc.) in 8MB of memory without any problems.

I've even parsed a (rather poorly written) HTML page that's > 10MB and
still not run out of memory at 8MB.

Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.

Re: Reducing memory consumption

Jerry Stuckle wrote:

    Exactly, that's why I'm puzzled by this. What the application
does is very simple: it opens an IMAP connection, and for each email,
it parses the HTML body to extract some information out of it, and
saves this information into a database. The HTML files are less than
1MB, and the number of messages read is small (< 20). Since the information
is parsed by pieces, the memory used by it should peak at 10KB or 20KB.

    The parsing is done using DOM (not DOM XML, as I wrote before,
my mistake) and xpath queries. The parsing is done in a separate method,
so I was expecting that any memory allocated for parsing a message
would be freed before the next one is parsed. I'm using php 5.

    What did you use to parse your page? DOM? DOM XML? Something

    Any tips? Thanks!

Bruno Barberi Gnecco <>
It takes a smart husband to have the last word and not use it.

Re: Reducing memory consumption

Bruno Barberi Gnecco wrote:

No, I wasn't using DOM on this one - just stripping out the tags.

However, the DOM does a lot of things behind the scenes.  For instance,  
when you call DOMDocument::getElementsByTagName(), DOM will allocate an  
entire nodelist.  And this nodelist will contain everything under each  
node in the list.

So if you do something like:

   $doc = new DOMDocument;
   $doc->load('file.xml');   // load call assumed; the file name is a placeholder

You'll get the entire document into the DOMDocument.  Now, if you:

   $l1 = $doc->getElementsByTagName('level1');

You'll get a nodelist with all the level 1 tags.  But each entry in the  
nodelist will contain all of the elements under it - level 2, level 3,  
and so on.

So if you have a layout such as:

   <level1>
      <level2 />
      <level2>
         <level3 />
      </level2>
   </level1>

Your DOMDocument will contain all the items - but so will the nodelist.  
  Effectively you've about doubled the amount of memory being required.

If you now get the level2's, you'll have two entries - one which is just  
a level2, but the second one will have level2 and level3.

So you can see memory usage can increase a lot, especially if you have a  
lot of lower levels.

And BTW - depending on the amount of whitespace in your XML file, even  
the DOMDocument object may take more or less memory than the file itself.

The problem here is the DOMNodeList doesn't have a method to remove an  
entry from the list.  I don't know what


would do - but I don't think I'd try it.  I suspect the DOMNodeList  
would have problems with it.

The only thing I can recommend is to unset the nodelists themselves as  
soon as possible.  That should free up the memory used by them.
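As a rough sketch of that advice (not the original poster's code): the function name extract_cells() and the '//td' query are made up for illustration. The idea is to copy plain strings out of the node list, then release the list, the XPath object and the document before moving on to the next message. Locals are released when the function returns anyway, so the explicit unset() mainly matters in long functions or loops.

```php
<?php
// Hypothetical helper: extract the text of every <td> in one HTML body.
function extract_cells($html)
{
    $doc = new DOMDocument;
    // Real-world HTML is rarely valid, so silence the parser warnings.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//td');

    $cells = array();
    foreach ($nodes as $node) {
        $cells[] = trim($node->textContent);   // keep plain strings, not nodes
    }

    // Release the node list, the XPath object and the document explicitly.
    unset($nodes, $xpath, $doc);
    return $cells;
}

$before = memory_get_usage();
$cells  = extract_cells('<table><tr><td>a</td><td> b </td></tr></table>');
$after  = memory_get_usage();
printf("%d cells, PHP memory delta: %d bytes\n", count($cells), $after - $before);
```

Comparing the delta with and without the unset() line is a cheap way to check whether the node lists really account for the memory in a given script.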

Of course, there's another possibility here, also - that there's a  
memory leak in it.  I haven't seen one - but then I can't say as I've  
done anything as big as you are, and I haven't looked for problems.  And  
a search of the PHP bugs database doesn't show anything being reported.

Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.

Re: Reducing memory consumption

Jerry Stuckle wrote:

    As I mentioned in the other post, I found out that it isn't
DOM eating all the memory, but the SQL queries. Apparently I ran into
two bugs:

1) prepare/execute has a memory leak. This could be happening in MDB2
or in PHP itself, perhaps in the mysqli extension. This happens
consistently and eventually exhausts memory.

2) there is a problem in mysqli queries that seems to confuse the
allocated memory counting, but it's not a serious bug (i.e., it
doesn't crash). I successfully completed a long run of my script,
which added some 27k entries to the database. Despite the memory
count becoming negative, it didn't crash, and apparently there was no
corruption or unexpected results (not that I could see so far).

    In this successful #2 run, what I did was get the mysqli
connection from mdb2 (with getConnection()) and run mysqli_query()
directly (and OMG, how slow mdb2 is!). So this problem isn't in
MDB2: it's either in PHP itself or in the mysqli extension. My
*guess* is that PHP memory system is counting something wrong
when it allocates memory. I watched top(1) while the script
ran, and it didn't consume a lot of memory (10-16 MB), which is
a little more than I'd expect, but I was including MDB2 and
other stuff. If I didn't exhaust the memory first, I'd have
never noticed that the memory count was negative.

    I'm still at a loss as to whom I should report this bug
to. Any suggestions?

Bruno Barberi Gnecco <>
It's always darkest just before it gets pitch black.

Re: Reducing memory consumption

Bruno Barberi Gnecco wrote:

Yes, I read your other posts after I responded.

PHP bugs are managed at .  Pear bugs are at

I'm not sure which one it would be, either.  But you'll need to create  
the problem with a *small* test case so they can duplicate it.  
Otherwise they don't stand much of a chance of finding the bug.
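One possible shape for such a small test case, sketched under assumptions: measure_growth() is a made-up harness name, and the two closures are stand-in workloads rather than real database calls, since a real reproduction needs a live connection. The harness runs one step many times and reports how much PHP-tracked memory grew.

```php
<?php
// Hypothetical harness: run $step repeatedly and return the change in
// memory_get_usage() between the start and the end of the loop.
function measure_growth($step, $iterations)
{
    $step(0);                       // warm-up, so one-time allocations are not counted
    $start = memory_get_usage();
    for ($i = 1; $i <= $iterations; $i++) {
        $step($i);
    }
    return memory_get_usage() - $start;
}

// Stand-in workloads. In a real bug report the step would be a single
// prepare/execute of one INSERT, or a single plain query.
$kept  = array();
$leaky = function ($i) use (&$kept) { $kept[] = str_repeat('x', 1000); };
$clean = function ($i) { $tmp = str_repeat('x', 1000); };

printf("leaky step: %d bytes\n", measure_growth($leaky, 1000));
printf("clean step: %d bytes\n", measure_growth($clean, 1000));
```

Running the same harness once through MDB2 and once through mysqli directly would show which layer the growth follows, which helps decide where to file the report.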

Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
