gz compression rates with custom buffer callback



First, thanks to those who'll read me 'til the end. I know my code can
seem a bit messed up, that there may be pretty obvious solutions, and that I
may speak some bad English; but optimization madness brought me here. I'm
sure you'll understand :).

Basically, what I want to write is a classic output-buffer callback
function that gzencode()s the buffer. There's a native function for that,
I know, but I want a little more: compression stats (and maybe even more  

Here's the idea: (don't scream, I'll explain)

# gz stats (buffer callback)
function gz_cmp($buffer) {
     $buffer = preg_replace_callback(
         # pattern assumed from the description of $status below
         '/\$gz-stats=([01])\$/',
         create_function(
             '$status',
             'if( $status[1] ) {
                 # ' . ($gz_activated = true) . '
                 # ' . ($s_size = strlen($buffer)) . '
                 # ' . ($c_size = strlen(gzencode($buffer, 9))) . '

                 return ' . round((100 - $c_size / $s_size * 100), 1) . ' . \'%\';
             }'
         ),
         $buffer
     );

     if( $gz_activated ) {
         header('Content-Encoding: gzip');
         $buffer = gzencode($buffer);
     }

     return $buffer;
}

gz_cmp() = GZ Compression callback function
$status = array('$gz-stats=1', '1')
-> $status[1] = GZ Compression availability (1 or 0)
$gz_activated = $status[1], as a flag for the function's ending
$s_size = Plain-text buffer (document) size
$c_size = GZ compressed buffer (document) size
[weird formula] = Compression rate (in %)
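
To make the [weird formula] less weird, here is a minimal sketch of it on its own. The helper name compression_rate() is made up for the illustration, and the zero-size guard is an extra that gz_cmp() does not have:

```php
<?php
// Hypothetical helper: percentage saved by compression,
// same formula as in gz_cmp(): 100 - c_size / s_size * 100.
function compression_rate($s_size, $c_size) {
    if ($s_size <= 0) {
        return 0.0; // empty document: avoid a division by zero
    }
    return round(100 - $c_size / $s_size * 100, 1);
}

// A 10000-byte page that compresses down to 2000 bytes:
echo compression_rate(10000, 2000) . '%', "\n"; // 80%
```

So a rate of 80% means the compressed document is one fifth of the original size.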


Somewhere in the page, $gz-stats=1$ or $gz-stats=0$ is output, depending
on whether GZ compression is used or not (decided through a constant and a few checks
such as GZ module availability and the browser's Accept-Encoding HTTP header;
well, whatever). Of course, 1=Enabled and 0=Disabled.

Now the output buffer ends and hands over to the callback function: gz_cmp().
The first thing that comes to mind might be to
search for $gz-stats=x$ and THEN replace it with the stats, actually
encode the doc, send the Content-Encoding header, that kind of stuff; OR
simply return the buffer unchanged if $gz-stats=0$.

But, this means two regexp searches in the whole doc in case GZ is  
activated: one for the check of $gz-stats=x$, one for the replacement.
-> I want one.
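
For comparison, the two-pass version I'm trying to avoid would look roughly like this. It's only a sketch: the function name gz_two_pass() and the marker pattern are assumptions, it needs the zlib extension, and the headers_sent() guard is there just so it can also be run from the command line:

```php
<?php
// Two-pass variant: one search to read the flag, a second to replace the marker.
function gz_two_pass($buffer) {
    // Pass 1: is the marker there, and is it set to 1?
    $found = preg_match('/\$gz-stats=([01])\$/', $buffer, $status);
    if (!$found || $status[1] !== '1') {
        // Marker absent or set to 0: drop it (if any), send the page as-is.
        return str_replace('$gz-stats=0$', '', $buffer);
    }

    $s_size = strlen($buffer);
    $c_size = strlen(gzencode($buffer, 9));
    $rate   = round(100 - $c_size / $s_size * 100, 1) . '%';

    // Pass 2: swap the marker for the stats.
    $buffer = preg_replace('/\$gz-stats=1\$/', $rate, $buffer);

    if (!headers_sent()) {
        header('Content-Encoding: gzip');
    }
    return gzencode($buffer);
}
```

It does the job, but the document is scanned twice whenever GZ is on.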

So I thought about it. I'm no genius, and that's probably why it isn't really
working as expected, but here's the idea: make the
replacement directly with preg_replace_callback() which, as its name implies,
calls a function back too. The only argument passed to that callback
function is an array of matches, with the first value holding the whole
matched pattern and the rest holding each parenthesized group. I have only one
group, which can only be 1 or 0 (GZ activated or not), in the
second value of the array.
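
To show what that array of matches looks like, here is a small standalone sketch. The marker pattern is an assumption, and it uses a closure (PHP 5.3 or later) instead of create_function() only to keep the example short:

```php
<?php
// Keep a copy of what the callback receives, to inspect it afterwards.
$seen = array();
$page = 'Compression: $gz-stats=1$';

$out = preg_replace_callback(
    '/\$gz-stats=([01])\$/',          // assumed marker pattern, one group
    function ($status) use (&$seen) {
        $seen = $status;              // array('$gz-stats=1', '1')
        return $status[1] ? 'on' : 'off';
    },
    $page
);

echo $out, "\n"; // Compression: on
```

Note that the group arrives as the string '1' or '0'; PHP treats '0' as false, which is why a plain if( $status[1] ) works.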

I decided to make a lambda callback function for the replacement (or  
'anonymous function') with create_function(). This permits me to get the  
buffer sizes (original and compressed) from outside the replace callback  
function (as I can NOT pass them as parameters). The other advantage I  
thought this system would give me was that I could set an external flag  
($gz_activated) from within the lambda callback in order to ACTUALLY  
encode the doc AFTER having replaced $gz-stats=1$ by the compression  
stats (which is a simple rate in %, by the way).

Why so many complications? I don't want to use globals. I know you
thought of it ;). Portability purposes only.

This stuff seems to work great. As you can see, I put some lines of
the lambda callback inside comments, so that the flag is set and the lengths
are computed outside the
string. Then I return the stats (which, by the way, hover around 80%, GZ
rocks!) to preg_replace_callback(), which replaces
$gz-stats=1$ with them. Once done, the result is stored in $buffer
(which is thereby updated). The flag is set, so I can now send the HTTP
header to tell the browser we're gonna send some encoded stuff, and then
actually encode it.

The $buffer, now modified and encoded, is eventually returned, the  
output buffer is flushed and here we go.

\o/. Or not..
Now I try to set $gz-stats=0$. The stats aren't displayed, as expected,
but after sniffing the headers, I found the content was still
GZ-encoded. For whatever reason, $gz-stats$ has been set to 0 by the
main script, so we don't want encoding here.

The reason?
Apparently, PHP parses all the lambda function code twice. First to  
'decode' it (don't forget it's just a string), and then to execute it  
properly. Well, it's my guess anyway.
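
That guess can be checked without create_function() at all. In the small sketch below (variable names borrowed from gz_cmp(), the rest made up), the parenthesized assignment runs while the code string is being concatenated, before any lambda exists and whatever the marker says:

```php
<?php
$gz_activated = false;

// Building the string already evaluates ($gz_activated = true):
$code = 'if( $status[1] ) {
    # ' . ($gz_activated = true) . '
    return "stats";
}';

// No lambda has been created or called yet, but the flag is set.
var_dump($gz_activated); // bool(true)
echo $code, "\n";        // the comment line now reads "# 1"
```

So it looks less like a double parse and more like plain string concatenation: the flag and the sizes are computed once, unconditionally, every time gz_cmp() runs.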

As you can imagine, it's impossible (at least I think so) to pass custom
parameters to the OB callback, so I found myself stuck. I'm now
asking you: can you think of any solution to
- get the flag out of the lambda func ONLY when expected, OR
- get the GZ Availability value into the OB callback by any other way..

..knowing that I can't bear with globals, and that I'd already forget to  
reload the page after storing the value in any dead mem, if I were you ^^'.
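
One direction that might cover the first option, sketched under a few assumptions: PHP 5.3 or later (for closures), the zlib extension, and the same assumed marker pattern. The name gz_cmp_closure() is made up. The closure captures the flag by reference, so it is set only when the marker really says 1, with a single regexp pass and no globals:

```php
<?php
// One-pass variant of gz_cmp() using a closure instead of create_function().
function gz_cmp_closure($buffer) {
    $gz_activated = false;
    $s_size = strlen($buffer);

    $buffer = preg_replace_callback(
        '/\$gz-stats=([01])\$/',       // assumed marker pattern
        function ($status) use (&$gz_activated, $buffer, $s_size) {
            if ($status[1] !== '1') {
                return '';             // marker set to 0: just drop it
            }
            $gz_activated = true;      // runs only when the marker says 1
            $c_size = strlen(gzencode($buffer, 9));
            return round(100 - $c_size / $s_size * 100, 1) . '%';
        },
        $buffer
    );

    if ($gz_activated) {
        if (!headers_sent()) {         // guard so the sketch also runs from the CLI
            header('Content-Encoding: gzip');
        }
        $buffer = gzencode($buffer);
    }

    return $buffer;
}
```

It would be registered with ob_start('gz_cmp_closure'). The stats are still computed on the buffer as it was before the replacement, as in the original function.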

Is that some kind of challenge, or am I just blind? I think I got into
something that may be above my level -.-'

Thanks for everything!


PS If you have a totally different solution, I wouldn't mind throwing
all of this away; it always hurts, but I think I've got used to it. =P.
