Posted by Dave Saville on June 12, 2008, 5:12 am
>
> EP> *SKIP*
>
> >> Sorry to jump in with another question but I have a very similar
> >> problem. I am processing a consolidated apache2 logfile. I have
> >> multiple virtual hosts. All I care about are the site, the page
> >> served, a counter and the date.
>
> EP> Piece of advice. The next time you'll would like to I<jump in> consider
> EP> stoling the thread. Otherwise your question can be left unanswered.
> EP> Because it wasn't seen.
>
True, but the problem looked so similar.
> EP> *SKIP*
> >> I have tried various ideas I found by google but they all tend to be
> >> similar to this
>
> EP> Forget B<google>, use B<perldoc> instead.
>
> >> sub by_count
> >> {
> >> $urls{$site}{$a}[0] <=> $urls{$site}{$b}[0] or $a cmp $b;
> >> }
>
> EP> Hopefully Uri won't see that. I<$a> and I<$b> are special. However
> EP> only in context of B<sort>.
>
> i did. my eyes are bleeding!
>
Sorry, I don't understand what you are getting at, apart from an
in-joke.
> >> I would be grateful for any pointers.
>
> EP> If I guessed your problem the right way, then:
>
> EP> print "$site $_ $urls{$site}{$_}[0] $urls{$site}{$_}[1]\n"
> EP> foreach(
> EP> sort { $urls{$site}{$a}[0] <=> $urls{$site}{$b}[0]; }
> EP> sort { $urls{$site}{$a}[1] <=> $urls{$site}{$b}[1]; }
> EP> keys %{$urls{$site}});
>
> are you (or the OP) trying to do a multilevel sort? it looks like yours
> will work but it is unusual to do two sort passes. and it relies on the
> sort to be stable (meaning equal keys stay in the same ordering post
> sort). perl now uses a stable sort but earlier versions didn't. it is
> not something you should depend upon.
<snip>
No, not multi-level here, just two ways of presenting the data
depending on which $site it came from. Thanks for the help guys, but
they are only variations on what I had tried with no luck. However, I
have discovered that here (OS/2) there is a bug in perl (5.8.2). I
don't know yet if it is a bug in perl or the port. I suspect the
latter, but
foreach my $url (sort {$urls{$site}{$a}[0] <=> $urls{$site}{$b}[0] }
keys %{$urls{$site}})
and
foreach my $url (sort by_value keys %{$urls{$site}})
sub by_value
{
$urls{$site}{$a}[0] <=> $urls{$site}{$b}[0];
}
give different results. The first works correctly and the second, for
some reason yet to be determined, gets the *wrong* value of $site. I
stuck a print $site in the subroutine. That is where all the errors
came from: it was trying to compare site A's urls against site B's.
No wonder there were a lot of errors :-)
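For anyone hitting the same thing, here is a minimal sketch of one way to keep a reusable comparison routine from depending on whichever $site it happens to see: hand it the site's data explicitly and let it return an anonymous sub. It assumes the data is a hash of sites, each mapping a page to [count, date]. The names make_by_count and sorted_urls and the sample data are made up for illustration, and I am not claiming this explains the behaviour on OS/2.

```perl
use strict;
use warnings;

# Assumed layout (hypothetical sample data): $urls{site}{page} = [ count, date ]
my %urls = (
    'a.example' => { '/index.html' => [ 12, '2008/06/10' ],
                     '/about.html' => [  3, '2008/06/11' ] },
    'b.example' => { '/index.html' => [  1, '2008/06/09' ],
                     '/news.html'  => [ 40, '2008/06/12' ] },
);

# Hypothetical helper: returns a comparator bound to ONE site's pages,
# so the comparison never looks at an outer $site variable.
sub make_by_count {
    my ($pages) = @_;
    return sub { $pages->{$a}[0] <=> $pages->{$b}[0] };
}

# Hypothetical helper: pages of one site, lowest count first.
sub sorted_urls {
    my ($site) = @_;
    my $by_count = make_by_count( $urls{$site} );
    return sort $by_count keys %{ $urls{$site} };
}

foreach my $site ( sort keys %urls ) {
    print "$site $_ $urls{$site}{$_}[0]\n" for sorted_urls($site);
}
```

sort accepts a scalar holding a code reference in place of a sub name, so the call site stays as short as "sort by_value keys ...". Note that $a and $b are package variables, so this only works as written when the comparator is compiled in the same package as the sort call.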
I was going to run my test case on my Solaris box but the darn thing
decided to trash its hard drive :-(
Oh, and the date is text and sortable - YYYY/MM/DD.
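Because the date is zero-padded YYYY/MM/DD text, a plain string comparison (cmp, not <=>) puts it in chronological order. A minimal sketch, assuming each page maps to [count, date]; the sample data and the name by_count_then_date are made up for illustration. It also shows both keys handled in a single sort pass, which does not rely on the sort being stable.

```perl
use strict;
use warnings;

# Hypothetical sample data: page => [ count, date as YYYY/MM/DD text ]
my %pages = (
    '/index.html' => [ 12, '2008/06/10' ],
    '/about.html' => [  3, '2008/06/11' ],
    '/news.html'  => [ 12, '2008/06/02' ],
);

# Hypothetical helper: highest count first, ties broken by oldest date.
sub by_count_then_date {
    my ($p) = @_;
    return sort {
        $p->{$b}[0] <=> $p->{$a}[0]    # numeric compare on the counter
            or
        $p->{$a}[1] cmp $p->{$b}[1]    # string compare on the date
    } keys %$p;
}

print "$_ @{ $pages{$_} }\n" for by_count_then_date( \%pages );
```

Using <=> on the date strings would compare only their leading numeric part (the year) and emit warnings under "use warnings", so cmp is the safer choice for this format.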
--
Regards
Dave Saville
NB Remove nospam. for good email address