[RFC] Log::Dispatch::FileX

Do you have a question? Post it now! No Registration Necessary.  Now with pictures!

Threaded View


This is a file output handler for Log::Dispatch that is more suitable for use in
multitasking environments.

It contains features I miss in both Log::Dispatch::File (doesn't do locking) and
Log::Dispatch::File::Locked (locks persist
for as long as the file is open). In lack a naming fantasy, I temporarily called
it Log::Dispatch::FileX.
This class is almost compatible with Log::Dispatch::File. The most important new
features are these:

It can perform locking that is only active during writes (1st a non-blocking
flock is attempted and if that fails, then a
blocking flock in a short timeout wrapper).

It has a close_after_modperl_request option as an alternative to leaving a file
open forever (= can't be rolled-over) or
closing it after each write (= slow).

Below is the POD documentation. An installable source package is available here:

As this is a RFC, I'ld appreciate some feedback on possible naming (of the
module that is) and opinions because I don't want
to spam CPAN with redundent stuff. I've emailed the Log::Dispatch author too and
I'm waiting for a reply.

-Craig Manley

Module Documentation

     Log::Dispatch::FileX - Object for logging to file.

       use Log::Dispatch::FileX;

       my $output = Log::Dispatch::FileX->new(
                                              name      => 'test',
                                              min_level => 'info',
                                              filename  => 'logfile.txt',

       $output->log( level => 'emerg', message => "Time to die." );

     This module provides an object for logging to files under the
     Log::Dispatch::* system. Log messages are written using the flock file
     locking mechanism per default on a per write basis which means this
     module is suitable for sharing a log file in a multitasking environment.

     * new(%p)
         This method takes a hash of parameters. The following options are

         * name ($)
                 The name of the object (not the filename!). Required.

         * min_level ($)
                 The minimum logging level this object will accept. See the
                 Log::Dispatch documentation on Log Levels for more
                 information. Required.

         * max_level ($)
                 The maximum logging level this obejct will accept. See the
                 Log::Dispatch documentation on Log Levels for more
                 information. This is not required. By default the maximum is
                 the highest possible level (which means functionally that
                 the object has no maximum).

         * filename ($)
                 The filename to be opened for appending.

         * mode ($)
                 The mode the file should be opened with. Valid options are
                 '>' (write) and '>>' (append). The default is '>>' (append).

         * perms ($)
                 If the file does not already exist, the permissions that it
                 should be created with. Optional. The argument passed must
                 be a valid octal value, such as 0600. It is affected by the
                 current or given umask.

         * umask ($)
                 The optional umask to use when the file is created for the
                 first time.

         * flock ($)
                 Whether or not log writes should be wrapped in a flock.
                 Defaults to true. If true, then for each logged message, a
                 non-blocking flock is attempted first, and if that fails,
                 then a blocking flock is attemped with a timeout.

         * close_after_write ($)
                 Whether or not the file should be closed after each write.
                 This defaults to false. If set to true, then the mode will
                 aways be append, so that the file is not re-written for each
                 new message.

                 Note: opening and closing a file for each write is a
                 relatively slow process (especially on windoze systems) as
                 demonstrated in the performance benchmarks.

         * close_after_modperl_request ($)
                 Only applicable for code running in a mod_perl (1 or 2)
                 environment and defaults to false. Set this to true if the
                 file should be closed after each mod_perl request which is
                 useful if you're using a persistent Log::Dispatch object and
                 intend to periodically roll your log files without having to
                 restart your web server each time.

         * autoflush ($)
                 Whether or not the file should be autoflushed. This defaults
                 to true. If flock is true, then flushing always occurs no
                 matter what this is set to.

         * callbacks( \& or [ \&, \&, ... ] )
                 This parameter may be a single subroutine reference or an
                 array reference of subroutine references. These callbacks
                 will be called in the order they are given and passed a hash
                 containing the following keys:

                  ( message => $log_message, level => $log_level )

                 The callbacks are expected to modify the message and then
                 return a single scalar containing that modified message.
                 These callbacks will be called when either the "log" or
                 "log_to" methods are called and will only be applied to a
                 given message once.

     * log_message( message => $ )
         Sends a message to the appropriate output. Generally this shouldn't
         be called directly but should be called through the "log()" method
         (in Log::Dispatch::Output).

     FreeBSD 6.1 with a single Intel(R) Xeon(TM) CPU 3.60GHz
          Measuring 10000 logs of using defaults...
                  Log::Dispatch::FileX... 0.739 seconds   (avg 0.00007)
                  Log::Dispatch::File...  0.622 seconds   (avg 0.00006)
          Measuring 10000 logs of using autoflush=0, flock=0...
                  Log::Dispatch::FileX... 0.575 seconds   (avg 0.00006)
                  Log::Dispatch::File...  0.574 seconds   (avg 0.00006)
          Measuring 10000 logs of using autoflush=1, flock=0...
                  Log::Dispatch::FileX... 0.618 seconds   (avg 0.00006)
                  Log::Dispatch::File...  0.623 seconds   (avg 0.00006)
          Measuring 10000 logs of using flock=1...
                  Log::Dispatch::FileX... 0.739 seconds   (avg 0.00007)

          Measuring 10000 logs of using close_after_write=1, flock=0...
                  Log::Dispatch::FileX... 1.080 seconds   (avg 0.00011)
                  Log::Dispatch::File...  1.035 seconds   (avg 0.00010)
          Measuring 10000 logs of using close_after_modperl_request=1, flock=1...
                  Log::Dispatch::FileX... 0.768 seconds  (avg 0.00008)

     Windoze XP with a Pentium CPU 3.0GHz
          Measuring 10000 logs of using defaults...
                  Log::Dispatch::FileX... 1.235 seconds   (avg 0.00012)
                  Log::Dispatch::File...  1.047 seconds   (avg 0.00010)
          Measuring 10000 logs of using autoflush=0, flock=0...
                  Log::Dispatch::FileX... 0.875 seconds   (avg 0.00009)
                  Log::Dispatch::File...  0.907 seconds   (avg 0.00009)
          Measuring 10000 logs of using autoflush=1, flock=0...
                  Log::Dispatch::FileX... 1.063 seconds   (avg 0.00011)
                  Log::Dispatch::File...  1.047 seconds   (avg 0.00010)
          Measuring 10000 logs of using flock=1...
                  Log::Dispatch::FileX... 1.251 seconds   (avg 0.00013)

          Measuring 10000 logs of using close_after_write=1, flock=0...
                  Log::Dispatch::FileX... 74.128 seconds  (avg 0.00741)
                  Log::Dispatch::File...  79.660 seconds  (avg 0.00797)

         Note how rediculously slow Windoze is when close_after_write=1 is


     Craig Manley

     Copyright (C) 2007 Craig Manley This library is free software; you can
     redistribute it and/or modify it under the same terms as Perl itself.

     Dave Rolsky, author of the Log::Dispatch suite including
     Log::Dispatch::File on which this module is based.

Sending output to Spreadsheet

Hi, New to Perl, using ActiveState 5.8, Win XP

I'm not sure if this is the best ng...should misc be my first
port of call?

I am trying to adapt Brent Hughes rget-links.pl original code by
collecting discovered web links into a spreadsheet for later use.
BTW, Any errors are due to me, not Brent!

The code runs OK and prints out the web links found into the command window.

I have looked at the Spreadsheet::SimpleExcel module but I cannot work
out the syntax to get the accumulated links into my_List.xls file.

Any suggestions will be appreciated!
Cheers, Peter


use warnings;
use strict;

package RGetLinks;

use LWP::UserAgent;
use HTML::LinkExtor;
use URI::URL;
use Getopt::Long;
use Spreadsheet::SimpleExcel;

$| = 1;

# global data for this program
my $depth;
my %files;

# command line options
my $opt_depth = 4;

# retrieve command line options
my $options = GetOptions ("depth=i" => $opt_depth);  # numeric

    my $url = 'http://somesite/';

# abort if the options are improperly formatted
if(!defined $url){ usage(); }

# program enters actual processing at this point

# create a new instance of Excel
my $excel = Spreadsheet::SimpleExcel->new();
# add worksheet
$excel->add_worksheet('Sheet1',{-headers => \@header, -data => \@data});
# print result into a file and handle error
$excel->output_to_file('c:/Documents and Settings/my_List.xls') or die

#  Subroutines

# A routine to get links recursively
sub rgetlinks
    my($url,$maxdepth) = @_;

    # initialize globals
    $depth = 0;
    %files = ();

    # descend

# A helper routine to get links recursively
sub rgetlinkshelper
    my($url,$maxdepth) = @_;

    # return if too deep or already been here
    if($depth >= $maxdepth || defined $files)
        # drop down a level and add the file to the hash
        $depth++; $files = 1;

        # show our current location        
        foreach(1..$depth) {print ' ';}
        print $url, "\n";

        # retrieve all links
        my @links = getlinks($url);

        # recursive step
        foreach(@links){ rgetlinkshelper($_,$maxdepth); }

        # pop up a level # line 101

# A routine to return links from a URL
# Only retrieve links from text/html files.

my @links = ();

sub getlinks
    my($url) = @_;  # for instance
    my $ua = new LWP::UserAgent;
    # Make the parser.  Unfortunately, we don't know the base yet
    # (it might be diffent from $url)
    @links = ();
    my $p = HTML::LinkExtor->new(\&callback);

    # Look at the header to determine what type of document we have
    my $headreq = HTTP::Request->new(HEAD => $url);
    my $headres = $ua->request($headreq);
    my $type    = $headres->header('content-type');

    # only parse the document for links if it is a text or html document
    if(defined $type && $type =~ /text|html/)
        # Request document and parse it as it arrives
        my $getreq = HTTP::Request->new(GET => $url);
        my $getres = $ua->request($getreq, sub{ $p->parse($_[0])});

        # Expand all URLs to absolute ones
        my $base = $getres->base;
        @links = map { $_ = url($_, $base)->abs; } @links;
    # Return the links
    return @links;

# Set up a callback that collects links
sub callback {
    my($tag, %attr) = @_;

    return if $tag ne 'a';  # we only look closer at <a ...>
    push(@links, values %attr);

# A routine to provide instructions
sub usage
    # strip the progname with a regex
    my $progname = $0;
    $progname =~ s/(.*\|.*\/)(.*)/$2/g;

    # show instructions
    print   "\nUsage:\n\t\t",
        $progname, " [args] target-url > output-file\n\n",
        $progname, " --depth=4 http://www.perl.org \n\n"; # depth=3

    print   "Options\n", "=======\n",
        "The maximum depth of links to traverse (default = 3)\n";


Re: Sending output to Spreadsheet

dysgraphia wrote:
Quoted text here. Click to load it
Apologies!...this post got stuck in the wrong thread...should be a new one

Re: [RFC] Log::Dispatch::FileX

Quoted text here. Click to load it

To avoid creating yet another perl module, I'd rather see this
functionality added to Log::Dispatch::File::Locked as an option.

Could you check with Dave if your need fits with his view of
Log::Dispatch::File::Locked ?


Dominique Dumont
"Delivering successful solutions requires giving people what they
need, not what they want." Kurt Bittner

Site Timeline