
FAQ 5.3 How do I count the number of lines in a file?

Newsgroup: comp.lang.perl.misc
Subject: FAQ 5.3 How do I count the number of lines in a file?
Author: PerlFAQ Server
Date: 03-05-2008
Posted by sheinrich on March 8, 2008, 7:46 am
>
> --------------------------------------------------------------------
>
> 5.3: How do I count the number of lines in a file?
>
> One fairly efficient way is to count newlines in the file. The following
> program uses a feature of tr///, as documented in perlop. If your text
> file doesn't end with a newline, then it's not really a proper text
> file, so this may report one fewer line than you expect.
>
>     $lines = 0;
>     open(FILE, $filename) or die "Can't open `$filename': $!";
>     while (sysread FILE, $buffer, 4096) {
>         $lines += ($buffer =~ tr/\n//);
>     }
>     close FILE;
>
> This assumes no funny games with newline translations.
>
> --------------------------------------------------------------------
>
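The FAQ's caveat about a missing final newline can be handled with a small variation. The following is only a sketch, not the FAQ's own code: count_lines is a hypothetical helper name, and the lexical filehandle, the three-argument open and the '<:raw' mode are my assumptions about how one might write it today.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the FAQ): count newlines in fixed-size
# chunks, then add one if the file's last character is not a newline.
sub count_lines {
    my ($filename) = @_;

    # '<:raw' asks for a binary handle, so no newline translation is wanted.
    open(my $fh, '<:raw', $filename)
        or die "Can't open '$filename': $!";

    my $lines     = 0;
    my $last_char = "\n";    # an empty file counts as zero lines
    my $buffer;
    while (sysread($fh, $buffer, 4096)) {
        $lines += ($buffer =~ tr/\n//);    # count only, buffer untouched
        $last_char = substr($buffer, -1);  # remember how this chunk ended
    }
    close $fh;

    # An unterminated final line is still counted as a line here.
    $lines++ if $last_char ne "\n";
    return $lines;
}

# Usage: perl count_lines.pl somefile.txt
print count_lines($ARGV[0]), "\n" if @ARGV;
```

Whether an unterminated last line should count is a matter of taste; drop the final increment to get the FAQ's behaviour back.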
Might it make a difference, performance-wise, if I write
$lines += ($buffer =~ tr/\n/\n/);
instead?
I often find myself wondering whether a substitution whose replacement
is the same size as the match could be faster than one that changes the
length. I think this might avoid costly block shifting on large data.

Also, tr/// (and s///) could perhaps be optimized by switching to a
'count mode' when the pattern and the replacement are both hardcoded
and identical.
Any insights?

steffen

Posted by Ben Morrow on March 8, 2008, 8:54 am

Quoth sheinrich@my-deja.com:
> Might it make a difference, performance-wise, if I write
> $lines += ($buffer =~ tr/\n/\n/);
> instead?

Nope. That does exactly the same thing.

> I often find myself wondering whether a substitution whose replacement
> is the same size as the match could be faster than one that changes the
> length. I think this might avoid costly block shifting on large data.
>
> Also, tr/// (and s///) could perhaps be optimized by switching to a
> 'count mode' when the pattern and the replacement are both hardcoded
> and identical.

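tr/// with an empty replacement list and without /d, and m// in scalar
context (with no capturing parens), are the operators you are looking
for. Neither replaces anything: tr/// returns the number of matching
characters, while m// only reports whether the pattern matched, so use
m//g in a loop or in list context to count matches.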
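A quick way to check this for yourself is sketched below. The helper name newline_counts is made up for illustration; it returns the count from both tr forms, the count from a list-context m//g, the true/false result of a plain m//, and a flag saying whether the string survived unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: compare the counting idioms on one string.
sub newline_counts {
    my ($buffer) = @_;
    my $before   = $buffer;                   # copy, to detect modification

    my $tr_empty = ($buffer =~ tr/\n//);      # empty replacement list
    my $tr_same  = ($buffer =~ tr/\n/\n/);    # replacement identical to search
    my $matches  = () = $buffer =~ /\n/g;     # m//g in list context, counted
    my $found    = ($buffer =~ /\n/) ? 1 : 0; # plain m//: true or false only

    my $unchanged = ($buffer eq $before) ? 1 : 0;
    return ($tr_empty, $tr_same, $matches, $found, $unchanged);
}

my @r = newline_counts("one\ntwo\nthree\n");
print "@r\n";    # prints "3 3 3 1 1"
```

For single characters tr/// is the natural choice; the m//g form only becomes necessary when the thing being counted is a pattern.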

Ben


Posted by sheinrich on March 8, 2008, 11:21 am

Indeed, I never realized that.

perlfaq4 also has a suggestion for counting multi-character
patterns:
<cite>
How can I count the number of occurrences of a substring within a
string?

There are a number of ways, with varying efficiency. If you want a count
of a certain single character (X) within a string, you can use the
"tr///" function like so:

         $string = "ThisXlineXhasXsomeXx'sXinXit";
         $count = ($string =~ tr/X//);
         print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if
you are trying to count multiple character substrings within a larger
string, "tr///" won't work. What you can do is wrap a while() loop
around a global pattern match. For example, let's count negative
integers:

         $string = "-9 55 48 -2 23 -76 4 14 -44";
         while ($string =~ /-\d+/g) { $count++ }
         print "There are $count negative numbers in the string";

Another version uses a global match in list context, then assigns the
result to a scalar, producing a count of the number of matches.

         $count = () = $string =~ /-\d+/g;
</cite>
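The same list-context idiom should also work for a literal substring held in a variable, as long as it is quoted with \Q...\E. This is a sketch under that assumption; count_substring is a hypothetical name, not something from perlfaq4.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping occurrences of a literal
# substring using a global match in list context.
sub count_substring {
    my ($string, $needle) = @_;

    # An empty pattern has a special meaning in Perl (it reuses the last
    # successfully matched pattern), so handle the empty needle explicitly.
    return 0 if $needle eq '';

    # \Q...\E quotes regex metacharacters so the needle matches literally.
    my $count = () = $string =~ /\Q$needle\E/g;
    return $count;
}

print count_substring("abcabcab", "abc"), "\n";    # prints 2
```

Note that matches do not overlap: "aa" is found once in "aaa", not twice.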

Thank you,

Steffen
