Posted by sheinrich on March 8, 2008, 11:21 am
> Quoth sheinr...@my-deja.com:
>
> > > --------------------------------------------------------------------
>
> > > 5.3: How do I count the number of lines in a file?
>
> > > One fairly efficient way is to count newlines in the file. The following
> > > program uses a feature of tr///, as documented in perlop. If your text
> > > file doesn't end with a newline, then it's not really a proper text
> > > file, so this may report one fewer line than you expect.
>
> > >     $lines = 0;
> > >     open(FILE, $filename) or die "Can't open `$filename': $!";
> > >     while (sysread FILE, $buffer, 4096) {
> > >         $lines += ($buffer =~ tr/\n//);
> > >     }
> > >     close FILE;
>
> > > This assumes no funny games with newline translations.
>
> > > --------------------------------------------------------------------
>
> > Might it make a difference to performance if I write
> >     $lines += ($buffer =~ tr/\n/\n/);
> > instead?
>
> Nope. That does exactly the same thing.
>
> > Every so often I find myself wondering whether a substitution whose
> > replacement is the same size as the match could perform better than
> > others. I think it might avoid costly block shifting on large data.
>
> > Also, tr/// (and s///) could possibly be optimized by switching to a
> > 'count mode' when it is detected that the pattern and the replacement
> > are both hardcoded and identical.
>
> tr/// with an empty replacement and without /d, and m// in scalar context
> (with no capturing parens), are the operators you are looking for. Both
> just count, and don't replace anything.
>
> Ben
Indeed, I never realized that.
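To convince myself, here is a minimal sketch of the tr/// case. The helper name count_newlines and the sample buffer are my own, not from the FAQ. It only checks that both spellings return the same count and leave the string untouched; it says nothing about speed.

```perl
use strict;
use warnings;

# Hypothetical helper: count newlines with tr/// and an empty replacement list.
sub count_newlines {
    my ($text) = @_;
    # With an empty replacement list and no /d, tr/// returns the number of
    # matched characters and leaves $text as it was.
    return ($text =~ tr/\n//);
}

my $buffer = "one\ntwo\nthree\n";   # sample data, three newlines
my $copy   = $buffer;

my $n_empty = ($buffer =~ tr/\n//);     # empty replacement list
my $n_same  = ($buffer =~ tr/\n/\n/);   # replacement identical to search list

print "empty: $n_empty, same: $n_same\n";        # prints "empty: 3, same: 3"
print "buffer unchanged\n" if $buffer eq $copy;  # prints "buffer unchanged"
```
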
perlfaq4 also has a suggestion for counting multi-character
patterns:
<cite>
How can I count the number of occurrences of a substring within a
string?

There are a number of ways, with varying efficiency. If you want a
count of a certain single character (X) within a string, you can use
the "tr///" function like so:

    $string = "ThisXlineXhasXsomeXx'sXinXit";
    $count = ($string =~ tr/X//);
    print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However,
if you are trying to count multiple character substrings within a
larger string, "tr///" won't work. What you can do is wrap a while()
loop around a global pattern match. For example, let's count negative
integers:

    $string = "-9 55 48 -2 23 -76 4 14 -44";
    while ($string =~ /-\d+/g) { $count++ }
    print "There are $count negative numbers in the string";

Another version uses a global match in list context, then assigns the
result to a scalar, producing a count of the number of matches.

    $count = () = $string =~ /-\d+/g;
</cite>
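For anyone who wants to run these under strict and warnings, here is a rough sketch that wraps the three approaches in subs. The sub names (count_char_x, count_loop, count_list) are hypothetical, mine rather than perlfaq4's. The only change from the quoted snippets is that each count is a lexical starting at 0, so a string with no matches yields 0 instead of an undefined $count.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the three perlfaq4 approaches.

# Single character: tr/// with an empty replacement list just counts.
sub count_char_x {
    my ($s) = @_;
    return ($s =~ tr/X//);
}

# Multi-character pattern: step through the matches with a scalar //g loop.
sub count_loop {
    my ($s) = @_;
    my $count = 0;                 # start at 0 so "no match" is not undef
    $count++ while $s =~ /-\d+/g;
    return $count;
}

# Same pattern: //g in list context, then the list assignment is
# evaluated in scalar context, which gives the number of matches.
sub count_list {
    my ($s) = @_;
    my $count = () = $s =~ /-\d+/g;
    return $count;
}

my $numbers = "-9 55 48 -2 23 -76 4 14 -44";
print count_char_x("ThisXlineXhasXsomeXx'sXinXit"), "\n";  # prints 6
print count_loop($numbers), "\n";                          # prints 4
print count_list($numbers), "\n";                          # prints 4
```
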
Thank you,
Steffen