
FAQ 5.3 How do I count the number of lines in a file?

Posted by PerlFAQ Server on March 5, 2008, 9:03 pm
This is an excerpt from the latest version of perlfaq5.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

5.3: How do I count the number of lines in a file?


One fairly efficient way is to count newlines in the file. The following
program uses a feature of tr///, as documented in perlop. If your text
file doesn't end with a newline, then it's not really a proper text
file, so this may report one fewer line than you expect.

my $lines = 0;
open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
my $buffer;
# Read the file in 4096-byte chunks and count the newlines in each chunk.
while (sysread $fh, $buffer, 4096) {
    $lines += ($buffer =~ tr/\n//);
}
close $fh;

This assumes no funny games with newline translations.
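
The caveat about a missing final newline can be checked with a small sketch. The subs count_newlines and write_temp and the temporary files are assumptions made for illustration; they are not part of the FAQ text.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: the chunked tr/// loop wrapped in a sub.
sub count_newlines {
    my ($filename) = @_;
    my $lines = 0;
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    binmode $fh;                        # raw bytes, no newline translation
    my $buffer;
    while (sysread $fh, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//); # tr/// returns the number of matches
    }
    close $fh;
    return $lines;
}

# Hypothetical helper: write $content to a temporary file, return its name.
sub write_temp {
    my ($content) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} $content;
    close $fh or die "Can't close '$name': $!";
    return $name;
}

print count_newlines(write_temp("one\ntwo\n")), "\n";  # prints 2
print count_newlines(write_temp("one\ntwo")), "\n";    # prints 1: last line unterminated
```

If the unterminated last line should count too, the caller would have to check whether the final byte read was a newline and add one when it was not.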



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.

Posted by jm on March 6, 2008, 5:11 pm
PerlFAQ Server wrote:

> 5.3: How do I count the number of lines in a file?
>
>
> One fairly efficient way is to count newlines in the file. The following
> program uses a feature of tr///, as documented in perlop. If your text
> file doesn't end with a newline, then it's not really a proper text
> file, so this may report one fewer line than you expect.
>
> $lines = 0;
> open(FILE, $filename) or die "Can't open `$filename': $!";
> while (sysread FILE, $buffer, 4096) {
> $lines += ($buffer =~ tr/\n//);
> }
> close FILE;
>
> This assumes no funny games with newline translations.


How does this code handle other Unicode newline encodings, such as 0x85
or 0x0d 0x0a?

Posted by brian d foy on March 7, 2008, 12:11 pm

> PerlFAQ Server wrote:
>
> > 5.3: How do I count the number of lines in a file?

> How does this code handle other Unicode newline encodings, such as 0x85
> or 0x0d 0x0a?

If you have a different idea of the human concept of "line", you'll
have to adjust the code to have the right line ending (and probably not
use tr///).

Posted by jm on March 7, 2008, 6:01 pm
brian d foy wrote:
>
>> PerlFAQ Server wrote:
>>
>>> 5.3: How do I count the number of lines in a file?
>
>> How does this code handle other Unicode newline encodings, such as 0x85
>> or 0x0d 0x0a?
>
> If you have a different idea of the human concept of "line", you'll
> have to adjust the code to have the right line ending (and probably not
> use tr///).

I do not remember where I read it, but the peculiar thing about
computer standards is that there are so many of them.


I assume the following can be used:

«A newline sequence is defined to be any of the following:

\u000A | \u000B | \u000C | \u000D | \u0085 | \u2028 | \u2029 |
\u000D\u000A »

This regular expression comes from:

http://www.unicode.org/unicode/reports/tr18/
Unicode Technical Standard #18
Unicode Regular Expressions

http://unicode.org/reports/tr13/tr13-9.html
Unicode Standard Annex #13
Unicode Newline Guidelines




However, this is not always true, because in some encodings (cp850, for
instance) 0x85 is a plain character.
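
As a rough sketch of how that newline-sequence definition could be applied in Perl: the sub name count_newline_sequences is hypothetical, and the code assumes the text has already been decoded into Perl characters, which is exactly where the encoding question above matters.

```perl
use strict;
use warnings;

# Hypothetical helper: count newline sequences in an already-decoded string.
# CR LF is tried first so that the pair counts as one sequence, not two.
sub count_newline_sequences {
    my ($text) = @_;
    my $count = () = $text
        =~ /\x{0D}\x{0A}|[\x{0A}\x{0B}\x{0C}\x{0D}\x{85}\x{2028}\x{2029}]/g;
    return $count;
}

# CR LF, NEL (U+0085), LINE SEPARATOR (U+2028) and LF: four sequences.
print count_newline_sequences("a\x{0D}\x{0A}b\x{85}c\x{2028}d\x{0A}"), "\n";  # prints 4
```

In Perl 5.10 and later, the \R escape in a regular expression should match the same set of sequences, so the pattern above could probably be shortened to /\R/g.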



PS: If I understand correctly, there is also a report that describes how
to break lines automatically at suitable opportunities when displaying
text, but that is another issue.
http://www.unicode.org/reports/tr14/
Unicode Standard Annex #14
Line Breaking Properties

Posted by Peter J. Holzer on March 9, 2008, 9:28 am
> brian d foy wrote:
>>> PerlFAQ Server wrote:
>>>
>>>> 5.3: How do I count the number of lines in a file?
>>
>>> How does this code handle other Unicode newline encodings, such as 0x85
>>> or 0x0d 0x0a?
>>
>> If you have a different idea of the human concept of "line", you'll
>> have to adjust the code to have the right line ending (and probably not
>> use tr///).

For perl, a newline is "\n". Conversion to and from some file encoding
should be done with the appropriate IO layer.
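
A minimal sketch of that approach, assuming a UTF-8 file with CR LF line endings: the sub slurp_with_layers, the layer string and the sample file are made up for illustration. The layers decode the bytes and turn each CR LF pair into "\n", so counting "\n" afterwards is enough.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small UTF-8 file with CR LF line endings as raw bytes.
my ($out, $name) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "caf\xC3\xA9\x0D\x0Abar\x0D\x0A";
close $out or die "Can't close '$name': $!";

# Hypothetical helper: read the whole file through the given IO layers.
sub slurp_with_layers {
    my ($filename, $layers) = @_;
    open(my $fh, "<$layers", $filename) or die "Can't open '$filename': $!";
    local $/;                           # slurp mode: read everything at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

my $text     = slurp_with_layers($name, ':raw:encoding(UTF-8):crlf');
my $newlines = ($text =~ tr/\n//);      # line endings as perl sees them
my $crs      = ($text =~ tr/\r//);      # carriage returns left after the layers
print "$newlines newlines, $crs carriage returns\n";
```

For a whole file it is better to read line by line than to slurp; slurping only keeps the sketch short.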


> I do not remember where I read it, but the peculiar thing about
> computer standards is that there are so many of them.
>
>
> I assume the following can be used:
>
> «A newline sequence is defined to be any of the following:
>
> \u000A | \u000B | \u000C | \u000D | \u0085 | \u2028 | \u2029 |
> \u000D\u000A »

If all of these were just "newline sequences", they should be turned
into "\n" by the crlf conversion of the encoding(utf-*) layers. But they
aren't. For example, \u000C signifies not just a new line, but a new page.


> However, this is not always true, because in some encodings (cp850, for
> instance) 0x85 is a plain character.

cp850 doesn't have anything to do with Unicode, so that doesn't seem
relevant.

        hp

