handle tab-delimited file

\t matches BOTH tab and space.

How can I split the following line into 2 words instead of 5?

1234\tI am a boy\n

Re: handle tab-delimited file

Ela wrote:
> \t matches BOTH tab and space.

No, it doesn't.

     split /\t/, "1234\tI am a boy\n"
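
A minimal sketch of that split applied to the sample line. The variable names ($line, $id, $text) are illustrative and not from the original post; chomp is added here so the trailing newline does not end up in the second field.

```perl
use strict;
use warnings;

my $line = "1234\tI am a boy\n";
chomp $line;                          # drop the trailing newline
my ($id, $text) = split /\t/, $line;  # split on tab characters only
print "id=[$id] text=[$text]\n";
```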

Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: handle tab-delimited file

No it doesn't.

\s matches tab and space (and 3 other characters).

Is that what you meant?

(we wouldn't need to ask this if you had posted real Perl code.)

use PSI::ESP;

   By splitting on \t rather than splitting on \s
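
To make the difference concrete, here is a small sketch comparing the two patterns on the sample line from the question. The array names (@by_tab, @by_space) are made up for illustration.

```perl
use strict;
use warnings;

my $line = "1234\tI am a boy";

my @by_tab   = split /\t/, $line;   # tab is the only separator
my @by_space = split /\s/, $line;   # any whitespace character separates

print scalar(@by_tab),   " fields when splitting on \\t\n";   # 2
print scalar(@by_space), " fields when splitting on \\s\n";   # 5
```

Splitting on \s breaks the text at every space as well as at the tab, which is where the 5 words in the question come from.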

Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher0cmdat/"

Re: handle tab-delimited file

On Sat, 15 Mar 2008 14:10:12 +0000, Tad J McClellan wrote:

Don't forget your Ogham space mark:

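use warnings;
use strict;
use Unicode::UCD 'charinfo';

# Print every tested code point whose character matches $re, then the total.
sub count_match {
    my ($re) = @_;
    my $c = 0;
    # Walk the BMP, skipping the surrogates and the noncharacter block.
    for my $n (0x00 .. 0xD7FF, 0xE000 .. 0xFDCF, 0xFDF0 .. 0xFFFD) {
        if (chr($n) =~ /$re/) {
            my $ci = charinfo($n);
            print sprintf('%02X', $n), " which is ", $ci->{name}, " matches\n";
            $c++;
        }
    }
    print "There are $c characters matching \"$re\".\n";
}

count_match('\s');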

which gives:

09 which is <control> matches
0A which is <control> matches
0C which is <control> matches
0D which is <control> matches
20 which is SPACE matches
1680 which is OGHAM SPACE MARK matches
2000 which is EN QUAD matches
2001 which is EM QUAD matches
2002 which is EN SPACE matches
2003 which is EM SPACE matches
2004 which is THREE-PER-EM SPACE matches
2005 which is FOUR-PER-EM SPACE matches
2006 which is SIX-PER-EM SPACE matches
2007 which is FIGURE SPACE matches
2008 which is PUNCTUATION SPACE matches
2009 which is THIN SPACE matches
200A which is HAIR SPACE matches
2028 which is LINE SEPARATOR matches
2029 which is PARAGRAPH SEPARATOR matches
202F which is NARROW NO-BREAK SPACE matches
3000 which is IDEOGRAPHIC SPACE matches
There are 23 characters matching "\s".
