|
Posted by BH on February 19, 2008, 12:15 pm
Please log in for more thread options
Hi,
I am reading in an comma separator file,
line 1: Field 1, field 2, field 3.., field n
line 2: Field 1, field 2, field 3.., field n
I would like to create a data structure to representing the whole
file.
What is an efficient and easy way to store the above such that, for a
given value for field i, I can extract an array of values for field j,
where 1<=i,j<=n? It's a bit like asking what the best way to store a
spreadsheet is.
Hash of anonymous arrays? How will the syntax be?
Thanks in advance.
Regards,
BH
|
|
Posted by A. Sinan Unur on February 19, 2008, 1:36 pm
Please log in for more thread options
7dd3346329d7@41g2000hsc.googlegroups.com:
> Hi,
>
> I am reading in an comma separator file,
>
> line 1: Field 1, field 2, field 3.., field n
> line 2: Field 1, field 2, field 3.., field n
>
> I would like to create a data structure to representing the whole
> file.
>
> What is an efficient and easy way to store the above such that, for a
> given value for field i, I can extract an array of values for field j,
> where 1<=i,j<=n? It's a bit like asking what the best way to store a
> spreadsheet is.
>
> Hash of anonymous arrays? How will the syntax be?
First off, have you tried DBD-CSV?
http://search.cpan.org/~jzucker/DBD-CSV-0.22/lib/DBD/CSV.pm
This has the advantage that your code can keep on working when the data
are moved to a real database later on.
* How large is the data matrix?
* How large is the typical file?
* How large is the typical result set?
If the file is smallish (dunno, maybe fewer than 10,000 lines with say
10 fields each), the contents static and simple, the queries really as
simple as you make them out to be the obvious solution would be the way
to go.
Obvious solution follows. Generalize as required.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @data;
while ( my $obs = <DATA> ) {
$obs =~ s/^\s+//;
$obs =~ s/\s+$//;
next unless length $obs;
push @data, [ split /\s+/, $obs ];
}
print Dumper vlookup( \@data, 2, 50, 3 );
print Dumper vlookup( \@data, 0, 100, 0);
print Dumper vlookup( \@data, 3, 0, 3);
print Dumper vlookup( \@data, 3, 0, 1);
sub vlookup {
my ( $data, $key_col, $key_val, $lookup_col ) = @_;
my @retlist;
for my $obs ( @$data ) {
if ( $obs->[ $key_col ] == $key_val ) {
push @retlist, $obs->[ $lookup_col ];
}
}
return \@retlist;
}
__DATA__
70 140 45 85
190 150 40 30
80 40 45 30
110 20 30 5
50 140 70 95
50 80 60 65
170 150 45 40
150 0 30 0
10 30 50 60
30 130 70 100
110 160 75 75
60 120 80 80
170 130 33 30
90 140 75 75
120 90 35 35
130 40 5 5
50 170 100 100
120 30 5 5
90 170 90 90
190 150 30 30
60 20 30 30
130 150 70 60
0 20 60 60
40 120 70 90
70 160 100 95
180 150 20 35
110 70 40 30
140 140 50 50
170 30 0 0
70 130 60 80
80 20 20 20
100 130 65 65
190 20 0 0
110 30 0 10
180 70 0 0
180 50 0 0
40 120 45 90
60 70 55 55
140 160 60 60
60 140 90 90
150 80 20 15
40 10 30 35
160 100 20 20
80 170 90 95
150 90 30 20
90 70 40 40
40 0 30 30
170 50 0 0
110 180 80 85
--
(remove .invalid and reverse each component for email address)
clpmisc guidelines: <URL:http://www.rehabitation.com/clpmisc.shtml>
|
|
Posted by Jim Gibson on February 19, 2008, 2:09 pm
Please log in for more thread options In article
> Hi,
>
> I am reading in an comma separator file,
>
> line 1: Field 1, field 2, field 3.., field n
> line 2: Field 1, field 2, field 3.., field n
>
> I would like to create a data structure to representing the whole
> file.
>
> What is an efficient and easy way to store the above such that, for a
> given value for field i, I can extract an array of values for field j,
> where 1<=i,j<=n? It's a bit like asking what the best way to store a
> spreadsheet is.
>
> Hash of anonymous arrays? How will the syntax be?
The most efficient structure is probably an array of arrays. The syntax
for generating this structure is something like the following:
#!/usr/local/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @aoa;
while( <DATA> ) {
chomp;
push( @aoa, [ split /,/, $_ ]);
}
print Dumper(\@aoa);
__DATA__
Field 1.1, field 1.2, field 1.3
Field 2.1, field 2.2, field 2.3
You can access these elements with expressions such as $aoa[1][2],
which has the value ' field 2.3' (note the leading space, which can be
removed if it is a problem. Hint; change the regular expression used by
the split function).
--
Jim Gibson
Posted Via Usenet.com Premium Usenet Newsgroup Services
----------------------------------------------------------
** SPEED ** RETENTION ** COMPLETION ** ANONYMITY **
----------------------------------------------------------
http://www.usenet.com
|
|
Posted by Mark Pryor on February 19, 2008, 2:16 pm
Please log in for more thread options On Tue, 19 Feb 2008 09:15:35 -0800, BH wrote:
> Hi,
>
> I am reading in an comma separator file,
>
> line 1: Field 1, field 2, field 3.., field n line 2: Field 1, field 2,
> field 3.., field n
>
> I would like to create a data structure to representing the whole file.
>
> What is an efficient and easy way to store the above such that, for a
> given value for field i, I can extract an array of values for field j,
> where 1<=i,j<=n? It's a bit like asking what the best way to store a
> spreadsheet is.
>
> Hash of anonymous arrays? How will the syntax be?
>
I've had good luck using Text::CSV_XS
I like to load into an AoA (array of anonymous arrays)
You can also use the Storable module to export the whole AoA to its own
file and avoid parsing with Text::CSV_XS (resource hog) each time you run
the script.
On subsequent runs use the ::Retrieve() call to reload into an AoA.
Your code will instantly be x-platform too if you find a PPD for
Text::CSV_XS to run on WinXP. Its out there.
--
Mark
|
|
Posted by david on February 21, 2008, 11:52 am
Please log in for more thread options > Hi,
>
> I am reading in an comma separator file,
>
> line 1: Field 1, field 2, field 3.., field n
> line 2: Field 1, field 2, field 3.., field n
>
> I would like to create a data structure to representing the whole
> file.
>
> What is an efficient and easy way to store the above such that, for a
> given value for field i, I can extract an array of values for field j,
> where 1<=i,j<=n? It's a bit like asking what the best way to store a
> spreadsheet is.
>
> Hash of anonymous arrays? How will the syntax be?
>
> Thanks in advance.
>
> Regards,
>
> BH
Hi,
Lokk at Data::Table,
Best regards,
David
|
| Similar Threads | Posted | | perl - data structure build to transpose data | August 28, 2004, 9:34 pm |
| eval of Data::Dumper output not same as original data structure | August 1, 2007, 3:05 am |
| Perl Complex Data Structure Parsing | December 3, 2006, 1:56 pm |
| puting an excel file into a perl data structure | June 28, 2006, 12:16 pm |
| Perl data structure to hold same multiple keys and different values | September 17, 2007, 6:38 am |
| data structure | December 8, 2004, 12:09 pm |
| data structure problem | January 27, 2006, 3:39 pm |
| A s pecial data structure | May 12, 2006, 5:31 am |
| Parse data structure | July 15, 2006, 8:03 pm |
| Too complex data structure? | March 1, 2007, 8:29 am |
|