Posted by xhoster on March 13, 2008, 12:19 pm
> use strict;
>
> my $file_1 = '1.txt'; # File 1
> my $file_2 = '2.txt'; # File 2
>
> if(open(FH1 , $file_1)){
> print "File $file_1 Opened\n";
> }else{
> print "Failed to Open file $file_1\n";
> exit;
> }
>
> if(open(FH2 , $file_2)){
> print "File $file_2 Opened\n";
> }else{
> print "Failed to Open file $file_2\n";
> close FH1;
> exit;
> }
>
> while(chomp(my $line_2 = <FH2>)){
> my($dummy21,$file21_no,$file21_date) = split(/\s+/,$line_2);
> next if($file21_no !~ /\d+/);
> my $counter1 = 0;
> my $least_date1 = 0;
> seek(FH1,0,0);
> $least_date1 = date_compare($file21_date);
> while(chomp(my $line_1 = <FH1>)){
> my($d,$file1_no,$file1_date) = split(/;/,$line_1);
> if($file1_no == $file21_no){
You could pre-load file1 into a hash keyed by $file1_no, where each value
is a list of the lines that have that $file1_no. That way, for each line
in file2, you only need to go through those lines of file1 that already
meet the above condition. This by itself should greatly improve things,
unless most of the data shares the same one or just a few $file1_no values.
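
A minimal sketch of that pre-loading, in case it helps. The names
load_index and count_recent are made up for illustration, and I am
assuming file1 lines look like "x;NUMBER;YYYYMMDD" as the quoted split()
suggests. Keys are normalised with "+ 0" so lookups behave like the
numeric == in the quoted code, and the date regex is anchored at the
start of the field, which the original does not do.

```perl
use strict;
use warnings;

# Build an index of file 1: number => list of dates seen for that number.
# Assumes ';'-separated lines with the number in field 2 and a YYYYMMDD
# date in field 3 (an assumption based on the quoted split).
sub load_index {
    my ($fh) = @_;
    my %by_no;
    while (my $line = <$fh>) {
        chomp $line;
        my (undef, $no, $date) = split /;/, $line;
        next unless defined $no && defined $date && $no =~ /^\d+$/;
        push @{ $by_no{$no + 0} }, $date;
    }
    return \%by_no;
}

# Count the file-1 dates stored under $no whose year is less than 5 years
# after the year of $date2 (the same test as the quoted inner loop).
# $no is expected to be numeric; the caller filters that, as in the quote.
sub count_recent {
    my ($index, $no, $date2) = @_;
    my ($yr2) = $date2 =~ /^(\d{4})\d{4}/ or return 0;
    my $count = 0;
    for my $date1 (@{ $index->{$no + 0} || [] }) {
        my ($yr1) = $date1 =~ /^(\d{4})\d{4}/ or next;
        $count++ if $yr1 - $yr2 < 5;
    }
    return $count;
}

# Demo with an in-memory filehandle standing in for 1.txt.
my $data1 = "a;7;20050101\nb;7;20120101\nc;8;20060101\n";
open my $fh1, '<', \$data1 or die "Cannot open in-memory file: $!";
my $index = load_index($fh1);
close $fh1;

print count_recent($index, 7, '20040101'), "\n";    # prints 1
```

File1 is now read once instead of once per line of file2, so the seek()
and the inner while loop go away; each file2 line costs one hash lookup
plus a pass over the matching dates only.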
> $file1_date =~/(\d\d\d\d)(\d\d)(\d\d)/;
> my $yr1 = $1;
> $file21_date =~/(\d\d\d\d)(\d\d)(\d\d)/;
> if(($yr1 - $1) < 5){
> $counter1++;
> }
And within a given $file1_no hashed list, you could sort by file1_date.
That way, once you meet a non-qualifying date, you could abort the loop
early rather than testing all the rest. (This improvement would probably
be quite small compared to the previous one.)
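
Roughly, the sorted variant could look like the sketch below. Again the
names (sort_index, count_recent_sorted) are hypothetical, and it assumes
every stored date is a well-formed 8-digit YYYYMMDD string, so plain
string comparison orders them chronologically.

```perl
use strict;
use warnings;

# Sort each number's date list once, in ascending order. Equal-length
# YYYYMMDD strings sort chronologically under string comparison.
sub sort_index {
    my ($index) = @_;
    @$_ = sort { $a cmp $b } @$_ for values %$index;
    return $index;
}

# Count dates whose year is less than 5 years after the year of $date2,
# stopping at the first date that fails: in a sorted list every later
# date fails as well.
sub count_recent_sorted {
    my ($dates, $date2) = @_;
    my ($yr2) = $date2 =~ /^(\d{4})/ or return 0;
    my $count = 0;
    for my $date1 (@$dates) {
        my $yr1 = substr $date1, 0, 4;
        last if $yr1 - $yr2 >= 5;
        $count++;
    }
    return $count;
}

# Demo: a hand-built index with one number and three unsorted dates.
my %index = (7 => ['20120101', '20050101', '20080615']);
sort_index(\%index);
print count_recent_sorted($index{7}, '20040101'), "\n";    # prints 2
```

The sort is paid once per list at load time, so it only wins if the
lists are long and many dates fail the test.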
Xho
--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.