Perl HTML::TableExtract Question

Hi !

I hope someone can help.

I want to extract data from a table with 2 columns.

A sample of the table can be generated with:-

" "

(Sorry about the long URL :-) )

What I want is the field from the top table labelled "Tot. Shares Out."

My current code is:-

#!/usr/bin/perl -w

use strict;
use HTML::TableExtract;

my $inFile = "/home/mas/development/URLTemp.tmp";
my $te = HTML::TableExtract->new( headers => [ 'Fundamental Data', '*' ]);
$te->parse_file( $inFile );
foreach my $ts ( $te->table_states ) {
         foreach my $row ( $ts->rows ) {
                 print join( ",", @$row, "," ), "\n";
         }
}

But this seems to pick up the table lower down the page. That wouldn't
be so bad, as it repeats the value I need, but - "How do I get an
un-labelled column ????"

Any help would be appreciated.


Re: Perl HTML::TableExtract Question

The headers approach will not work since there are no headers
on the table that contains the data that you are after.

"Tot. Shares Out." is the 7th column in the 12th row of the table
at depth=2 and count=1.

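   my $te = HTML::TableExtract->new( depth => 2, count => 1 );
   $te->parse_file( $inFile );
   my ($ts) = $te->table_states;                  # the single table matched
   my $total_outstanding = ($ts->rows)[11]->[6];  # zero-based: row 12, column 7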
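
In case it helps to see where numbers like depth=2 and count=1 come from, here is a minimal sketch, not tested against the real page. The HTML string is a made-up stand-in, so the row and column indexes (0 and 1) are assumptions that differ from the [11] and [6] above. It uses table_states to match the code in this thread; I believe newer releases of the module also offer it under the name tables.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical stand-in for the real page: an outer table (depth 0)
# wrapping a table (depth 1) that holds two tables at depth 2.
my $html = <<'HTML';
<table><tr><td>
  <table><tr><td>
    <table><tr><td>Market Capitalization</td><td>1.2B</td></tr></table>
    <table><tr><td>Tot. Shares Out.</td><td>12345</td></tr></table>
  </td></tr></table>
</td></tr></table>
HTML

# Pass 1: no constraints, so every table is kept. Print each table's
# coordinates to find out which depth/count pair holds the data.
my $all = HTML::TableExtract->new;
$all->parse($html);
$all->eof;
my @coords;
foreach my $ts ( $all->table_states ) {
    my ( $depth, $count ) = $ts->coords;
    push @coords, "$depth,$count";
    print "table at depth=$depth count=$count\n";
}

# Pass 2: ask only for the table identified in pass 1.
my $te = HTML::TableExtract->new( depth => 2, count => 1 );
$te->parse($html);
$te->eof;
my ($ts) = $te->table_states;
die "no table at depth=2, count=1\n" unless $ts;

my @rows  = $ts->rows;      # each row is an array ref of cell texts
my $value = $rows[0][1];    # zero-based: first row, second column
print "value: $value\n";
```

For the real page, swap the string for parse_file( $inFile ) and use the indexes that pass 1 leads you to.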

    Tad McClellan                          SGML consulting
                                           Perl programming
    Fort Worth, Texas

Re: Perl HTML::TableExtract Question

Paul wrote:
" "
Just a bit more info on this - the ", '*'" entry in the headers list
doesn't work - in fact it returns no data at all. Without it, the module
assumes that the rows below are what is wanted and it returns:-

Market Capitalization,,

The real question is "How do I specify a row with a NULL header ??"

Re: Perl HTML::TableExtract Question

Tad McClellan wrote:
Thanks for that Tad !! I got the same answer at about 0230 in the
morning :-(

It seems the page isn't very well constructed.

I spent lots of time looking for the new version of HTML::TableExtract
which is supposed to address rows as well as columns but could only find
fleeting references to it.
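
Failing that, one way to address a row by its label rather than by a fixed index is to scan the extracted rows in plain Perl. This is only a sketch: value_after_label is a hypothetical helper, not part of HTML::TableExtract, and the rows below are made-up stand-ins for what $ts->rows would hand back.

```perl
use strict;
use warnings;

# Return the first non-empty cell that follows $label in any row, or
# undef (empty list in list context) if the label is not found.
# Each row is an array ref of cell texts; cells may be undef.
sub value_after_label {
    my ( $label, @rows ) = @_;
    foreach my $row (@rows) {
        my @cells = map { defined $_ ? $_ : '' } @$row;
        s/^\s+|\s+$//g for @cells;    # trim leading/trailing whitespace
        foreach my $i ( 0 .. $#cells ) {
            next unless $cells[$i] eq $label;
            foreach my $j ( $i + 1 .. $#cells ) {
                return $cells[$j] if length $cells[$j];
            }
        }
    }
    return;
}

# Made-up rows standing in for ( $ts->rows )
my @rows = (
    [ 'Market Capitalization', undef, '1.2B' ],
    [ '  Tot. Shares Out. ',   '',    '45,000,000' ],
);

my $shares = value_after_label( 'Tot. Shares Out.', @rows );
print defined $shares ? "$shares\n" : "label not found\n";
```

With the module in play, the rows from every table state could be passed in, so the lookup would not depend on the page keeping the same row and column positions.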


