Posted by Christoph Terhechte on April 25, 2005, 7:23 pm
Hi all,
I've written a utility that reads OpenOffice Spreadsheet data into a Perl
structure, actually a hash (the keys correspond to worksheet names) of
arrays of arrays (the latter correspond to rows and cells, respectively).
It's a simple program that currently relies on XML::Parser::Lite::Tree for
parsing the XML content of sxc files.
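
To make the return structure concrete, here is a minimal sketch of what such a hash of arrays of arrays could look like and how it might be traversed. The worksheet names, the cell values and the variable name `$workbook` are made up for illustration and are not taken from any released module.

```perl
use strict;
use warnings;

# Hypothetical result: worksheet name => array of rows => array of cells.
my $workbook = {
    'Sheet1' => [
        [ 'Name',  'Price' ],
        [ 'Apple', 1.20    ],
        [ 'Pear',  undef   ],    # an empty cell comes back as undef
    ],
    'Sheet2' => [],              # a worksheet without any rows
};

# Cell in the second row, first column of Sheet1 (indices are zero-based).
my $cell = $workbook->{Sheet1}[1][0];

# Walk all worksheets, rows and cells; show undef cells as NULL.
for my $sheet ( sort keys %$workbook ) {
    print "[$sheet]\n";
    for my $row ( @{ $workbook->{$sheet} } ) {
        print join( "\t", map { defined $_ ? $_ : 'NULL' } @$row ), "\n";
    }
}
```

Because the top level is a plain hash, a single worksheet can be picked out by name without writing any handler code.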
The reason I'm asking for advice is that I'm unsure whether it's too close
to the existing OpenOffice::Parse::SXC module which is based on
XML::Parser. The main differences in the module I have in mind:
a) It returns the different worksheets as hash elements. In
   OpenOffice::Parse::SXC you have to write a handler to achieve this.
b) It returns undef for empty cells, where OpenOffice::Parse::SXC returns
   an empty string. Undef is better suited for importing data into a
   database (which is what I've written the code for).
c) It honors the "number-rows-repeated" attribute of SXC files, which is
   ignored by OpenOffice::Parse::SXC.
d) It optionally returns data in any of several encodings.
On the other hand, my code is much less sophisticated and doesn't allow
flexible use of the module through handlers, as
OpenOffice::Parse::SXC does.
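
As a rough sketch of points b) and c), the following shows one way repeated rows could be expanded and empty cells turned into undef. It assumes a hypothetical intermediate form in which each parsed row is a hash with the keys `cells` and `repeat`; that form, and the function name `expand_rows`, are invented for this example and are not the output format of XML::Parser::Lite::Tree or the actual code of the proposed module.

```perl
use strict;
use warnings;

# Hypothetical intermediate form: one entry per row element of the sheet.
# 'repeat' holds the value of the number-rows-repeated attribute, if any.
my @parsed_rows = (
    { cells => [ 'a', '' ],  repeat => undef },
    { cells => [ '',  'b' ], repeat => 3 },
);

# Expand repeated rows and map empty strings to undef.
sub expand_rows {
    my @rows;
    for my $r (@_) {
        my $times = $r->{repeat} || 1;    # no attribute means "once"
        for ( 1 .. $times ) {
            # Build a fresh array per repetition so rows never share storage.
            push @rows,
              [ map { ( defined $_ && length $_ ) ? $_ : undef }
                  @{ $r->{cells} } ];
        }
    }
    return \@rows;
}

my $rows = expand_rows(@parsed_rows);
printf "%d rows after expansion\n", scalar @$rows;
```

Copying the cell list for every repetition costs some memory, but it avoids the surprise of a later change to one row silently changing its repeated siblings.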
I haven't yet uploaded anything to CPAN, and although I really needed
something different from the existing modules, I'm unsure whether it's
wise to add yet another one. Regarding the namespace, I'm leaning away from
OpenOffice::Foo, as my code is less about OpenOffice than it is about
importing spreadsheet data, so I am thinking of Spreadsheet::ParseSXC.
Please let me know what you think.
--
Christoph Terhechte