Questions about XML:LibXML

Posted by Fergus McMenemie on December 16, 2007, 6:22 am
I am currently migrating several large scripts from XML::DOM to
XML::LibXML and the transition has highlighted a number of problems.
Both modules have rather poorly documented aspects and the move to
XML::LibXML is proving painful.

A typical use of either module to read XML goes along the lines of

my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($metafile);

XML::LibXML seems to require an extra step of

my $pubmeta = $tree->getDocumentElement();

I think my misunderstanding starts at the very first line.
- What does this $parser object do? If I am parsing multiple files
do I need a separate parser instance for each file?

- Having used parse_file() within a subroutine, do I need to
keep the parser around, or is it just $tree that needs to be kept?

- Or can I get away with just keeping $pubmeta in scope?

In answering or commenting on the above bear in mind that I am
opening lots of XML files, merging elements from them into a master.

Thanks in advance Fergus.

Posted by xhoster on December 16, 2007, 7:09 pm
fergus@twig.demon.co.uk (Fergus McMenemie) wrote:
> I am currently migrating several large scripts from XML::DOM to
> XML::LibXML and the transition has highlighted a number of problems.

Out of my own curiosity, what is the driving force behind the change?

> Both modules have rather poorly documented aspects and the move to
> XML::LibXML is proving painful.
>
> A typical use of either module to read XML goes along the lines of
>
> my $parser = XML::LibXML->new();
> my $tree = $parser->parse_file($metafile);
>
> XML::LibXML seems to require an extra step of
>
> my $pubmeta = $tree->getDocumentElement();
>
> I think my misunderstanding starts at the very first line.
> - What does this $parser object do?

It lets you set default options for the parser. If you never do
that (and the docs don't even explain how you would go about doing it)
then it is pretty useless. Consider it just another part of the bloat and
rot that seems to follow XML wherever it goes.
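
For what it's worth, here is a minimal sketch of setting an option on the parser object before parsing. It assumes the keep_blanks() accessor is available in the installed XML::LibXML, and the inline XML is a made-up stand-in for a real metadata file, so treat it as an illustration rather than the documented way of doing things.

```perl
use strict;
use warnings;
use XML::LibXML;

# One parser object carries the settings used by every later parse call.
my $parser = XML::LibXML->new();
$parser->keep_blanks(0);    # ask libxml2 to drop whitespace-only text nodes

# Hypothetical content standing in for a real file read with parse_file().
my $xml = "<meta>\n  <title>Example</title>\n</meta>\n";

my $tree    = $parser->parse_string($xml);
my $pubmeta = $tree->getDocumentElement();

# Count the child nodes that survived parsing under this setting.
my @children = $pubmeta->childNodes();
print scalar(@children), " child node(s) under <meta>\n";
```

With the default setting (keep_blanks(1)) the whitespace around <title> would also show up as text nodes.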

> If I am parsing multiple files
> do I need a separate parser instance for each file?

No, but I would create one per file anyway. Creating an XML::LibXML parser
is extremely lightweight (unlike XML::DOM).

> - Having used parse_file() within a subroutine, do I need to
> keep the parser around, or is it just $tree that needs to be kept?
>
> - Or can I get away with just keeping $pubmeta in scope?

You can do even better, not even explicitly having the intermediaries:

my $pubmeta =
    XML::LibXML->new()->parse_file($metafile)->getDocumentElement();


> In answering or commenting on the above bear in mind that I am
> opening lots of XML files, merging elements from them into a master.

Using the same parser over and over in XML::LibXML might save you around
one second for every few hundred thousand files or so. Not worth worrying
about in my book.
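
A rough sketch of the "many files merged into a master" case, reusing one parser throughout. The <master> and <item> element names and the inline strings are invented for illustration; a real script would call parse_file() on each input instead.

```perl
use strict;
use warnings;
use XML::LibXML;

my $parser = XML::LibXML->new();    # one parser reused for every input

# Hypothetical inputs standing in for the contents of several XML files.
my @sources = (
    '<meta><item id="a"/></meta>',
    '<meta><item id="b"/><item id="c"/></meta>',
);

# The master document that collects elements from all inputs.
my $master = $parser->parse_string('<master/>');
my $root   = $master->getDocumentElement();

for my $xml (@sources) {
    my $doc = $parser->parse_string($xml);
    for my $item ($doc->getDocumentElement()->getChildrenByTagName('item')) {
        # importNode() copies the node into $master, so the source
        # document is free to go out of scope after this loop pass.
        $root->appendChild( $master->importNode($item) );
    }
}

print $master->toString(1);
```

Only $master has to stay in scope here; the per-file documents are throwaway.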

Xho


Posted by Fergus McMenemie on December 17, 2007, 5:39 pm

> fergus@twig.demon.co.uk (Fergus McMenemie) wrote:
> > I am currently migrating several large scripts from XML::DOM to
> > XML::LibXML and the transition has highlighted a number of problems.
>
> Out of my own curiosity, what is the driving force behind the change?

I have had to deal with the addition of Dublin Core elements to some of
the documents, and XML::DOM does not support namespaces.

Thanks for the help on the other points, I will use it with more
confidence now!
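
For anyone else facing the same namespace problem, a small sketch of reading a Dublin Core element with XML::LibXML. The sample document and its <meta> root are made up; the "dc" prefix is bound to the Dublin Core elements namespace URI through XML::LibXML::XPathContext, which is assumed to be installed alongside XML::LibXML.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical document carrying one Dublin Core element.
my $xml = <<'XML';
<meta xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
</meta>
XML

my $doc = XML::LibXML->new()->parse_string($xml);

# Bind a prefix of our own choosing to the namespace URI; it does not
# have to match the prefix used inside the document.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( 'dc', 'http://purl.org/dc/elements/1.1/' );

my $title = $xpc->findvalue('/meta/dc:title');
print "$title\n";
```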
