Questions about XML::LibXML


I am currently migrating several large scripts from XML::DOM to
XML::LibXML and the transition has highlighted a number of problems.
Both modules have rather poorly documented aspects and the move to
XML::LibXML is proving painful.

A typical use of either module to read XML goes along the lines of

    my $parser = XML::LibXML->new();
    my $tree = $parser->parse_file($metafile);

XML::LibXML seems to require an extra step of

    my $pubmeta = $tree->getDocumentElement();

I think my misunderstanding starts at the very first line.
   - What does this $parser object do? If I am parsing multiple files
     do I need a separate parser instance for each file?
   - Having used parse_file() within a subroutine do I need to
     keep the parser around or is it just $tree that needs to be kept?
   - Or can I get away with just keeping $pubmeta in scope?

In answering or commenting on the above bear in mind that I am
opening lots of XML files, merging elements from them into a master.

Thanks in advance, Fergus.

Re: Questions about XML::LibXML (Fergus McMenemie) wrote:

Out of my own curiosity, what is the driving force behind the change?


It lets you set the default options for the parser.  If you never do
that (and the docs don't even explain how you would go about doing it)
then it is pretty useless.  Consider it just another part of the bloat and rot
that seems to follow XML wherever it goes.
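
For readers wondering what setting parser options looks like in practice, here is a minimal sketch. It assumes a reasonably recent XML::LibXML release in which new() accepts options as name => value pairs (older releases may differ), and the sample document is made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumption: a recent XML::LibXML, where new() takes parser options
# as name => value pairs.
my $parser = XML::LibXML->new(
    no_blanks  => 1,   # drop whitespace-only text nodes between elements
    no_network => 1,   # never fetch external resources over the network
);

# Hypothetical sample document, used only for illustration.
my $xml = "<pubmeta>\n  <title>Example</title>\n  <author>Someone</author>\n</pubmeta>";

my $root = $parser->parse_string($xml)->getDocumentElement();

# With no_blanks set, the indentation between elements should not
# show up as extra text nodes.
my @children = $root->childNodes();
printf "%d child nodes\n", scalar @children;
```

Every document parsed with this $parser picks up the same options, which is the main reason to keep a parser object around at all.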


No.  But I would use a new one for each file anyway.  Creation of an
XML::LibXML parser is extremely lightweight (unlike XML::DOM).


You can do even better, not even explicitly having the intermediaries:

my $pubmeta =
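
The statement above is cut off in this copy of the thread. One plausible form, not necessarily the poster's exact code, chains the three calls so that neither the parser nor the tree gets a name. The sketch below assumes a hypothetical helper root_of() and parses a string rather than a file so that it runs on its own.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: parse an XML string and hand back only the root
# element. The parser and the document object are unnamed temporaries.
sub root_of {
    my ($xml) = @_;
    return XML::LibXML->new()->parse_string($xml)->getDocumentElement();
}

my $pubmeta = root_of('<pubmeta><title>Example</title></pubmeta>');

# The element is still usable here, and its owning document is still
# reachable through it, even though nothing else refers to the document.
print $pubmeta->nodeName(), "\n";
print $pubmeta->ownerDocument()->toString();
```

With parse_file($metafile) in place of parse_string($xml) the same chain applies to files, so keeping just $pubmeta in scope appears to be enough.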


Using the same parser over and over in XML::LibXML might save you around
one second for every few hundred thousand files or so.  Not worth worrying
about in my book.
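
Since the original question involves merging elements from many files into a master, here is a rough sketch of that loop with a single reused parser. The element names and the in-memory sources are invented for illustration; real code would call parse_file() on each path instead.

```perl
use strict;
use warnings;
use XML::LibXML;

# One parser instance, reused for every input.
my $parser = XML::LibXML->new();

# Hypothetical inputs; in the poster's case these would be files
# read with parse_file() instead of strings.
my @sources = (
    '<pubmeta><item id="a"/></pubmeta>',
    '<pubmeta><item id="b"/><item id="c"/></pubmeta>',
);

# The master document that collects elements from all sources.
my $master      = XML::LibXML::Document->new('1.0', 'UTF-8');
my $master_root = $master->createElement('master');
$master->setDocumentElement($master_root);

for my $xml (@sources) {
    my $root = $parser->parse_string($xml)->getDocumentElement();
    for my $item ($root->findnodes('item')) {
        # importNode() makes a deep copy owned by the master document,
        # leaving the source document untouched.
        $master_root->appendChild($master->importNode($item));
    }
}

print $master->toString(1);
```

Because importNode() copies, each source tree can go out of scope at the end of its loop iteration without affecting the master.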



Re: Questions about XML::LibXML


I have had to deal with the addition of Dublin Core elements to some of
the documents and XML::DOM does not support namespaces.
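
As an illustration of the namespace handling in question, here is a minimal sketch of reading and adding Dublin Core elements with XML::LibXML. The document content is hypothetical, and the dcx prefix is deliberately different from the one in the file to show that only the namespace URI has to match.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

my $dc_uri = 'http://purl.org/dc/elements/1.1/';

# Hypothetical document with one Dublin Core element.
my $xml = <<'XML';
<pubmeta xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
</pubmeta>
XML

my $doc = XML::LibXML->new()->parse_string($xml);

# Bind a prefix for use in XPath expressions; it need not match the
# prefix used in the file.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(dcx => $dc_uri);

my $title = $xpc->findvalue('/pubmeta/dcx:title');
print "$title\n";

# Add a new element in the same namespace.
my $creator = $doc->createElementNS($dc_uri, 'dc:creator');
$creator->appendText('Someone');
$doc->getDocumentElement()->appendChild($creator);
```

A plain findnodes('//dc:title') on the document relies on the prefix being declared where the query starts, so registering the prefix explicitly tends to be the more predictable route.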

Thanks for the help on the other points, I will use it with more
confidence now!
