Possible bug in XML:LibXML

Posted by Fergus McMenemie on December 16, 2007, 6:22 am
The following test script is not doing what I expect. It must be
a bug :-) The docs I have seen state that "In a SCALAR context
getElementsByTagName returns a XML::LibXML::NodeList object."
Instead I see it returning a string of the combined text objects
from all the child nodes. I have seen this behaviour repeated on
Mac 10.4 and Solaris 10. XML::DOM behaves as documented.


use strict;
use XML::LibXML;

my($parser);
MAIN: {
    $parser = XML::LibXML->new();
    test4();
}

sub test4 {
    print "\n","4"x50,"\n";
    my $tree = $parser->parse_file('camelids.xml');
    my $root = $tree->getDocumentElement;

    my $x=($root->getElementsByTagName("common-name"));
    print "\nx=$x\n"; # should be number of elements in an Element array..
                      # i think!

    my $y=$root->getElementsByTagName("common-name");
    print "\ny=$y\n"; # should be a NodeList

    my($z)=$root->getElementsByTagName("common-name");
    print "\nz=$z\n"; # should be 1st element of Element array

    foreach my $camelid ($root->findnodes('species')) {
        my $latin_name = $camelid->findvalue('@name');
        my $common_name = $camelid->findvalue('common-name');
        my $status = $camelid->findvalue('conservation/@status');
        print "$common_name ($latin_name) $status \n";
    }
}

The above example is based on an example from
http://www.xml.com/pub/a/2001/11/14/xml-libxml.html

I would be glad to have any mistakes in what I have done pointed
out, or to get any hints on how I should take this further. I think I
see similar incorrect behaviour in find() and findnodes().

Thanks in advance Fergus.

Posted by xhoster on December 16, 2007, 5:14 pm
fergus@twig.demon.co.uk (Fergus McMenemie) wrote:
> The following test script is not doing what I expect. It must be
> a bug:-) The docs I have seen state that "In a SCALAR context
> getElementsByTagName returns a XML::LibXML::NodeList object.
> Instead I see it returning a string of the combined text objects
> from all the child nodes.

Exactly where are you "seeing" this?

> I have seen this behaviour repeated on
> Mac 10.4 and Solaris 10. XML::DOM behaves as documented.
>
> use strict;
> use XML::LibXML;
>
> my($parser);
> MAIN: {
> $parser = XML::LibXML->new();
> test4();
> }
>
> sub test4 {
> print "\n","4"x50,"\n";
> my $tree = $parser->parse_file('camelids.xml');

Where can we obtain camelids.xml in order to repeat your example?
(I've looked at the web site you linked to, and if it has it then
it isn't obvious.)

> my $root = $tree->getDocumentElement;
>
> my $x=($root->getElementsByTagName("common-name"));
> print "\nx=$x\n"; # should be number of elements in an Element array..
> # i think!

Despite the parentheses, the method is called in a scalar context. So
it should behave identically to the next piece of code.
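
The context rule described above can be checked without XML::LibXML at all. The following is a minimal sketch; context_of_call is a hypothetical stand-in for a method like getElementsByTagName, and it simply reports the context it was called in via wantarray.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method such as getElementsByTagName:
# it reports which context it was called in.
sub context_of_call {
    return wantarray ? 'list' : 'scalar';
}

# Parentheses around the right-hand side do not change the context;
# assigning to a plain scalar is still a scalar assignment.
my $x = (context_of_call());

# No parentheses: scalar context.
my $y = context_of_call();

# Parentheses around the left-hand side make this a list assignment,
# so the call sees list context and $z receives the first element.
my ($z) = context_of_call();

print "x=$x y=$y z=$z\n";
```

Only the third form, with the parentheses on the left of the assignment, puts the call in list context.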

>
> my $y=$root->getElementsByTagName("common-name");
> print "\ny=$y\n"; # should be a NodeList

Unless you know there is no stringification overload for NodeList, then
this print statement doesn't tell you much of anything about what $y
actually is. Try using Data::Dumper instead.
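
To illustrate why interpolating an object into a string can hide its class, here is a small sketch using only core Perl. My::List is a made-up class, not the real XML::LibXML::NodeList; it merely assumes a "" overload that joins its members. ref and overload::StrVal still reveal what the value really is.

```perl
use strict;
use warnings;
use overload ();   # loaded here for overload::StrVal

# Hypothetical list class, NOT the real XML::LibXML::NodeList: it only
# mimics the idea of overloading "" to join the members' text.
package My::List;
use overload '""' => sub { join '', @{ $_[0] } }, fallback => 1;

sub new { my ($class, @items) = @_; return bless [@items], $class }

package main;

my $list = My::List->new('Llama', 'Guanaco');

my $as_string = "$list";                  # overloaded: members joined
my $class     = ref $list;                # class name, unaffected by the overload
my $raw       = overload::StrVal($list);  # default form, My::List=ARRAY(0x...)

print "string: $as_string\n";
print "class:  $class\n";
print "raw:    $raw\n";
```

Data::Dumper gives the same kind of answer, because it inspects the underlying reference rather than calling the overloaded stringification.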

In any event, you have told us neither what you expected to see nor what
you actually did see, so I have no idea how to go about explaining this
invisible and possibly nonexistent discrepancy.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Posted by Fergus McMenemie on December 16, 2007, 5:55 pm

> fergus@twig.demon.co.uk (Fergus McMenemie) wrote:
> > The following test script is not doing what I expect. It must be
> > a bug:-) The docs I have seen state that "In a SCALAR context
> > getElementsByTagName returns a XML::LibXML::NodeList object.
> > Instead I see it returning a string of the combined text objects
> > from all the child nodes.
> Exactly where are you "seeing" this?
Here is what I see when I run the test script:-
./libxmltest.pl

44444444444444444444444444444444444444444444444444

x=Bactrian CamelDromedary, or Arabian CamelLlamaGuanacoVicuna

y=Bactrian CamelDromedary, or Arabian CamelLlamaGuanacoVicuna

z=XML::LibXML::Element=SCALAR(0x18427a4)
Bactrian Camel (Camelus bactrianus) endangered
Dromedary, or Arabian Camel (Camelus dromedarius) no special status
Llama (Lama glama) no special status
Guanaco (Lama guanicoe) special concern
Vicuna (Vicugna vicugna) endangered

> Where can we obtain camelids.xml in order to repeat your example?
> (I've looked at the web site you linked to, and if it has it then
> it isn't obvious.)
Sorry about that. I think it is in the zip file. However it is simpler
to get it from
http://search.cpan.org/src/KHAMPTON/XML-SemanticDiff-0.95/eg/camelids.xml

> > my $root = $tree->getDocumentElement;
> >
> > my $x=($root->getElementsByTagName("common-name"));
> > print "\nx=$x\n"; # should be number of elements in an Element array..
> > # i think!
> Despite the parenthesis, the method is called in a scalar context. So
> it should behave identically to the next piece of code.
Yep. Ok.

> > my $y=$root->getElementsByTagName("common-name");
> > print "\ny=$y\n"; # should be a NodeList
>
> Unless you know there is no stringification overload for NodeList, then
> this print statement doesn't tell you much of anything about what $y
> actually is. Try using Data::Dumper instead.
Tomorrow!

> In any event, you have told us neither what you expected to see nor what
> you actually did see, I have no idea how to go about explaining this
> invisible and possibly nonexistent discrepancy.
I expected to see something similar to:-

44444444444444444444444444444444444444444444444444

x=XML::DOM::NodeList=ARRAY(0x18bb880)

y=XML::DOM::NodeList=ARRAY(0x180127c)

z=XML::DOM::Element=ARRAY(0x18fa3bc)
...... rest snipped....


Posted by xhoster on December 16, 2007, 6:26 pm
fergus@twig.demon.co.uk (Fergus McMenemie) wrote:
>
> > Where can we obtain camelids.xml in order to repeat your example?
> > (I've looked at the web site you linked to, and if it has it then
> > it isn't obvious.)
> Sorry about that. I think it is in the zip file. However it is simpler
> to get it from
> http://search.cpan.org/src/KHAMPTON/XML-SemanticDiff-0.95/eg/camelids.xml

OK, thanks.

>
> > > my $y=$root->getElementsByTagName("common-name");
> > > print "\ny=$y\n"; # should be a NodeList
> >
> > Unless you know there is no stringification overload for NodeList, then
> > this print statement doesn't tell you much of anything about what $y
> > actually is. Try using Data::Dumper instead.
> Tomorrow!

I'll save you the weekend. As I suspected, XML::LibXML::NodeList does have
stringification overloaded. So when interpolated into "", it doesn't use
the default reference/object stringification like
XML::DOM::NodeList=ARRAY(0x18bb880) does, but instead it just crams
together the string version of all its elements. I don't know the rationale
for this decision, and I would probably disagree with it (I think I'd at
least use $" to join the strings, rather than ''). But it is what it is.

With Dumper:

bless( [
bless( do{\(my $o = 6429552)}, 'XML::LibXML::Element' ),
bless( do{\(my $o = 6430304)}, 'XML::LibXML::Element' ),
bless( do{\(my $o = 6430976)}, 'XML::LibXML::Element' ),
bless( do{\(my $o = 6430720)}, 'XML::LibXML::Element' ),
bless( do{\(my $o = 6430272)}, 'XML::LibXML::Element' )
], 'XML::LibXML::NodeList' );
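
Given that the scalar-context result really is a NodeList, a short sketch of working with it directly rather than through string interpolation might look like this. The inline XML is a made-up two-species stand-in for camelids.xml, and it assumes XML::LibXML is installed; size and get_nodelist are NodeList methods, textContent is a node method.

```perl
use strict;
use warnings;
use XML::LibXML;

# Inline stand-in for camelids.xml; the element names follow the thread,
# but this two-species document is made up for the example.
my $xml = <<'XML';
<camelids>
  <species name="Lama glama"><common-name>Llama</common-name></species>
  <species name="Lama guanicoe"><common-name>Guanaco</common-name></species>
</camelids>
XML

my $root = XML::LibXML->new->parse_string($xml)->getDocumentElement;

# Scalar context: a NodeList object, even though "$list" would print text.
my $list = $root->getElementsByTagName('common-name');

my $class = ref $list;                  # the real class, despite the overload
my $count = $list->size;                # number of matching nodes
my @names = map { $_->textContent } $list->get_nodelist;

print "class=$class count=$count\n";
print "name: $_\n" for @names;
```

Asking for ref, size, or the individual nodes avoids the ambiguity of printing the list as a string.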


Xho

