Posted by John W. Kennedy on July 29, 2004, 12:19 am
I'm trying to build an HTML-altering program using HTML::TreeBuilder,
but am having a problem dealing with comments. I have store_comments
turned on, and the comments are being put into the tree, but they're all
dumped in at the end, between </body> and </html>.
The related problem with the DOCTYPE declaration (it also lands at the
end) is documented, and I can deal with it, but losing the original
position of the comments is ruinous for my purposes.
Current program:
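use strict;
use warnings;
use HTML::TreeBuilder;
my $tree = HTML::TreeBuilder->new();
# Keep comments and declarations (e.g. DOCTYPE) in the tree
$tree->store_comments(1);
$tree->store_declarations(1);
# parse_file returns undef if the file cannot be opened
$tree->parse_file('text.html') or die "Cannot parse text.html: $!";
$tree->elementify;
print $tree->as_HTML;
# Free the tree explicitly
$tree->delete;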
Current input:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--This is a comment before html-->
<html>
<!--This is a comment before head-->
<head>
<title>Test</title>
</head>
<body>
<p>This is it.</p>
</body>
</html>
<!--This is a comment after html-->
Current output:
<html><head><title>Test</title></head><body><p>This is
it.</body><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Transitional//EN"><!--This is a comment before html--><!--This is a
comment before head--><!--This is a comment after html--></html>