Posted by Gunnar Hjalmarsson on April 28, 2008, 12:08 am
David Harmon wrote:
> So I want to fetch some web pages and then grab some image files they
> refer to. The images are linked with the standard <img src="..."> HTML
> tag. WWW::Mechanize is perfect for that, except that the links it
> scans for do not include image tags. Hunting in Mechanize.pm I find
> the list it uses, which is:
>
> my %urltags = (
> a => "href",
> area => "href",
> frame => "src",
> iframe => "src",
> );
>
> If I add one line to that, everything works like I want it to:
>
> img => "src",
>
> But I don't really want to modify my copy of Mechanize! My code
> won't work on any other installation. Every time I installed an
> updated version, I would have to fix it again (well, that would be
> seldom, but.)
>
> What to do? Is there a legitimate way to pass that request to
> Mechanize?
Don't know. Maybe you should use some other module, such as
HTML::SimpleLinkExtor.
See also
perldoc -q "extract URLs"
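
To illustrate the suggestion, here is a minimal sketch of pulling the image URLs out of a page with HTML::SimpleLinkExtor. It assumes the module is installed from CPAN (it is not in core), and the sample HTML and file names are made up for the example. With Mechanize you would presumably pass $mech->content to parse() instead of a literal string.

```perl
use strict;
use warnings;
use HTML::SimpleLinkExtor;    # CPAN module, assumed to be installed

# Made-up sample page; in practice this would be the fetched content.
my $html = <<'HTML';
<html><body>
<a href="page2.html">next</a>
<img src="images/logo.png">
<img src="images/photo.jpg">
</body></html>
HTML

my $extor = HTML::SimpleLinkExtor->new;
$extor->parse($html);

# img() returns the links found in <img> tags only.
my @images = $extor->img;
print "$_\n" for @images;
```

The links come back as written in the page, so relative URLs still need resolving against the page's base URL before fetching. Also check the documentation of your installed WWW::Mechanize: current releases document images() and find_image() methods, which may make a separate extractor unnecessary.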
> Oh and, what do I know, I tried the following in my code and I am not
> sure why it didn't work:
> $WWW::Mechanize::urltags = "src";
That's easily answered. %urltags is a private variable declared with
my(), and consequently cannot be accessed from outside Mechanize.pm.
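
A small sketch of the difference, using a hypothetical Demo::Module package as a stand-in for Mechanize.pm (the names here are invented for the illustration). Note also that the assignment quoted above targets a scalar $urltags rather than the hash %urltags, so it would not have changed the hash even if it were a package variable.

```perl
use strict;
use warnings;

{
    package Demo::Module;    # hypothetical stand-in for Mechanize.pm

    my  %private = ( a => 'href' );    # lexical, like %urltags
    our %public  = ( a => 'href' );    # package variable, in the symbol table

    sub private_tags { return join ',', sort keys %private }
    sub public_tags  { return join ',', sort keys %public }
}

package main;

{
    no warnings 'once';
    # This creates an unrelated package variable %Demo::Module::private;
    # the lexical %private inside the block is untouched.
    $Demo::Module::private{img} = 'src';
}

# This reaches the real package variable declared with our.
$Demo::Module::public{img} = 'src';

print Demo::Module::private_tags(), "\n";    # prints "a"
print Demo::Module::public_tags(),  "\n";    # prints "a,img"
```

Only a variable declared with our (or use vars) can be changed from outside this way; a my variable is reachable only through code in its own lexical scope.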
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl