Changing module behavior WWW::Mechanize

comp.lang.perl.misc
Posted by David Harmon on April 27, 2008, 7:59 pm
So I want to fetch some web pages and then grab some image files they
refer to. The images are linked with the standard <img src="..."> HTML
tag. WWW::Mechanize is perfect for that, except that the links it
scans for do not include image tags. Hunting in Mechanize.pm I find
the list it uses, which is:

my %urltags = (
a => "href",
area => "href",
frame => "src",
iframe => "src",
);

If I add one line to that, everything works the way I want it to:

img => "src",

But I don't really want to modify my copy of Mechanize! My code
won't work on any other installation. Every time I installed an
updated version, I would have to fix it again (well, that would be
seldom, but still).

What to do? Is there a legitimate way to pass that request to
Mechanize?

Oh and, what do I know, I tried the following in my code and I am not
sure why it didn't work:
$WWW::Mechanize::urltags = "src";

Posted by Gunnar Hjalmarsson on April 28, 2008, 12:08 am
David Harmon wrote:
> So I want to fetch some web pages and then grab some image files they
> refer to. The images are linked with the standard <img src="..."> HTML
> tag. WWW::Mechanize is perfect for that, except that the links it
> scans for do not include image tags. Hunting in Mechanize.pm I find
> the list it uses, which is:
>
> my %urltags = (
> a => "href",
> area => "href",
> frame => "src",
> iframe => "src",
> );
>
> If I add one line to that, everything works the way I want it to:
>
> img => "src",
>
> But I don't really want to modify my copy of Mechanize! My code
> won't work on any other installation. Every time I installed an
> updated version, I would have to fix it again (well, that would be
> seldom, but still).
>
> What to do? Is there a legitimate way to pass that request to
> Mechanize?

Don't know. Maybe you should use some other module, such as
HTML::SimpleLinkExtor.

See also

perldoc -q "extract URLs"

> Oh and, what do I know, I tried the following in my code and I am not
> sure why it didn't work:
> $WWW::Mechanize::urltags = "src";

That's easily answered. %urltags is a private variable declared with
my(), and consequently cannot be accessed from outside Mechanize.pm.
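
A minimal sketch of that distinction, for readers who want to see it run. The package My::Module is a hypothetical stand-in for Mechanize.pm, and %pkgtags, lexical_tags() and package_tags() are illustrative names that do not exist in WWW::Mechanize:

```perl
use strict;
use warnings;

{
    package My::Module;    # hypothetical stand-in for Mechanize.pm

    # Lexical: visible only inside this enclosing block
    # (for a real module, the enclosing scope is the .pm file).
    my %urltags = ( a => 'href', area => 'href' );

    # Package variable: lives in the symbol table as %My::Module::pkgtags.
    our %pkgtags = ( a => 'href', area => 'href' );

    sub lexical_tags { return join ',', sort keys %urltags }
    sub package_tags { return join ',', sort keys %pkgtags }
}

# This creates an unrelated package scalar, $My::Module::urltags.
# The lexical hash %urltags inside the block is untouched.
$My::Module::urltags = 'src';
print "new scalar: $My::Module::urltags\n";

# A package variable, by contrast, can be changed from outside.
$My::Module::pkgtags{img} = 'src';

print My::Module::lexical_tags(), "\n";    # a,area
print My::Module::package_tags(), "\n";    # a,area,img
```

The assignment in the original post is accepted silently for the same reason: a fully qualified name is always a package variable, so Perl creates a new scalar rather than reporting an error.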

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Posted by Joost Diepenmaat on April 28, 2008, 7:46 am
> So I want to fetch some web pages and then grab some image files they
> refer to. The images are linked with the standard <img src="..."> HTML
> tag. WWW::Mechanize is perfect for that, except that the links it
> scans for do not include image tags. Hunting in Mechanize.pm I find
> the list it uses, which is:

The links() method searches for links "to other pages". If you want
to search for images you should use the images(), find_image() and
find_all_images() methods.
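
A rough sketch of listing image URLs with find_all_images(), assuming a WWW::Mechanize version recent enough to provide it. The temporary file and the sample HTML are made up so that the sketch runs offline; a real script would call get() on an http URL instead:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Write a small sample page to a temporary file (illustrative content only).
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<a href="other.html">another page</a>
<img src="images/logo.png" alt="logo">
<img src="photo.jpg" alt="photo">
</body></html>
HTML
close $fh;

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs($path)->as_string );

# find_all_images() returns WWW::Mechanize::Image objects;
# url() gives the src attribute as written in the page.
my @image_urls = map { $_->url } $mech->find_all_images;
print "$_\n" for @image_urls;
```

Each image object also offers url_abs(), which resolves the src attribute against the page's base URL and is usually what you want to pass on to a later get().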

> Oh and, what do I know, I tried the following in my code and I am not
> sure why it didn't work:
> $WWW::Mechanize::urltags = "src";

That doesn't work because %urltags in the Mechanize source is a
lexical variable, not a package variable.


--
Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/

Posted by Gunnar Hjalmarsson on April 28, 2008, 11:55 am
Joost Diepenmaat wrote:
>> So I want to fetch some web pages and then grab some image files they
>> refer to. The images are linked with the standard <img src="..."> HTML
>> tag. WWW::Mechanize is perfect for that, except that the links it
>> scans for do not include image tags. Hunting in Mechanize.pm I find
>> the list it uses, which is:
>
> The links() method searches for links "to other pages". If you want
> to search for images you should use the images(), find_image() and
> find_all_images() methods.

Both the OP and I had old versions of WWW::Mechanize, where those
methods were not available.

So, David, you need to upgrade.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Posted by David Harmon on April 28, 2008, 2:03 pm
On Mon, 28 Apr 2008 17:55:39 +0200 in comp.lang.perl.misc, Gunnar
Hjalmarsson wrote:
>Both the OP and I had old versions of WWW::Mechanize, where those
>methods were not available.
>
>So, David, you need to upgrade.

My goodness, how I needed to upgrade.

ActiveState's Perl Package Manager showed that I had installed the
latest version of WWW::Mechanize it had available: 0.72, dated 2001.

I now have version 1.34 direct from CPAN. The difference is huge.
