WWW::Mechanize doesn't always follow_link(text

comp.lang.perl.misc
Posted by M.O.B. i L. on April 20, 2008, 1:45 pm
I'm using WWW::Mechanize 1.34 and have a problem.
This doesn't work:
$agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
It doesn't work in the sense that the link isn't followed and the $agent
is still on the same page. Is there a bug in my code, or is there a known
bug in WWW::Mechanize? I've tried changing &nbsp; to a space, but that
didn't work.

This works:
$agent->follow_link(url_regex => qr/librarians/, n => 1);

The corresponding XHTML code is:
<a href="mkbAdmin?func=librarians&amp;lang=en">Edit&nbsp;Librarians</a>

I want it to work since I use HTTP::Recorder to generate the code
automatically as I surf using a proxy and it generates code of the type
that doesn't work.

This works:
$agent->follow_link(text => 'Logout', n => 1);

By the way HTTP::Recorder actually generates:
$agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');

Posted by John Bokma on April 21, 2008, 1:34 pm

> I'm using WWW::Mechanize 1.34 and have a problem.
> This doesn't work:
> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
> It doesn't work in the sense that the link isn't followed and the $agent
> is still on the same page. Is there a bug in my code or is there a known
> bug in WWW::Mechanize. I've tried to change &nbsp; to space but that
> didn't work.
>
> This works:
> $agent->follow_link(url_regex => qr/librarians/, n => 1);
>
> The corresponding XHTML code is:
> <a href="mkbAdmin?func=librarians&amp;lang=en">Edit&nbsp;Librarians</a>
>
> I want it to work since I use HTTP::Recorder to generate the code
> automatically as I surf using a proxy and it generates code of the type
> that doesn't work.
>
> This works:
> $agent->follow_link(text => 'Logout', n => 1);
>
> By the way HTTP::Recorder actually generates:
> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');

HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
character, so it might be that you have to
use the character code instead.

Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
says: (&nbsp;, stored as char 225)

So you might want to try: "Edit\xe1Librarians".

Wild guess.

--
John

Arachnids near Coyolillo
http://johnbokma.com/perl/

Posted by M.O.B. i L. on April 23, 2008, 8:09 am
John Bokma wrote:
>
>> I'm using WWW::Mechanize 1.34 and have a problem.
>> This doesn't work:
>> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => 1);
>> It doesn't work in the sense that the link isn't followed and the $agent
>> is still on the same page. Is there a bug in my code or is there a known
>> bug in WWW::Mechanize. I've tried to change &nbsp; to space but that
>> didn't work.
>>
>> This works:
>> $agent->follow_link(url_regex => qr/librarians/, n => 1);
>>
>> The corresponding XHTML code is:
>> <a href="mkbAdmin?func=librarians&amp;lang=en">Edit&nbsp;Librarians</a>
>>
>> I want it to work since I use HTTP::Recorder to generate the code
>> automatically as I surf using a proxy and it generates code of the type
>> that doesn't work.
>>
>> This works:
>> $agent->follow_link(text => 'Logout', n => 1);
>>
>> By the way HTTP::Recorder actually generates:
>> $agent->follow_link(text => 'Edit&nbsp;Librarians', n => '1');
>
> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
> character, it might be that you have to
> use the code instead.
>
> Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
> says: (&nbsp;, stored as char 225)
>
> So you might want to try: "Edit\xe1Librarians".
>
> Wild guess.
>
Thanks! But it should be \xa0. First I tried matching with regular
expressions and that worked using . (dot) for the unknown character. I
then found this page about &nbsp;
<http://www.w3.org/International/questions/qa-escapes> where it says:
"An example of an ambiguous character is 00A0: NO-BREAK SPACE. This type
of space prevents line breaking, but it looks just like any other space
when used as a character. Using &nbsp; (or &#xA0;) makes it quite clear
where such spaces appear in the text.".

So this works:
$agent->follow_link(text => "Edit\xa0Librarians", n => 1);
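
[Editorial sketch] The regular-expression route mentioned above can be made a little stricter than a bare dot. The snippet below is only an illustration: `link_text_pattern` is a hypothetical helper (it is not part of WWW::Mechanize or HTTP::Recorder) that turns recorder-style text containing `&nbsp;` into a pattern accepting either a no-break space (\xA0) or ordinary whitespace. The `text_regex` option in the closing comment is the regex counterpart of `text` in `follow_link`; the `$agent` object is assumed to exist as in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: build an anchored regex from recorded link text.
# Each "&nbsp;" (or its numeric forms) becomes a class that matches either
# the no-break space character (\xA0) or ordinary whitespace.
sub link_text_pattern {
    my ($recorded) = @_;
    my @parts   = split /&nbsp;|&#160;|&#xA0;/i, $recorded, -1;
    my $pattern = join '[\xA0\s]', map { quotemeta($_) } @parts;
    return qr/\A$pattern\z/;
}

my $re = link_text_pattern('Edit&nbsp;Librarians');

print "matches NBSP text\n"   if "Edit\xA0Librarians" =~ $re;
print "matches plain space\n" if "Edit Librarians"    =~ $re;
print "rejects 0xE1\n"        unless "Edit\xE1Librarians" =~ $re;

# Assumed usage with a live WWW::Mechanize object:
#   $agent->follow_link(text_regex => $re, n => 1);
```

The character class lists \xA0 explicitly because `\s` does not necessarily match it on plain byte strings.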

Posted by John Bokma on April 23, 2008, 12:07 pm

> John Bokma wrote:

[..]

>> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a single
>> character, it might be that you have to
>> use the code instead.
>>
>> Comment on http://johnbokma.com/perl/search-term-suggestion-tool.html
>> says: (&nbsp;, stored as char 225)
>>
>> So you might want to try: "Edit\xe1Librarians".
>>
>> Wild guess.
>>
> Thanks! But it should be \xa0.

Yeah, but HTML::TreeBuilder returns it as 225 :-D.

[..]

> So this works:
> $agent->follow_link(text => "Edit\xa0Librarians", n => 1);

Glad my post was able to help you in the right way.

--
John

http://johnbokma.com/perl/

Posted by szr on April 25, 2008, 7:15 pm
John Bokma wrote:
>
>> John Bokma wrote:
>
> [..]
>
>>> HTML::TreeBuilder, or a module it's using, returns &nbsp; as a
>>> single character, it might be that you have to
>>> use the code instead.
>>>
>>> Comment on
>>> (&nbsp;, stored as char 225)
>>>
>>> So you might want to try: "Edit\xe1Librarians".
>>>
>>> Wild guess.
>>>
>> Thanks! But it should be \xa0.
>
> Yeah, but HTML::TreeBuilder returns it as 225 :-D.

He's after a '&nbsp;', which is a non-breaking space: character 0xA0 hex
or 160 dec (a Latin-1/Unicode code point, strictly speaking outside 7-bit
ASCII). '&nbsp;' can even be rewritten as '&#160;'.

--
szr
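
[Editorial sketch] The 160-versus-225 confusion in this thread is easy to check by printing the code points of a string. The snippet below is a minimal illustration: `code_points` is a hypothetical helper name, and the optional second part assumes HTML::Entities is installed (it normally comes with HTML::Parser) and is skipped if it is not.

```perl
use strict;
use warnings;

# Hypothetical helper: return the code point of every character in a
# string as a hex literal, so invisible characters become visible.
sub code_points {
    my ($text) = @_;
    return map { sprintf '0x%02X', ord($_) } split //, $text;
}

print join(' ', code_points("t\xA0L")), "\n";   # prints: 0x74 0xA0 0x4C
printf "%d %d\n", 0xA0, 0xE1;                   # prints: 160 225

# Optional: decode the entity forms and inspect the result.
# Both &nbsp; and &#160; are expected to decode to the same character, 0xA0.
if (eval { require HTML::Entities; 1 }) {
    my $decoded = HTML::Entities::decode_entities('&nbsp;&#160;');
    print join(' ', code_points($decoded)), "\n";
}
```

Running the same helper on the text of a link that refuses to match shows which character sits between the words.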


