
WWW::Mechanize doesn't always follow_link(text

Subject Author Date
WWW::Mechanize doesn't always follow_link(text M.O.B. i L. 04-20-2008
Posted by Dr.Ruud on April 28, 2008, 5:40 am
Martijn Lievaart wrote:
> Dr.Ruud:
>> RedGrittyBrick:
>>> szr:

>>>> He's after a '&nbsp;', which is a non-breaking space, which is
>>>> ASCII 0xA0 hex or 160 dec. '&nbsp;' can even be re-written as
>>>> '&#160;'.
>>>
>>> s/ASCII/Unicode/
>>
>> Exactly. ISO-8859-* too.
>
> No, no, HTML uses Unicode codepoints (which in this case coincide, but
> that's beside the (code)point).

No, no, no, no, that depends on the encoding being used. Yes, numeric
references always refer to Universal Character Set code points,
regardless of the page's encoding, but HTML is not "limited" to that.
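
The claim about numeric references can be checked with a small sketch. It uses Python's standard `html.unescape`, not anything from the thread; the helper `encoded_bytes` and the variable names are made up for illustration. It assumes only that the character under discussion is the no-break space, U+00A0.

```python
import html

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# The named reference and both numeric forms resolve to the same code point,
# independent of how the page bytes happen to be encoded.
named = html.unescape("&nbsp;")
numeric_dec = html.unescape("&#160;")
numeric_hex = html.unescape("&#xA0;")


def encoded_bytes(text, encoding):
    """Return text encoded as bytes, or None if the encoding cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# The byte representation is what depends on the encoding.
latin1 = encoded_bytes(NBSP, "iso-8859-1")   # b'\xa0', same number as the code point
utf8 = encoded_bytes(NBSP, "utf-8")          # b'\xc2\xa0', two bytes
ascii_bytes = encoded_bytes(NBSP, "ascii")   # None: ASCII stops at 0x7F
```

So the code point is 160 in every case, the single byte 0xA0 is an ISO-8859-1 coincidence, and ASCII has no such character at all.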

See also http://www.xs4all.nl/~rvtol/htmlcods.html which has been
rendered in many different (so non-"standard") ways in the past 10+
years. :)

--
Affijn, Ruud

"Gewoon is een tijger."


Posted by Martijn Lievaart on April 28, 2008, 4:07 pm
On Mon, 28 Apr 2008 11:40:18 +0200, Dr.Ruud wrote:

> Martijn Lievaart wrote:
>> Dr.Ruud:
>>> RedGrittyBrick:
>>>> szr:
>
>>>>> He's after a '&nbsp;', which is a non-breaking space, which is ASCII
>>>>> 0xA0 hex or 160 dec. '&nbsp;' can even be re-written as '&#160;'.
>>>>
>>>> s/ASCII/Unicode/
>>>
>>> Exactly. ISO-8859-* too.
>>
>> No, no, HTML uses Unicode codepoints (which in this case coincide, but
>> that's beside the (code)point).
>
> No, no, no, no, that depends on the encoding being used. Yes, numeric
> references always refer to Universal Character Set code points,
> regardless of the page's encoding, but HTML is not "limited" to that.

No, no, no, no, no :-) You already said it yourself, numeric references
always refer to Unicode codepoints. That's the only point I was trying to
make, and why you cannot substitute ISO-8859-* above.
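
A hedged illustration of why ISO-8859-* as a family cannot stand in for Unicode here: the same byte value can mean different characters in different parts of ISO-8859, while a numeric reference has one fixed meaning. The byte 0xA4 is picked only as a convenient example and does not come from the thread; the variable names are hypothetical.

```python
import html

BYTE = bytes([0xA4])  # one byte value, chosen for illustration

# The same byte decodes differently depending on the ISO-8859 part in use.
as_latin1 = BYTE.decode("iso-8859-1")    # U+00A4 CURRENCY SIGN
as_latin9 = BYTE.decode("iso-8859-15")   # U+20AC EURO SIGN

# The numeric reference names code point 164 no matter which part applies.
from_reference = html.unescape("&#164;")  # U+00A4 CURRENCY SIGN
```

For 0xA0 every ISO-8859 part happens to agree with Unicode, which is the coincidence mentioned earlier in the thread.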

M4


Posted by Dr.Ruud on April 29, 2008, 6:59 am
Martijn Lievaart wrote:
> Dr.Ruud:
>> Martijn Lievaart:
>>> Dr.Ruud:
>>>> RedGrittyBrick:
>>>>> szr:

>>>>>> He's after a '&nbsp;', which is a non-breaking space, which is
>>>>>> ASCII 0xA0 hex or 160 dec. '&nbsp;' can even be re-written as
>>>>>> '&#160;'.
>>>>>
>>>>> s/ASCII/Unicode/
>>>>
>>>> Exactly. ISO-8859-* too.
>>>
>>> No, no, HTML uses Unicode codepoints (which in this case coincide,
>>> but that's beside the (code)point).
>>
>> No, no, no, no, that depends on the encoding being used. Yes, numeric
>> references always refer to Universal Character Set code points,
>> regardless of the page's encoding, but HTML is not "limited" to that.
>
> No, no, no, no, no :-) You already said it yourself, numeric
> references always refer to Unicode codepoints. That's the only point
> I was trying to make, and why you cannot substitute ISO-8859-*
> above.

Your "No, no," was about a limit I didn't imply, so was not about what I
wrote, but about what you limited it to. ("Fallacy of Distribution")

--
Affijn, Ruud

"Gewoon is een tijger."


Posted by Martijn Lievaart on April 29, 2008, 4:49 pm
On Tue, 29 Apr 2008 12:59:56 +0200, Dr.Ruud wrote:

[ snip ]

You're right, I'm a bad reader.

M4

Posted by Dr.Ruud on April 30, 2008, 4:43 am
Martijn Lievaart wrote:
> Dr.Ruud:

> [ snip ]
> You're right, I'm a bad reader.

And I am sorry that I didn't write it more clearly.

Have a Happy Queen's Day!

For me a good day to work on some pet projects. And clean the house.
I live near the Amsterdam Museumplein, so I expect loud music from wrong
bands all afternoon and evening.
http://www.koninginnedagamsterdam.nl/Radio-538-Museumplein_348.php

--
Affijn, Ruud

"Gewoon is een tijger."

