
sorting a hash / 2008-06-01

comp.lang.perl.misc
Subject Author Date
sorting a hash / 2008-06-01 dn.perl@gmail.com 05-30-2008
Posted by Uri Guttman on June 12, 2008, 10:53 am

DS> No not multi level here, just two ways of presenting the data
DS> depending on which $site it came from. Thanks for the help guys, but
DS> they are only variations on what I had tried with no luck. However, I
DS> have discovered that here (OS/2) there is a bug in perl (5.8.2). I
DS> don't know yet if it is a bug in perl or the port. I suspect the
DS> latter, but

DS> foreach my $url (sort { $urls{$site}{$b}[0] <=> $urls{$site}{$a}[0] }
DS> keys %{$urls{$site}})

DS> and

DS> foreach my $url (sort by_value keys %{$urls{$site}})

DS> sub by_value
DS> {
DS> $urls{$site}{$b}[0] <=> $urls{$site}{$a}[0];
DS> }

DS> Give different results. The first works correctly and the second for
DS> some reason yet to be determined gets the *wrong* value of $site. I
DS> stuck a print $site in the subroutine. That is where all the errors
DS> came from, it was trying to compare site A's urls against site B's -
DS> No wonder there were a lot of errors :-)

i highly doubt this is a perl bug. my gut feeling is that you have a
scoping problem. the first sort keeps $a and $b inside the sort block
and those will be set correctly. if you lexically declared $a and $b in your
code before the by_value sub, those will be used and screw up your
sort. or maybe a different $site is being used because it is in a
different scope. you need to post more code so we can see the
problem. it isn't just with the above code.

uri

--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------

Posted by Dave Saville on June 12, 2008, 11:19 am
wrote:

> i highly doubt this is a perl bug. my gut feeling is that you have a
> scoping problem. the first sort keeps $a and $b inside the sort block
> and those will be set correctly. if you lexically declared $a and $b in your
> code before the by_value sub, those will be used and screw up your
> sort. or maybe a different $site is being used because it is in a
> different scope. you need to post more code so we can see the
> problem. it isn't just with the above code.

I suspect you are right and I have done something *really* stupid -
but I get the same on a Solaris system. $a & $b are correct and from
the correct $site.

==================
#!/usr/bin/perl

use strict;
use warnings;

my $site;
my %sites;
my %urls;

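$sites{SSL}++;
$sites{DEEZEE}++;
$sites{Web2}++;

$urls{SSL}{url_2}[0]++; # count
$urls{SSL}{url_2}[1] = 'yyyy/mm/dd';
$urls{SSL}{url_3}[0]++; # count
$urls{SSL}{url_3}[1] = 'yyyy/mm/dd';
$urls{DEEZEE}{url_1}[0]++; # count
$urls{DEEZEE}{url_1}[1] = 'yyyy/mm/dd';
$urls{DEEZEE}{url_2}[0]++; # count
$urls{DEEZEE}{url_2}[1] = 'yyyy/mm/dd';
$urls{DEEZEE}{url_2}[0]++; # count
$urls{DEEZEE}{url_2}[1] = 'yyyy/mm/dd';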

foreach $site (keys %sites)
{
print "\nSite: $site\n";

foreach my $url (sort {$urls{$site}{$b}[0] <=> $urls{$site}{$a}[0]
or $a cmp $b} keys %{$urls{$site}})
{
print "$site $url $urls{$site}{$url}[0] $urls{$site}{$url}[1]\n";
}
}

foreach $site (keys %sites)
{
print "\nSite: $site\n";

foreach my $url (sort by_value keys %{$urls{$site}})
{
print "$site $url $urls{$site}{$url}[0] $urls{$site}{$url}[1]\n";
}
}

sub by_value
{
print "Sort site: >$site< $a $b\n";
$urls{$site}{$b}[0] <=> $urls{$site}{$a}[0] or $a cmp $b;
}

==========

Yields:

==============
Site: SSL
SSL url_2 1 yyyy/mm/dd
SSL url_3 1 yyyy/mm/dd

Site: DEEZEE
DEEZEE url_2 2 yyyy/mm/dd
DEEZEE url_1 1 yyyy/mm/dd

Site: Web2

Site: SSL
Use of uninitialized value in concatenation (.) or string at try.pl
line 50.
Sort site: >< url_3 url_2
Use of uninitialized value in hash element at try.pl line 51.
Use of uninitialized value in hash element at try.pl line 51.
Use of uninitialized value in hash element at try.pl line 51.
Use of uninitialized value in numeric comparison (<=>) at try.pl line
51.
Use of uninitialized value in numeric comparison (<=>) at try.pl line
51.
SSL url_2 1 yyyy/mm/dd
SSL url_3 1 yyyy/mm/dd

Site: DEEZEE
Use of uninitialized value in concatenation (.) or string at try.pl
line 50.
Sort site: >< url_1 url_2
Use of uninitialized value in hash element at try.pl line 51.
Use of uninitialized value in hash element at try.pl line 51.
Use of uninitialized value in numeric comparison (<=>) at try.pl line
51.
Use of uninitialized value in numeric comparison (<=>) at try.pl line
51.
DEEZEE url_1 1 yyyy/mm/dd
DEEZEE url_2 2 yyyy/mm/dd

Site: Web2

==========

In this case $site has no value in the sub. In the real program it had
one of the other sites as a value rather than the correct one.

--
Regards
Dave Saville

NB Remove nospam. for good email address

Posted by xhoster on June 12, 2008, 11:57 am
> wrote:
>
> > i highly doubt this is a perl bug. my gut feeling is that you have a
> > scoping problem. the first sort keeps $a and $b inside the sort block
> > and those will be set correctly. if you lexically declared $a and $b in
> > your code before the by_value sub, those will be used and screw up your
> > sort. or maybe a different $site is being used because it is in a
> > different scope. you need to post more code so we can see the
> > problem. it isn't just with the above code.
>
> I suspect you are right and I have done something *really* stupid -
> but I get the same on a Solaris system. $a & $b are correct and from
> the correct $site.
>

>
> my $site;

You should declare variables in the tightest scope you can, otherwise
use strict will be less helpful in finding such scoping problems.

...

> foreach $site (keys %sites)

it would be better to use "foreach my $site"

> {
> print "\nSite: $site\n";
>
> foreach my $url (sort by_value keys %{$urls{$site}})
> {
> print "$site $url $urls{$site}{$url}[0] $urls{$site}{$url}[1]\n";
> }
> }

The way foreach works, the $site inside the loop is not the same
as the $site declared outside the loop.


>
> sub by_value
> {
> print "Sort site: >$site< $a $b\n";
> $urls{$site}{$b}[0] <=> $urls{$site}{$a}[0] or $a cmp $b;
> }


The $site used by by_value is the one from outside the foreach loop, not
the one inside the foreach loop. Usually you'd just pass in $site as an
argument, but when the sub is called automatically from sort, that doesn't
work.

There are a variety of ways around this, but none of them are entirely
satisfactory. One would be to make the sub an ordinary subroutine,
rather than one made specifically to be used by sort, then invoke
it explicitly rather than implicitly:

foreach my $url (sort {by_value($a,$b,$site)} keys %{$urls{$site}})
...
sub by_value
{
my ($a,$b,$site)=@_;
print "Sort site: >$site< $a $b\n";
$urls{$site}{$b}[0] <=> $urls{$site}{$a}[0] or $a cmp $b;
}
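
Another of those workarounds, sketched here under assumptions: build the comparator as a closure that remembers its own $site, and hand the code ref to sort. The helper names make_by_value and sorted_urls and the sample data are made up for illustration; only the $urls{$site}{$url}[0] layout comes from the script in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data in the shape used in this thread:
# $urls{site}{url} = [ count, date ]
my %urls = (
    DEEZEE => {
        url_1 => [ 1, 'yyyy/mm/dd' ],
        url_2 => [ 2, 'yyyy/mm/dd' ],
    },
);

# make_by_value is a made-up helper. It returns a comparator that
# closes over the $site it was built for, so no outer $site is needed.
sub make_by_value {
    my ($site) = @_;
    return sub {
        $urls{$site}{$b}[0] <=> $urls{$site}{$a}[0]   # highest count first
            or $a cmp $b;                             # ties: by url name
    };
}

# Returns the urls of one site in sorted order (call in list context).
sub sorted_urls {
    my ($site) = @_;
    my $by_value = make_by_value($site);
    return sort $by_value keys %{ $urls{$site} };
}

foreach my $site (sort keys %urls) {
    print "$site $_ $urls{$site}{$_}[0]\n" for sorted_urls($site);
}
```

sort still sets the package variables $a and $b for the closure, so the comparator body looks the same as the inline block; only the source of $site changes.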


Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Posted by Dave Saville on June 12, 2008, 12:10 pm
On Thu, 12 Jun 2008 15:57:37 UTC, xhoster@gmail.com wrote:

> > my $site;
>
> You should declare variables in the tightest scope you can, otherwise
> use strict will be less helpful in finding such scoping problems.
>
> ...
>
> > foreach $site (keys %sites)
>
> it would be better to use "foreach my $site"
>
> > {
> > print "\nSite: $site\n";
> >
> > foreach my $url (sort by_value keys %{$urls{$site}})
> > {
> > print "$site $url $urls{$site}{$url}[0] $urls{$site}{$url}[1]\n";
> > }
> > }
>
> The way foreach works, the $site inside the loop is not the same
> as the $site declared outside the loop.

Ah Ah. I did not know that. Many thanks for pointing it out, that of
course explains everything.

--
Regards
Dave Saville

NB Remove nospam. for good email address

Posted by Peter J. Holzer on June 14, 2008, 3:47 am
> On Thu, 12 Jun 2008 15:57:37 UTC, xhoster@gmail.com wrote:
>> > my $site;
>>
>> You should declare variables in the tightest scope you can, otherwise
>> use strict will be less helpful in finding such scoping problems.
>>
>> ...
>>
>> > foreach $site (keys %sites)
>>
>> it would be better to use "foreach my $site"

Because that would make it clearer what is happening (see below).

>> > {
>> > print "\nSite: $site\n";
>> >
>> > foreach my $url (sort by_value keys %{$urls{$site}})
>> > {
>> > print "$site $url $urls{$site}{$url}[0] $urls{$site}{$url}[1]\n";
>> > }
>> > }
>>
>> The way foreach works, the $site inside the loop is not the same
>> as the $site declared outside the loop.
>
> Ah Ah. I did not know that. Many thanks for pointing it out, that of
> course explains everything.
>

What Xho didn't point out is that Perl has two different types of
scoping: Dynamic scoping and lexical scoping.

Lexical scoping is introduced with "my" and works like in most other
programming languages: The variable is visible until the end of the
block - as seen in the source code. So:

1 #!/usr/local/bin/perl
2 use warnings;
3 use strict;

4 my $x = 'X';
5 for $x (qw(a b c d)) {
6 foo();
7 }
8 print "$x\n";

9 sub foo {
10 print "$x\n";
11 }

will print

X
X
X
X
X

because the "inner" $x introduced in line 5 is only visible until the closing
brace in line 7, and both the print in line 8 and the print in line 10
will see the "outer" $x introduced in line 4.

Whether line 5 says "for $x" or "for my $x" makes no difference here,
because loop variables are always implicitly scoped to the loop.

Now, change line 4 to

4 our $x = 'X';

and run the script again:

a
b
c
d
X

What happened? Now $x is a global variable, and the implicit scoping in
the loop will use dynamic scoping instead of lexical scoping. The $x
inside of the loop is still different from the $x outside of the loop
(as you can see, the print in line 8 still prints "X"), but "inside of
the loop" now contains subs called from the loop, so the print inside
foo will see the "inner" $x.

Dynamic scoping is used very little these days because it is confusing
and error-prone: You need to know from where a function is called to
know which variables it is seeing. But it does exist and for some
specialized uses it may even be clearer.
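
For comparison, the same dynamic scoping can be requested explicitly with "local". A minimal sketch, with made-up sub names (current_site, with_site) and a made-up package variable $site:

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $site = 'outer';          # package (global) variable

sub current_site {            # reads whatever $site holds at call time
    return $site;
}

sub with_site {               # hypothetical helper
    my ($name) = @_;
    local $site = $name;      # dynamically scoped: called subs see it
    return current_site();
}                             # previous value is restored on exit

print with_site('SSL'), "\n";     # prints SSL
print current_site(), "\n";       # prints outer
```

The value set by "local" is visible to every sub called before the enclosing block exits, which is exactly what makes it both useful and easy to trip over.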

        hp

