|
Posted by Mumia W. (reading news) on December 5, 2006, 9:32 pm
Please log in for more thread options
On 12/05/2006 04:31 PM, Uri Guttman wrote:
writes:
>
> MW(n> On 12/05/2006 01:38 PM, Uri Guttman wrote:
> >> [...]
> >> my $cost_order = 'A2yB';
> >> my $sorter = make_sorter(
> >> 'GRT',
> >> init_code => "my $cost_order = '$cost_order' ;",
> >> signed => 1,
> >> string_data => 1,
> >> number => q{ /code="(.)"/ && index($cost_order,$1) },
> >> ) ;
> >> [...]
>
> MW(n> It's a little more interesting if a CODE reference is used since only
> MW(n> a CODE ref is deparsed:
>
> MW(n> number => sub { /code="(.)"/ && index($cost_order,$1) },
>
> MW(n> But it does work the same way.
>
> did you try that?
Yes.
> when i do i get that $cost_order is not declared in
> the internal eval.
It's defined. "My $cost_order = '$cost_order' ;" defines the variable
in the 'init_code.'
> the dumped source code shows that.
When I run the program, the source dumps itself, and I see where
$cost_order is defined in the internal eval before it's used.
> the reason i
> didn't wan't to support just calling the code refs was for speed. the
> code ref would need to be called for each sort element and that is a
> sizeable extra overhead compared to running the code inline as it does
> now. maybe an option can be added to call the coderef instead of
> deparsing it so you can use closures for key extractions. shouldn't be
> too hard to do. it would be 1 option to parse, pod for it, and a small
> conditional and code generation change. wanna tackle it? :)
>
> uri
>
Maybe I'll look into that if the source for Sort::Maker isn't too
complicated. It sounds easy, but aren't those famous last words? :-)
--
paduille.4060.mumia.w@earthlink.net
|
|
Posted by Uri Guttman on December 5, 2006, 11:40 pm
Please log in for more thread options
writes:
MW(n> On 12/05/2006 04:31 PM, Uri Guttman wrote:
writes:
>> >> 'GRT',
>> >> init_code => "my $cost_order = '$cost_order' ;",
>> >> number => q{ /code="(.)"/ && index($cost_order,$1) },
>> >> ) ;
>> >> [...]
>> MW(n> number => sub { /code="(.)"/ && index($cost_order,$1) },
MW(n> Yes.
>> when i do i get that $cost_order is not declared in
>> the internal eval.
MW(n> It's defined. "My $cost_order = '$cost_order' ;" defines the variable
MW(n> in the 'init_code.'
ah. i didn't see you kept the init_code. i dropped it and tried the pure
closure idea and it failed. i did so a little side test and you can
definitely pass a closure into a string eval so this is doable. you need
to declare a var (or array since more than one can be used) in the eval
text before the sub. those lexicals (could be multiple lexicals with
sequential number names as that will be faster at runtime. good cheat
when generating code).
>> the reason i
>> didn't wan't to support just calling the code refs was for speed. the
>> code ref would need to be called for each sort element and that is a
>> sizeable extra overhead compared to running the code inline as it does
>> now. maybe an option can be added to call the coderef instead of
>> deparsing it so you can use closures for key extractions. shouldn't be
>> too hard to do. it would be 1 option to parse, pod for it, and a small
>> conditional and code generation change. wanna tackle it? :)
>> uri
>>
MW(n> Maybe I'll look into that if the source for Sort::Maker isn't too
MW(n> complicated. It sounds easy, but aren't those famous last words? :-)
the guts of generating the GRT version is the toughest but this is key
extraction which is shared by all four sort styles. i have the design
worked out in my head so if you want to take it on, i can send you some
notes and what needs to be done. it should be fairly simple after
that. you will get full credit for instigating this feature and for
implementing it. of course you need to add test cases as well but the
test setup is table driven so that is easy. email me if you want to take
this on.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
|
|
Posted by Mumia W. (reading news) on December 6, 2006, 10:23 am
Please log in for more thread options
On 12/05/2006 10:40 PM, Uri Guttman wrote:
>
> the guts of generating the GRT version is the toughest but this is key
> extraction which is shared by all four sort styles. i have the design
> worked out in my head so if you want to take it on, i can send you some
> notes and what needs to be done. it should be fairly simple after
> that. you will get full credit for instigating this feature and for
> implementing it. of course you need to add test cases as well but the
> test setup is table driven so that is easy. email me if you want to take
> this on.
>
> uri
>
Sure, send me the info. Here is what I've written for the POD for the
proposed feature addition:
> General Options
> [...]
> "keep_code"
>
> This tells Sort::Maker to not deparse the CODE reference you give it
> but instead to use it directly. This probably disables some opportuni-
> ties for optimization of the key extraction code, but it's necessary
> for situtations where your key extraction code must access lexical
> variables.
>
My e-mail address is in my sig and in the headers.
(posted and mailed)
--
paduille.4060.mumia.w@earthlink.net
|
|
Posted by Gunnar Hjalmarsson on December 5, 2006, 6:07 pm
Please log in for more thread options
Since this discussion started with this grep() solution:
http://groups.google.com/group/comp.lang.perl.misc/msg/f02dee998cbac6d6
i thought I'd do a benchmark. This is a typical result on my WinXP box
for a 4 elements list:
Rate S-Maker grep()
S-Maker 2620/s -- -83%
grep() 15769/s 502% --
Break even as regards list size (besides the time for compiling
Sort::Maker) seems to be around 20 elements:
Rate grep() S-Maker
grep() 1833/s -- -9%
S-Maker 2014/s 10% --
This is the benchmark code:
use Sort::Maker;
use Benchmark 'cmpthese';
sub build_data {
my @r_order =
map { ('_', 0..9, 'A'..'Z', 'a'..'z')[rand 63] } 1..shift;
my @unsorted =
map qq|<table><tr><td meascode="$_"></td></tr></table>|,
@r_order;
my @c_order;
push @c_order, splice( @r_order, rand @r_order, 1 )
while @r_order;
join( '', @c_order ), @unsorted
}
my ($cost_order, @unsorted) = build_data(20);
cmpthese( -5, {
'S-Maker' => sub {
my $sorter = make_sorter(
'GRT',
init_code => "my $cost_order = '$cost_order';",
signed => 1,
string_data => 1,
number => q{ /code="(.)"/ && index($cost_order,$1) },
);
my @sorted = $sorter->( @unsorted );
},
'grep()' => sub {
my $co = $cost_order;
my @sorted = ();
while ( my $mc = chop $co ) {
unshift @sorted, grep /code="$mc"/, @unsorted;
}
},
} );
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
|
|
Posted by Uri Guttman on December 5, 2006, 6:29 pm
Please log in for more thread options
GH> Since this discussion started with this grep() solution:
GH> http://groups.google.com/group/comp.lang.perl.misc/msg/f02dee998cbac6d6
GH> i thought I'd do a benchmark. This is a typical result on my WinXP box
GH> for a 4 elements list:
GH> Rate S-Maker grep()
GH> S-Maker 2620/s -- -83%
GH> grep() 15769/s 502% --
GH> Break even as regards list size (besides the time for compiling
GH> Sort::Maker) seems to be around 20 elements:
GH> Rate grep() S-Maker
GH> grep() 1833/s -- -9%
GH> S-Maker 2014/s 10% --
that makes perfect sense. the win with the GRT is with larger input
sets. with only 4 elements the overhead will be slower than the actual
sort time. even with your code's O( N ** 2) growth it still does less
actual work for only 4 elements.
GH> This is the benchmark code:
GH> use Sort::Maker;
GH> use Benchmark 'cmpthese';
GH> sub build_data {
GH> my @r_order =
GH> map { ('_', 0..9, 'A'..'Z', 'a'..'z')[rand 63] } 1..shift;
GH> my @unsorted =
GH> map qq|<table><tr><td meascode="$_"></td></tr></table>|,
GH> @r_order;
GH> my @c_order;
GH> push @c_order, splice( @r_order, rand @r_order, 1 )
GH> while @r_order;
GH> join( '', @c_order ), @unsorted
GH> }
GH> my ($cost_order, @unsorted) = build_data(20);
GH> cmpthese( -5, {
GH> 'S-Maker' => sub {
GH> my $sorter = make_sorter(
GH> 'GRT',
GH> init_code => "my $cost_order = '$cost_order';",
GH> signed => 1,
GH> string_data => 1,
GH> number => q{ /code="(.)"/ && index($cost_order,$1) },
GH> );
GH> my @sorted = $sorter->( @unsorted );
ahh, you are including the time to build the sort! factor that out (do
it before you call the benchmark) and then in here pass a wrapper sub
that calls the sorter on the data. you can't compare generating and
compiling the sorter sourse to your precompiled code.
this is why benchmarking is a skill unto itself. you have to know what
you are comparing and how to properly isolate the code under comparison.
sort::maker was designed to build a sorter once and reuse it many
times. you can either use the code ref or dump the source and cut/paste
it into your code and use it there. but that will break if i add the
closure feature as you can't paste in the private data from runtime. oh
well.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
|
| Similar Threads | Posted | | ANNOUNCE: Sort::Maker .02 | September 2, 2004, 5:09 am |
| Accomodate for poor db design using Sort::Maker? | December 9, 2006, 12:48 am |
| Sort::Maker: style => 'plain' difficulty | December 14, 2006, 4:35 am |
| Sort::Maker: (Notes) The plain and the orcish don't include the "init_code" | December 14, 2006, 7:32 am |
| Pre-compiled SSL Perl modules | August 6, 2006, 7:11 am |
| Using Set::Object - evidently I need compiled code? | March 31, 2008, 10:21 pm |
| [ANN] Sort::Key 0.02 | April 28, 2005, 11:02 am |
| Compiled .exe of Perl does not work when using Crypt::SSLeay | May 14, 2007, 7:35 pm |
| Running compiled Inline C perl scripts on more than one machine | February 9, 2006, 12:12 pm |
| unusual behaviour of perl script compiled with perlapp on hp-ux11i | May 8, 2005, 7:52 am |
|