
Speeding my script

comp.lang.perl.misc
Subject Author Date
Speeding my script Petyr David 02-22-2008
Posted by Jamie on February 23, 2008, 3:07 am
>I have a web page calling a Perl script that searches for patterns in
>20,000+ files and returns links to the files and the lines matching the
>pattern. I use a call to `find` and `egrep`.

That is going to take a long, long time.

>Q: The script works, but it is straining under the load; the files are
>in the GBs.
>How can I speed up the process? How simple is it to employ threads or
>to split off new processes?

That's an option. Check into File::Find, fork() and pipes. You could
create some pipes, fork several processes, do a select on the handles
and run the commands in parallel.

This will still run awfully slow though.
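
A minimal sketch of the fork-and-pipe approach, assuming a grep that
supports -r, -n and -E is on the PATH (the original setup used `find` and
`egrep`). The sub name parallel_grep is made up for illustration. The
list-form pipe open does the fork() and the pipe setup in one step, and
IO::Select does the select on the handles.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: run one recursive grep per directory in parallel
# and return every matching line as a "file:line:text" string.
sub parallel_grep {
    my ($pattern, @dirs) = @_;
    my $select = IO::Select->new;
    my %buffer;    # output collected so far, keyed by handle

    for my $dir (@dirs) {
        # List-form open forks a child and connects its STDOUT to $fh
        # without passing the pattern through a shell.
        open(my $fh, '-|', 'grep', '-rnE', '-e', $pattern, '--', $dir)
            or die "cannot start grep for $dir: $!";
        $select->add($fh);
        $buffer{$fh} = '';
    }

    my @matches;
    while ($select->count) {
        # Block until at least one child has output (or has exited).
        for my $fh ($select->can_read) {
            my $n = sysread($fh, my $chunk, 65536);
            die "read failed: $!" unless defined $n;
            if ($n) {
                $buffer{$fh} .= $chunk;
                next;
            }
            # End of file: this child is done, keep its lines.
            push @matches, split /\n/, delete $buffer{$fh};
            $select->remove($fh);
            close $fh;    # grep exits non-zero when nothing matched
        }
    }
    return @matches;
}
```

Calling parallel_grep('pattern', '/data/a', '/data/b') starts one grep per
directory (the paths here are placeholders). With many subdirectories you
would want to cap how many children run at once, which this sketch does
not do.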

>what I'd like to do is to be able to simultaneously be searching more
>than 1 subdirectory

If you don't need full regex capability, you could check into indices. If you
know one of the words, you can use that to filter out which documents to scan.
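
A rough sketch of that filtering idea, assuming a plain in-memory hash as
the index and \w+ as the definition of a word. The sub names build_index
and search_candidates are hypothetical. Only the files the index lists for
the known word are opened and scanned with the full regex.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: map each lowercased word to the files containing it.
sub build_index {
    my (@dirs) = @_;
    my %index;
    find({
        no_chdir => 1,    # keep $_ as the full path of each entry
        wanted   => sub {
            my $file = $_;
            return unless -f $file;
            open(my $fh, '<', $file) or return;
            while (my $line = <$fh>) {
                $index{ lc $1 }{$file} = 1 while $line =~ /(\w+)/g;
            }
            close $fh;
        },
    }, @dirs);
    return \%index;
}

# Hypothetical helper: scan only the files that contain $word, and return
# "file:line:text" for every line matching the full regex.
sub search_candidates {
    my ($index, $word, $regex) = @_;
    my @hits;
    for my $file (sort keys %{ $index->{ lc $word } || {} }) {
        open(my $fh, '<', $file) or next;
        while (my $line = <$fh>) {
            next unless $line =~ $regex;
            chomp $line;
            push @hits, "$file:$.:$line";    # $. is the current line number
        }
        close $fh;
    }
    return @hits;
}
```

The index would be built ahead of time rather than on every request; the
search itself then touches only the candidate files.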

If you can get the words sorted, look into Search::Dict (or use a tied hash).
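
As a sketch of the Search::Dict route: the example below assumes a sorted
text file with one "word<TAB>file1,file2,..." entry per line. That layout
and the sub name lookup_word are assumptions for illustration only.
Search::Dict's look() does a binary search on the file, so a lookup does
not have to read the whole index.

```perl
use strict;
use warnings;
use Search::Dict;

# Hypothetical helper: return the files listed for $word in a sorted index
# whose lines look like "word<TAB>file1,file2,..." (assumed layout).
sub lookup_word {
    my ($index_path, $word) = @_;
    open(my $fh, '<', $index_path) or die "cannot open $index_path: $!";

    # look() leaves the handle positioned at the first line that sorts
    # at or after the key, and returns -1 on error.
    my $pos = look($fh, $word);
    die "search failed on $index_path" if $pos < 0;

    my $line = <$fh>;
    close $fh;
    return () unless defined $line;

    chomp $line;
    my ($key, $files) = split /\t/, $line, 2;
    return () unless defined $files && $key eq $word;
    return split /,/, $files;
}
```

The file has to be sorted the same way look() compares, which by default
is a plain string comparison, as Perl's sort does on whole lines.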

The best bet is to use an index, though. Even if it's crude, a substantial
amount of your time (well, find/grep's time, anyway) is probably spent
opening and closing files.

An example of a "crude index" is the whatis database.

When you type 'apropos keyword' you're not opening a zillion manpages and
scanning them.

Jamie
--
http://www.geniegate.com Custom web programming
Perl * Java * UNIX User Management Solutions

Posted by Petyr David on February 25, 2008, 11:03 am
On Feb 23, 3:07 am, nos...@geniegate.com (Jamie) wrote:
> If you don't need full regex capability, you could check into indices. If you
> know one of the words, you can use that to filter out which documents to scan.

But I do. I've considered Swish-e and will install it. Would I not be
able to use regexes with something like Swish-e?
