
lwp-download http://..--how do I use it to download pages?

Posted by bdy120602 on June 8, 2008, 5:47 pm
I have 822 Web pages to download (so 822 URLs). How do I use
lwp-download to capture more than one page? Also, I want to save to a
specific location as well as determine the file format. Can all of this
be done with LWP? If not, how can it be done in Perl, if at all?

Thanks.

Posted by Peter Scott on June 9, 2008, 1:27 am
On Sun, 08 Jun 2008 14:47:20 -0700, bdy120602 wrote:
> I have 822 Web pages to download (so 822 URLs). How do I use lwp-
> download to capture more than one page? Also, I want to save to a
> specific location as well as determine the file format. Can all this
> be done lwp?

Yes. Loop 822 times. If that takes too long, look into LWP::Parallel.
MIME type can be retrieved with the content_type() method of the response.
See the LWP documentation.
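
A minimal sketch of such a loop, assuming the 822 URLs sit one per line in a text file. The script name, the pageNNN naming scheme, and the small MIME-to-extension table are illustrative assumptions, not part of LWP; only LWP::UserAgent's get(), is_success(), status_line(), content_type() and content() are library calls.

```perl
use strict;
use warnings;
use File::Spec;

# Map a MIME type to a file extension (illustrative table, not exhaustive).
sub extension_for {
    my ($type) = @_;
    my %ext = (
        'text/html'       => 'html',
        'text/plain'      => 'txt',
        'application/pdf' => 'pdf',
    );
    return exists $ext{$type} ? $ext{$type} : 'bin';
}

# Fetch every URL listed in $url_file and save each one under $out_dir.
sub download_all {
    my ($url_file, $out_dir) = @_;

    require LWP::UserAgent;    # loaded on demand; part of libwww-perl
    my $ua = LWP::UserAgent->new(timeout => 30);

    -d $out_dir or mkdir $out_dir or die "Cannot create $out_dir: $!";
    open my $in, '<', $url_file or die "Cannot open $url_file: $!";

    my $n = 0;
    while (my $url = <$in>) {
        chomp $url;
        next unless length $url;
        $n++;

        my $res = $ua->get($url);
        unless ($res->is_success) {
            warn "FAILED $url: ", $res->status_line, "\n";
            next;
        }

        # Scalar context gives just the media type, e.g. "text/html".
        my $type = $res->content_type;
        my $name = sprintf 'page%03d.%s', $n, extension_for($type);
        my $file = File::Spec->catfile($out_dir, $name);

        open my $out, '>', $file or die "Cannot write $file: $!";
        binmode $out;
        print {$out} $res->content;
        close $out or die "Cannot close $file: $!";
    }
    close $in;
    return $n;
}

# Runs only when a URL list and an output directory are given.
download_all(@ARGV) if @ARGV == 2;
```

Saved under a hypothetical name such as fetch_all.pl, it would be run as "perl fetch_all.pl urls.txt pages". Failures go to STDERR, so they can be redirected to a log file.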

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/


Posted by bdy120602 on June 10, 2008, 1:00 pm
> On Sun, 08 Jun 2008 14:47:20 -0700, bdy120602 wrote:
> > I have 822 Web pages to download (so 822 URLs). How do I use lwp-
> > download to capture more than one page? Also, I want to save to a
> > specific location as well as determine the file format. Can all this
> > be done lwp?
>
> Yes. Loop 822 times. If that takes too long, look into LWP::Parallel.
> MIME type can be retrieved with the content_type() method of the response.
> See the LWP documentation.
>
> --
> Peter Scott
> http://www.perlmedic.com/
> http://www.perldebugged.com/

OK, I'm all set with my original query: I put each download request on
its own line in a batch file, prefixed with "call". However, I would
like to create a log file so I can view any errors that might have
occurred (I plan to run this nightly). Is it possible to have this report
e-mailed to me? If so, how? Also, would you point me in the direction of
a resource on using lwp-mirror? I'm just a little less than a novice.

Thanks for your help thus far.

Posted by Peter Scott on June 13, 2008, 8:26 am
On Tue, 10 Jun 2008 10:00:37 -0700, bdy120602 wrote:

> OK, I'm all set with my original query: I ran all my requests to
> download files from a URL separately with "call" as a prefix and
> putting it in a batch file; however, I would like to create a log file
> so I can view any errors that might have occured ( I plan to run this
> nightly). Is it possible to have this report e-mailed to me? If so,
> how?

Try Email::Send. However, in the situation you describe, I usually run
the job under cron in my account and anything it outputs will be emailed
to me anyway, thus making things much easier, if I can do it that way.

> Also, would you point me in the direction of a resource to use
> lwp-mirror. I'm just a little less than a novice.

http://search.cpan.org/perldoc?lwp-mirror
http://search.cpan.org/~gaas/libwww-perl-5.812/lwpcook.pod#MIRRORING
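
As a rough illustration of the mirroring approach those pages describe, here is a small sketch built on the mirror() function from LWP::Simple, which only re-downloads a document when it has changed. The helper names describe_status and mirror_all and the command-line convention are assumptions made for this example.

```perl
use strict;
use warnings;
use LWP::Simple;   # exports mirror(), is_success() and the RC_* constants

# Turn the status code returned by mirror() into a short log line.
sub describe_status {
    my ($url, $rc) = @_;
    return "unchanged: $url" if $rc == RC_NOT_MODIFIED;
    return "updated: $url"   if is_success($rc);
    return "FAILED ($rc): $url";
}

# Mirror each URL to its local file; arguments are url/file pairs.
sub mirror_all {
    my %targets = @_;
    for my $url (sort keys %targets) {
        my $rc = mirror($url, $targets{$url});
        print describe_status($url, $rc), "\n";
    }
    return;
}

# Runs only when url/file pairs are given on the command line.
mirror_all(@ARGV) if @ARGV && @ARGV % 2 == 0;
```

Each run prints one line per URL, so redirecting the output to a file gives a simple nightly log.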

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/


Posted by bdy120602 on June 13, 2008, 2:35 pm
> On Tue, 10 Jun 2008 10:00:37 -0700, bdy120602 wrote:
> > OK, I'm all set with my original query: I ran all my requests to
> > download files from a URL separately with "call" as a prefix and
> > putting it in a batch file; however, I would like to create a log file
> > so I can view any errors that might have occured ( I plan to run this
> > nightly). Is it possible to have this report e-mailed to me? If so,
> > how?
>
> Try Email::Send. However, in the situation you describe, I usually run
> the job under cron in my account and anything it outputs will be emailed
> to me anyway, thus making things much easier, if I can do it that way.
>
> > Also, would you point me in the direction of a resource to use
> > lwp-mirror. I'm just a little less than a novice.
>
> http://search.cpan.org/perldoc?lwp-mirror
> http://search.cpan.org/~gaas/libwww-perl-5.812/lwpcook.pod#MIRRORING
>
> --
> Peter Scott
> http://www.perlmedic.com/
> http://www.perldebugged.com/

Cool. I used the info you gave me to figure it out, but I did it without
cron. One more question: does Perl have a module, that you know of,
that can scan an address for new pages, that is, not pages that have
new content, but pages that are new? For example, a Web site currently
has ten pages: http://www.page.com/one.jsp, http://www.page.com/two.jsp,
http://page.com/three.jsp, etc., up to ten. Is there a way for Perl,
using a module or otherwise, to scan the site for an eleventh page?
Also, let's assume that I do not know the file name. For example,
even though I would be expecting eleven.jsp, it's possible that it will
be at a path like this: http://www.page.com/numbers/index/format.jsp

Thanks again.
