Posted by Peter Wyzl on July 20, 2007, 9:05 pm
> I'm afraid I'm a bit of a newbie when it comes to Perl,
> though I have some experience with other languages
> (mostly C++).
>
> I would like to write a script to automate downloading
> some pages from the Web, and thought Perl should be
> suitable for this. However, I'll undoubtedly need some
> modules, and I have no idea which ones... So I would
> appreciate suggestions as to which modules I may need and
> should take a closer look at.
>
> I'm planning on making something similar to 'wget', but
> specialized to the type of pages I want; so it will mostly
> be a matter of downloading web-pages, saving them,
> and parsing them for links to other web-pages to download.
> I may also need to save other page contents (e.g. images),
> and maybe even content referred to by CSS (e.g. background
> images). Many of the pages I'm after are PHP pages (but
> AFAIK that is handled on the server side, isn't it?).
>
> Some of the pages require log-in, so the script needs to be
> able to recognize a password form, fill in user name and
> password and post it -- as well as accept cookies.
> Pages containing just a confirmation button
> for proceeding may also need to be "pushed" by the script.
> There may also be a need to fill in and send forms with things
> like date of birth -- maybe also in the form of drop-down lists.
> Many of these are redirects; e.g. I want a page with text, but
> unless I've previously logged in, specified DOB or confirmed,
> I'm redirected to forms. After I've filled in the form, I proceed
> to the page I wanted. However -- at least in my browser -- these
> pages (the one I want and the one I need to fill stuff in on) seem
> to have the same URL and be "identical" from the browser's POV.
>
> Some limited emulation of JavaScript would also be great. E.g.
> the ability to "fake" a pop-up dialog-box and "press" "OK" or
> "Yes"; for posting some forms; and for redirecting.
>
> So any ideas for modules I ought to look at for accomplishing
> some or all of the above would be very much appreciated.
Big job... Start with the LWP modules (libwww-perl), which ship with many
Perl distributions and are otherwise available from CPAN. They will in turn
lead you to many others that may be helpful, for cookies etc.
Also search CPAN http://www.cpan.org/ for various other things you need.
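To give an idea of what that starting point might look like, here is a minimal sketch, assuming LWP::UserAgent and HTML::LinkExtor (from the HTML-Parser distribution) are installed. The sub name extract_links and the choice to collect only a/href and img/src are my own, purely for illustration; the script takes a start URL on the command line, fetches it with an in-memory cookie jar, and prints the absolute links it finds.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

# Hypothetical helper: return the absolute URLs of every <a href> and
# <img src> in $html, resolved against $base_url.
sub extract_links {
    my ($html, $base_url) = @_;
    my @urls;
    my $extor = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            push @urls, "$attr{href}" if $tag eq 'a'   && $attr{href};
            push @urls, "$attr{src}"  if $tag eq 'img' && $attr{src};
        },
        $base_url,
    );
    $extor->parse($html);
    $extor->eof;
    return @urls;
}

# Only touch the network when a start URL is given on the command line.
if (@ARGV) {
    my $start_url = shift @ARGV;

    # cookie_jar => {} asks LWP for an in-memory cookie jar, so cookies
    # set by one response are sent back on later requests.
    my $ua = LWP::UserAgent->new(cookie_jar => {});

    my $response = $ua->get($start_url);
    die "GET $start_url failed: ", $response->status_line, "\n"
        unless $response->is_success;

    print "$_\n" for extract_links($response->decoded_content, $response->base);
}
```

Saving pages, following the links and posting forms would be layered on top of this; for the form-filling part, WWW::Mechanize on CPAN (built on LWP) is probably worth a look. None of this runs JavaScript.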
P