web crawler


Hi,

I need to get the latest news from some websites that have no RSS feed
and display it on my own website. How can I get the content from those
web pages, and how can I pick out only the news content? Can this be
done automatically, so that the sites are checked each day for the
latest news? Could I use a search engine to search a particular
website? Please help.

Re: web crawler

You can run wget from a crontab entry to download the page on a
schedule. Then use sed/awk to parse out the news items and store them
in MySQL.
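
For illustration only, here is a rough Python sketch of the same
download, parse, store pipeline. It swaps Python's standard library in
for wget and sed/awk, and uses sqlite3 as a stand-in for MySQL, so it is
not the exact toolchain described above. The markup pattern
(<h2 class="headline"> wrapping a link), the table name "news", and the
function names are all assumptions; a real site will need its own
extraction rule.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect (title, link) pairs from <h2 class="headline"><a href=...>.

    The markup pattern is an assumption; adjust it per target site.
    """

    def __init__(self):
        super().__init__()
        self.items = []       # finished (title, link) pairs
        self._in_h2 = False   # True while inside a headline <h2>
        self._href = None     # href of the link currently being read
        self._text = []       # text fragments of that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h2" and "headline" in (attrs.get("class") or "").split():
            self._in_h2 = True
        elif tag == "a" and self._in_h2 and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the headline text.
            title = " ".join("".join(self._text).split())
            if title:
                self.items.append((title, self._href))
            self._href = None
        elif tag == "h2":
            self._in_h2 = False
            self._href = None


def parse_headlines(html):
    """Return a list of (title, link) pairs found in the HTML string."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.items


def fetch(url):
    """Download a page as text (the role wget plays in the reply)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def store(conn, items):
    """Insert unseen headlines; return how many rows were added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news ("
        "link TEXT PRIMARY KEY, title TEXT NOT NULL, "
        "fetched_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    before = conn.total_changes
    # The link is the primary key, so repeated daily runs skip old items.
    conn.executemany(
        "INSERT OR IGNORE INTO news (link, title) VALUES (?, ?)",
        [(link, title) for title, link in items],
    )
    conn.commit()
    return conn.total_changes - before
```

A daily cron job could then call something like
store(conn, parse_headlines(fetch(url))) for each site. Check each
site's terms of use and robots.txt before scraping it.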

Then just have your normal dynamic PHP content generators output the
values from the database.
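
As a hedged sketch of that last step, the snippet below reads stored
headlines back out and renders them as an HTML list. It is written in
Python with sqlite3 rather than PHP with MySQL, purely for illustration.
The table "news" with columns link, title and fetched_at, and the
function name render_news, are assumptions and not anything given in
this thread.

```python
import html
import sqlite3


def render_news(conn, limit=10):
    """Return an HTML <ul> of the newest stored headlines.

    Assumes a table news(link, title, fetched_at); the schema is hypothetical.
    """
    rows = conn.execute(
        "SELECT title, link FROM news "
        "ORDER BY fetched_at DESC, rowid DESC LIMIT ?",
        (limit,),
    ).fetchall()
    # Escape scraped text before output; it comes from an untrusted page.
    items = [
        '<li><a href="{}">{}</a></li>'.format(
            html.escape(link, quote=True), html.escape(title)
        )
        for title, link in rows
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Whatever language generates the page, escaping the scraped titles and
links on output matters, since the remote site controls that text.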
