|
Posted by saqib ali on November 18, 2004, 2:51 pm
Hello All,
I manage a rather large website that has several hundred content
managers. These content managers can create links at will. I want
to take a look at a list of all the links that are on our website.
Is there a utility that can generate a list (text based) of all the
URLs that are mentioned on our website?
Thanks.
Saqib Ali
http://validate.sf.net <--- DocBook XML / XHTML Validator
|
|
Posted by Geoff Muldoon on November 18, 2004, 11:42 pm
rumionfire@gmail.com says...
> I manage a rather large website, that has several hundred content
> managers. These content managers can create links at their will. I want
> to take a look a list of all the Links that are on our website.
>
> Is there a utility that can generate a list (text based) of all the
> URLs that are mentioned on our website?
What platform?
If on Linux I'd recommend:
http://htcheck.sourceforge.net/
Geoff M
|
|
Posted by saqib ali on November 18, 2004, 4:04 pm
Windows would be preferable.
I don't want an elaborate link checker. I just want a simple
console-based app that I can run on a nightly basis and that generates a
text file with all the links on my website. I need to pass that text file
to a C++ program that I wrote.
Thanks.
Saqib Ali
|
|
Posted by Alan J. Flavell on November 19, 2004, 12:11 am
On Thu, 18 Nov 2004, saqib ali wrote:
> windows would be preferrable.
Xenu link checker produces something along the lines that you're
describing.
> i don't want a elaborate link checker. I just want a simple console
> based app that i can i run on a nightly basis, that generate a text
> file with all the links on my website.
It's not quite what you want, but it might be worth looking at
nevertheless.
Have you considered lynx (available in a win32 version), which has
various site-exploring options that can be invoked as a batch job?
|
|
Posted by SimonFx on November 19, 2004, 6:38 am
A simple Perl script could do this easily.
Even grep, if you have the latest grep plus the extra pain in the butt
DLLs you need to download for windows.
Something like:
grep -ior "href=[^>]*" c:\internet\www\*.html > links.txt
or possibly:
grep -ior href=\"[^\"]* c:\internet\www\*.html > links.txt
Hmmm, if you want to create a historic log, create a batch file with:
for /f "tokens=1-4 delims=/.- " %%A in ('date /t') do SET FN=%%D%%C%%B
grep -ior "href=[^>]*" c:\internet\www\*.html > %FN%.log
Grep + DLLs (libintl, libiconv, pcre) available from
http://sourceforge.net/project/showfiles.php?group_id=23617
Sourceforge grep is a bit buggy - but the only one I know of that has
the "-o" option (not show whole line, just the text that matches the regexp)
Perl code would be cleaner and more reliable. Pay a Perl programmer $40 to
write it, or buy an old Perl book from the bargain bin at your local
bookshop and write it yourself.
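For what it's worth, here is a minimal sketch of the script approach, written in Python rather than Perl. It assumes the site is a tree of static *.html files on disk (as the grep commands above do) and that the files are UTF-8. The script name list_links.py and the names HrefCollector, extract_links and collect_site_links are made up for illustration. It uses an HTML parser instead of a regexp, so it copes with quoted and unquoted href values alike.

```python
import sys
from html.parser import HTMLParser
from pathlib import Path


class HrefCollector(HTMLParser):
    """Collects the value of every href attribute seen in start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases attribute names, so HREF= is matched too.
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html_text):
    """Return the href values in html_text, in document order."""
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def collect_site_links(root):
    """Return the sorted, de-duplicated hrefs from all *.html files under root."""
    found = set()
    for path in Path(root).rglob("*.html"):
        # Assumption: files are UTF-8; undecodable bytes are replaced.
        text = path.read_text(encoding="utf-8", errors="replace")
        found.update(extract_links(text))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical usage: python list_links.py SITE_ROOT links.txt
    site_root, out_file = sys.argv[1], sys.argv[2]
    links = collect_site_links(site_root)
    Path(out_file).write_text("\n".join(links) + "\n", encoding="utf-8")
```

The output is one URL per line, which should be easy to read from a C++ program. Duplicates are dropped and the list is sorted; if you need every occurrence, collect into a list instead of a set. Like grep, this only sees links in files on disk, not links generated dynamically by the server.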
|