find and match javascript code present in any page



I have an application where I need to fetch a URL and check whether a
given piece of JavaScript code is present in the page.
If the code has been commented out or modified, I should raise an alert.
Is it possible to find out whether a piece of JS code is commented out
or not? So far, this is what I am able to do:

I get the page content as a string using file_get_contents( URL ), then
apply a regular expression (I am using eregi) to extract the JavaScript
code in that page, and then use strpos to find the occurrence of my code
in the matched element (this may not be the best approach and may sound
stupid). The code looks like this:

$content = file_get_contents( URL );
$pageCode = "<script language='javascript'> some multiline javascript code here </script>";
// I do not know much about regexps; this is the pattern I use:
$pattern = "<script[^>]*>.*</script>";

if( eregi($pattern, $content, $match) ) {
    // strpos() returns 0 for a match at the start, so compare with false
    if( strpos($match[0], $pageCode) !== false ) {
        print "found \n";
    } else {
        print "not found \n";
    }
}

This may sound very stupid, and I am also not sure about the exact
pattern to use.
Any help will be highly appreciated.


Re: find and match javascript code present in any page

In reply to killerBird:

The eregi function is deprecated (as of PHP 5.3) and has been removed
in PHP 7. If you need regular expressions, use the preg_* functions
instead.
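
As a rough illustration of moving from eregi to the preg_* family, here is a minimal sketch. The helper name extract_script_blocks and the sample markup are made up for the example; the pattern is only one reasonable choice, not the one true way to match script elements.

```php
<?php
// Hypothetical helper (not from the original post): returns the body of
// every <script>...</script> element found in $html.
function extract_script_blocks(string $html): array
{
    // '#' as delimiter avoids escaping the '/' in </script>.
    // i   = case-insensitive, which is what eregi provided
    // s   = let '.' match newlines, for multiline scripts
    // .*? = non-greedy, so each script block is captured separately
    $pattern = '#<script[^>]*>(.*?)</script>#is';
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

// Made-up sample page; in the question this string would come from
// file_get_contents(URL).
$content = "<html><head>\n<SCRIPT type='text/javascript'>\nalert('hi');\n</SCRIPT>\n</head></html>";

foreach (extract_script_blocks($content) as $code) {
    echo trim($code), "\n";
}
```

Note that the greedy `.*` in the pattern from the question would swallow everything between the first opening tag and the last closing tag on the page, which is why the sketch uses `.*?`.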

In any case, you can parse HTML with the built-in DOM functions. I'm not
an expert, but this sample code could be a starting point:


$html = <<<EOM
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<script type="text/javascript"><!--
alert("Foo Bar");
//--></script>
</head>
<body>
<p>Lorem ipsum dolor sit amet</p>
<p>Consectetur adipisicing elit.</p>
</body>
</html>
EOM;

// Parse the markup, then expose the tree through SimpleXML for XPath
$dom = new DOMDocument;
$dom->loadHTML($html);
$xml = simplexml_import_dom($dom);
foreach($xml->xpath('//script') as $i){
    echo $i->asXML() . "\n";
}

Of course, it needs some tweaking because real HTML is seldom valid.
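
One possible tweak for invalid markup, sketched under assumptions: libxml_use_internal_errors() keeps the parser's complaints out of the output, and DOMXPath collects the text of each script element. The helper name script_sources and the sloppy sample markup are invented for illustration.

```php
<?php
// Hypothetical helper: returns the text of every <script> element,
// tolerating markup that is not valid HTML.
function script_sources(string $html): array
{
    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//script') as $node) {
        $sources[] = $node->textContent;
    }
    return $sources;
}

// Deliberately sloppy markup: no doctype, unclosed <p>, stray </div>.
$html = '<p>Lorem ipsum<script type="text/javascript">alert("Foo Bar");</script></div>';
print_r(script_sources($html));
```

This only hides the parser warnings; how the parser repairs the broken markup is up to libxml, so the resulting tree may differ from what a browser would build.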

Once you've extracted the JavaScript code, you need to parse it, which
is a different problem. There seem to be some third party libraries but
I haven't used them myself: /

Of course, if the code you need to match is not very complex, a simple
regexp might do the trick.
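
For the simple case, a naive sketch of "is my snippet still there and not commented out" could look like the following. All three helper names are hypothetical, and the comment stripping is deliberately crude: it does not understand string literals, so a `//` inside a string (for example in a URL) would be cut as if it were a comment.

```php
<?php
// Hypothetical helper: naive removal of JavaScript comments.
function strip_js_comments(string $js): string
{
    // Remove /* ... */ blocks first, then // ... up to the end of the line.
    $js = preg_replace('#/\*.*?\*/#s', '', $js);
    return preg_replace('#//[^\n]*#', '', $js);
}

// Hypothetical helper: drop all whitespace so that reformatting the
// code does not count as a modification.
function normalize_js(string $js): string
{
    return preg_replace('/\s+/', '', $js);
}

// Hypothetical helper: true if $snippet appears in $script outside comments.
function snippet_is_active(string $script, string $snippet): bool
{
    $haystack = normalize_js(strip_js_comments($script));
    $needle = normalize_js($snippet);
    // strpos() returns 0 for a match at the start, so compare against false.
    return $needle !== '' && strpos($haystack, $needle) !== false;
}

// Made-up script bodies, as they might come out of the extraction step.
$expected = "track('abc');";
foreach (["track( 'abc' );", "// track('abc');", "/* track('abc'); */ other();"] as $script) {
    echo snippet_is_active($script, $expected) ? "active" : "missing or commented out", "\n";
}
```

Anything beyond this, such as comment markers inside strings or regular expression literals, needs a real JavaScript tokenizer rather than regexps.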

-- - Álvaro G. Vicario - Burgos, Spain
