Syntax Highlighting using XML and regex

Hello all-

  I'm developing content management software for my own site, and may
package and deploy it to other sites (for friends, family, etc.).
The software combines blog, photo, and site management tools.
One of the tools I would find INCREDIBLY helpful
is the ability to block out and syntax highlight code.  I would like to
be able to, as I'm writing a blog post about something I've done in
Python (for example), just stop and type something like:

def foo():



  Do you get the idea?  The only code I've been writing recently is
PHP, Python, and C++, so I could start with the regex for those
languages first and then expand.  The blocks of code would then get
some style info (like a fixed-width font, etc.) and
certain terms would be highlighted (a regex nightmare).

  I've talked this over with another PHP dev, and he suggested I use
the xml_parser_create function to generate a parser object, and then
the xml parsing functions to find these tidbits of code.  This sounds
like a great idea to me.  However, does anyone have suggestions on a
better way to do this, or an efficient way to do the syntax highlighting?

  I'm assuming for syntax highlighting, I'd have to do a massive
preg_replace with the regex, and then add in the HTML for items I
wanted to highlight.  I don't mind doing this, but I was also wondering
if someone had a great resource for syntax highlighting regex that
someone else may have created that I could use, and possibly modify.
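
  For illustration, here is a rough sketch of the regex approach. It is
written in Python rather than PHP, with re.finditer standing in for
preg_replace. The keyword list, the CSS class names, and the highlight()
helper are all made up for the example. It uses one combined pattern
instead of a series of separate replacements, so that a keyword inside a
string or a comment is not highlighted a second time.

```python
import html
import re

# Hypothetical, deliberately tiny keyword list; a real one would be much longer.
KEYWORDS = ("def", "return", "if", "else", "for", "while", "import")

# One combined pattern. Alternatives are tried left to right at each position,
# so a keyword inside a comment or string is consumed as part of that token.
TOKEN_RE = re.compile(
    r"(?P<comment>#[^\n]*)"
    r'|(?P<string>"(?:\\.|[^"\\\n])*"'
    r"|'(?:\\.|[^'\\\n])*')"
    r"|(?P<keyword>\b(?:" + "|".join(KEYWORDS) + r")\b)"
)


def highlight(source):
    """Return HTML for source with comments, strings and keywords wrapped in spans."""
    parts = []
    pos = 0
    for match in TOKEN_RE.finditer(source):
        # Plain text between tokens is only HTML-escaped.
        parts.append(html.escape(source[pos:match.start()], quote=False))
        # match.lastgroup is the name of the alternative that matched.
        parts.append(
            '<span class="%s">%s</span>'
            % (match.lastgroup, html.escape(match.group(), quote=False))
        )
        pos = match.end()
    parts.append(html.escape(source[pos:], quote=False))
    # The fixed-width font would come from a stylesheet rule on pre.code.
    return '<pre class="code">' + "".join(parts) + "</pre>"


if __name__ == "__main__":
    print(highlight('def foo():\n    return "if"  # not a keyword here\n'))
```

  Each language would need its own pattern and keyword list, which is
where the "regex nightmare" comes in.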

  Licensing for this software would be the GPL, so obviously everything
else should be GPL.  I thought I'd bounce the idea against some other
PHP developers, and see what they thought of it.

Thanks in advance

Re: Syntax Highlighting using XML and regex

rockstar_ wrote:

Does highlight_file() / highlight_string() suit your purposes? (See the PHP
manual for more info.)


Re: Syntax Highlighting using XML and regex

 What happens when what's inside the tags isn't valid XML?
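
 To make the problem concrete, here is a small sketch in Python (not PHP),
using the standard xml.etree.ElementTree parser in place of
xml_parser_create. The <post> and <code> tag names, the lang attribute, and
the extract_code() helper are assumptions for the example, not anything
from the original post. A bare "<" in the embedded snippet is enough to make
the whole post fail to parse, unless the snippet is escaped (or wrapped in
CDATA) first.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical post markup: <post>, <code> and lang are made-up names.
SNIPPET = "if a < b and c: pass"
RAW_POST = '<post>Some text <code lang="python">%s</code></post>' % SNIPPET


def extract_code(post_xml):
    """Return (lang, source) pairs for each <code> element, or None if parsing fails."""
    try:
        root = ET.fromstring(post_xml)
    except ET.ParseError:
        # The post is not well-formed XML, so nothing can be extracted.
        return None
    return [(el.get("lang"), el.text or "") for el in root.iter("code")]


# The bare "<" in the snippet makes the whole post ill-formed.
broken = extract_code(RAW_POST)

# Escaping the snippet before embedding it lets the parser accept the post,
# and the parser hands back the original, unescaped source text.
SAFE_POST = '<post>Some text <code lang="python">%s</code></post>' % escape(SNIPPET)
fixed = extract_code(SAFE_POST)

if __name__ == "__main__":
    print(broken)
    print(fixed)
```

 So the author would either have to escape the code while typing the post,
or the software would have to find the code blocks some other way before
any XML parsing happens.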

 As for syntax highlighting, I use GeSHi - it's not the fastest, but it
supports a very wide range of languages without having to write it all again. (GPL license)

