regexp question

I have a regular expression to extract an HTML link from a page:

href=([\"']?)([^>\1]*\.html)\1(?: [^>]*)?>

After "href=" it looks for an optional quote and then for a run of characters
that are neither that quote nor the closing ">".

The problematic part is [^>\1]*. It should exclude the quote character captured
by \1, but somehow that doesn't work. Maybe \1 is not allowed inside brackets?
I would like some advice on how to handle this.


Re: regexp question

Wim Roffal wrote:

[ ... ]

Use an HTML parser.
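
As a rough illustration of the parser route, here is a minimal sketch using Python's standard html.parser module. The thread does not say which language is in use, so Python is an assumption here, and the names HtmlLinkCollector and extract_html_links are made up for this example. It also assumes that only <a> tags are of interest.

```python
from html.parser import HTMLParser


class HtmlLinkCollector(HTMLParser):
    """Collect href values ending in .html from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # removed the surrounding quotes, whichever kind was used.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith(".html"):
                self.links.append(value)


def extract_html_links(page):
    """Return all .html link targets found in the given HTML text."""
    collector = HtmlLinkCollector()
    collector.feed(page)
    collector.close()
    return collector.links
```

The parser deals with double-quoted, single-quoted and unquoted attribute values itself, so the quoting problem from the original pattern does not come up.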

[ ... ]

Back references aren't recognised in character classes.
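
If a regular expression is still wanted, one common workaround is to move the back reference out of the brackets and into a negative lookahead, so that each character is checked against \1 before it is consumed. The sketch below assumes Python's re module (where a numeric escape such as \1 inside brackets is read as a plain character, not a back reference); other engines with lookahead support should accept a similar pattern, but that is not verified here. LINK_RE and find_html_links are names invented for this example. The unquoted case gets its own branch, because an empty \1 would make the lookahead reject every character.

```python
import re

LINK_RE = re.compile(
    r"""href=
        (?:
            (["'])                     # group 1: the opening quote
            ((?:(?!\1)[^>])*\.html)    # group 2: chars that are neither that quote nor '>'
            \1                         # the same quote closes the value
          |
            ([^\s"'>]*\.html)          # group 3: unquoted value
        )
        (?:[ ][^>]*)?>                 # optional further attributes, then end of tag
    """,
    re.VERBOSE,
)


def find_html_links(page):
    """Return the .html targets matched by LINK_RE, without the quotes."""
    return [m.group(2) or m.group(3) for m in LINK_RE.finditer(page)]
```

Because only the quote that actually opened the value is excluded, a value such as "it's.html" in double quotes is still matched in full.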

[ ... ]


