regexp to sql wildcard conversion

Do you have a question? Post it now! No Registration Necessary.  Now with pictures!

Threaded View

I have a regular expression to filter results from an sql query.
It would help improve the overall efficiency if the result set could
be reduced by  adding constraints to the query based on the regular
For example given the expression /^a (fatal|critical) error has
add "like 'a % error has occurred'" to the sql.
Are there any modules [1] or techniques which can derive an sql
wildcarded string from a regular expression? The regular expression
will still be applied to the results so it doesn't matter that the
wildcard string will match more.

The context is searching a historical database of error messages so
fixed strings with a few embedded variables are the norm.

Apologies if this message appears twice - I first posted it via an
internal news server last week but it doesn't appear to have got out
into the wider world.


Neil Shadrach

[1] I have looked on CPAN but didn't find anything

Re: regexp to sql wildcard conversion (Neil Shadrach) wrote:

Quoted text here. Click to load it

Interestingly (as I often say...) I've got some code at work that could
do with this kind of "best effort" regex->SQL conversion but doesn't yet
do it. I'm assuming your database doesn't support any kind of regexes,
and that it supports the % and _ wildcards.

My idea would be this sort of conversion (pseudocode):

if the first token is a '^'
   remove that token
   sql = '%'
end unless

if the last token is a '$'
   remove that token
   sqlend = '%'
end unless

foreach token in the regex
   if the token is a constant character
      sql .= escaped form of that character
   else if the token represents exactly one unknown character
      sql .= '_'
      sql .= '%'
   end if
end foreach
sql .= sqlend

Now, I haven't tested that at all but it looks like it would do the
right thing. The tricky bit is turning the regex into a string of tokens
- maybe this would be of use there:
(although you might not need to walk the whole regex tree, just to a
depth of one level, to determine what to do to each token?)


pkent 77 at yahoo dot, er... what's the last bit, oh yes, com
Remove the tea to reply

Site Timeline