searching rows containig a substring fast

Do you have a question? Post it now! No Registration Necessary.  Now with pictures!

Threaded View
I am creating a server indexing files in my local area network, in
order to provide a searching feature.
So i want to make it possible to searchsuch rows where the 'name'
(VARCHAR) column contains a substring submitted by user.
an i use a full-text index to make it faster? from my first sight i
noticed that this index can be used to search for words in a string. So

the example below would not go:
     searching 'cool' in 'uncoolness'
Am i right? Are there any other solutions? Maby some other databases
provide this use case?

Re: searching rows containig a substring fast

If you want to search for text strings that appear within a word, use
the wildcard character %. For example

SELECT * FROM tablename WHERE columname LIKE "%cool%"

would find records that contain the word cool even if the word has
other characters around it. The wildcard character % seems to be
equivalent to .* in regular expressions - it matches zero or more
instances of any possible character. You will (naturally) need to
replace 'tablename' and columnname with the name of the table and
column that you want to search.


Re: searching rows containig a substring fast

That's a toughie.  A regular index won't let you to substring searches
(you have to start at the beginning), so it won't work.

If you're using InnoDB, a fulltext won't work either It's a MyISAM

If you decide to use a FULLTEXT index, make sure that you have enough
rows in your table (at least 3) - and keep in mind that unless you do a
boolean search the 50% rule will apply (which means if someone searches
for names containing the letters 'an' and 'an' appears in more than 50%
of the rows, they'll get no results)

If you're doing a whole lot of this and are very interested in
performance, I'd recommend you look at lucene - it's performance is
much better than MySQL's fulltext.

Also, be aware that by default FULLTEXT will only index strings that
are longer than 3 characters.  You might want/need to change that if
you are indexing names (since you want "Tom", "Ed", and "Bo" to be
indexed - or whatever 3 character file names that you have).

Site Timeline