Advanced Select Problem - Please Help

I hope someone can help me with this one.  For performance reasons, I have a
denormalized database.  There are two tables in the database; we can call
them table a and table b.  Both of these tables contain columns that are
text with comma-separated values.


table a:
id - integer - primary key
column1 - text
column2 - text
column3 - text
column4 - text

id: 1
column1 : vanilla, chocolate chip, black raspberry
column2 : palm handheld
column3 : oldsmobile alero, chevy corvette, toyota camery
column4 : new jersey, nevada, wyoming, texas, new mexico

id: 2
column1 : vanilla, strawberry
column2 : dell laptop, palm handheld
column3 : toyota camery
column4 : ohio, alaska, hawaii

table b:
id - integer - primary key
column1 - text
column2 - text
column3 - text
column4 - text

id: 15
column1 : vanilla, chocolate chip, black raspberry, butter pecan
column2 : dell laptop, hp desktop, palm handheld
column3 : honda civic, chevy corvette, toyota camery
column4 : texas, new mexico, florida

What I need is to create a select statement that returns the ids from table
a that have at least one matching data value in every column of table b.

The above example would return the id of 1 from table a because:

column1 matches on: vanilla and chocolate chip and black raspberry
column2 matches on: palm handheld
column3 matches on: chevy corvette and toyota camery
column4 matches on: texas and new mexico

id 2 would not be returned because there are no matches on column4.
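
To pin down the matching rule described above, here is a minimal Python sketch of it, not a MySQL solution. It assumes values are compared as exact strings after trimming whitespace; the helper names `split_values` and `matches_every_column` are made up for illustration.

```python
def split_values(cell):
    """Split a comma-separated cell into a set of trimmed values."""
    return {v.strip() for v in cell.split(",") if v.strip()}


def matches_every_column(row_a, row_b):
    """True if every column of row_b shares at least one value with row_a."""
    return all(split_values(row_a[col]) & split_values(row_b[col]) for col in row_b)


# Sample rows copied from the post (spelling kept exactly as posted).
table_a = {
    1: {"column1": "vanilla, chocolate chip, black raspberry",
        "column2": "palm handheld",
        "column3": "oldsmobile alero, chevy corvette, toyota camery",
        "column4": "new jersey, nevada, wyoming, texas, new mexico"},
    2: {"column1": "vanilla, strawberry",
        "column2": "dell laptop, palm handheld",
        "column3": "toyota camery",
        "column4": "ohio, alaska, hawaii"},
}
row_b = {"column1": "vanilla, chocolate chip, black raspberry, butter pecan",
         "column2": "dell laptop, hp desktop, palm handheld",
         "column3": "honda civic, chevy corvette, toyota camery",
         "column4": "texas, new mexico, florida"}  # table b, id 15

matching_ids = [a_id for a_id, row in table_a.items()
                if matches_every_column(row, row_b)]
print(matching_ids)  # [1]
```

Row 2 drops out only because its column4 set has no overlap with row 15's column4 set, which is the behaviour the query needs to reproduce.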

The problem is that I am running MySQL version 3.23.58, so I can't use
subqueries or boolean MATCH ... AGAINST searches.

Can anyone think of a way to do this without creating a select statement that
has a million LIKE '%___%' conditions?


Re: Advanced Select Problem - Please Help

> ... a denormalized database. ...

Please clarify this.

If the database is deliberately denormalized, there are only 2 reasons I can
think of.  Either -
  A) You don't understand database normalization.
  B) You want the system to perform as poorly as possible.

If neither applies, then you should certainly consider abandoning the use of
a relational database system like MySQL and opt, instead, for a system
designed from the ground up to work with "denormalized" databases.


If it is B -  I can show you some really bad queries consistent with your
stated design goal.
Thomas Bartkus

Re: Advanced Select Problem - Please Help

When I say denormalized - I mean it is not normalized out to 3rd normal form.

Re: Advanced Select Problem - Please Help

On 14/09/2005, Mark wrote:


It isn't even in 1NF.


Re: Advanced Select Problem - Please Help

MJunium wrote:

How did you measure the performance that leads you to believe that
storing it in this non-normalized form has a benefit?  How much of a

What I'm getting at is that many software engineers _assume_ that
they'll get a performance benefit by doing something, without actually
measuring it to see if that assumption is true, or how much of a
benefit/penalty there is relative to other techniques.  Statements like
"it's obvious that..." or "it stands to reason that..." are the same as
making an assumption, unless you have performance measurements to
support the statement.

So if you get a 0.0001% performance gain, but it takes 2000% more code
complexity to achieve, is that worth the tradeoff?  If not, bring those
two values closer together.  At what threshold is it worth the tradeoff?
Does your given case fall within this threshold?

The correct way to do the problem you're describing is to normalize
those values.  Define a separate table for each of column1...column4.

For instance:
CREATE TABLE flavors (
   a_id integer references table_a(id),
   flavor varchar(64),
   primary key (a_id, flavor)
);

Do the same for table b.
Now you can find the matches as follows:

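SELECT DISTINCT a.id
FROM table_a AS a
  INNER JOIN flavors AS f ON a.id = f.a_id
  INNER JOIN computers AS c ON a.id = c.a_id
  INNER JOIN table_b AS b
  INNER JOIN b_flavors AS bf ON (b.id = bf.b_id AND f.flavor = bf.flavor)
  -- the column name "computer" is assumed by analogy with flavors.flavor
  INNER JOIN b_computers AS bc ON (b.id = bc.b_id AND c.computer = bc.computer)
WHERE b.id = 15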

I predict that a join like this actually performs much better than the
huge expressions that would be required by the denormalized design.
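
To check the join approach against the sample rows from the original post, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for MySQL. It has not been run on MySQL 3.23.58, but the query uses only inner joins and no subqueries. The table names `cars` and `states` and the column name `item` are assumptions made for illustration; only `flavors` and `computers` come from the post above.

```python
import sqlite3

# Sample data from the original post, one list per former comma-separated column.
TABLE_A = {
    1: {"flavors": ["vanilla", "chocolate chip", "black raspberry"],
        "computers": ["palm handheld"],
        "cars": ["oldsmobile alero", "chevy corvette", "toyota camery"],
        "states": ["new jersey", "nevada", "wyoming", "texas", "new mexico"]},
    2: {"flavors": ["vanilla", "strawberry"],
        "computers": ["dell laptop", "palm handheld"],
        "cars": ["toyota camery"],
        "states": ["ohio", "alaska", "hawaii"]},
}
TABLE_B = {
    15: {"flavors": ["vanilla", "chocolate chip", "black raspberry", "butter pecan"],
         "computers": ["dell laptop", "hp desktop", "palm handheld"],
         "cars": ["honda civic", "chevy corvette", "toyota camery"],
         "states": ["texas", "new mexico", "florida"]},
}
ATTRS = ["flavors", "computers", "cars", "states"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE table_b (id INTEGER PRIMARY KEY)")
for attr in ATTRS:
    # One child table per attribute and per parent table, one row per value.
    conn.execute(f"CREATE TABLE {attr} (a_id INTEGER, item TEXT, PRIMARY KEY (a_id, item))")
    conn.execute(f"CREATE TABLE b_{attr} (b_id INTEGER, item TEXT, PRIMARY KEY (b_id, item))")

for table, prefix, rows in (("table_a", "", TABLE_A), ("table_b", "b_", TABLE_B)):
    for row_id, cols in rows.items():
        conn.execute(f"INSERT INTO {table} (id) VALUES (?)", (row_id,))
        for attr, values in cols.items():
            conn.executemany(f"INSERT INTO {prefix}{attr} VALUES (?, ?)",
                             [(row_id, v) for v in values])

# Two joins per attribute: a's child table, then b's child table matched on the value.
joins = "\n".join(
    f"  INNER JOIN {attr} AS a{i} ON a.id = a{i}.a_id\n"
    f"  INNER JOIN b_{attr} AS b{i} ON b.id = b{i}.b_id AND a{i}.item = b{i}.item"
    for i, attr in enumerate(ATTRS)
)
query = (f"SELECT DISTINCT a.id FROM table_a AS a CROSS JOIN table_b AS b\n"
         f"{joins}\nWHERE b.id = ?")

matching_ids = [row[0] for row in conn.execute(query, (15,))]
print(matching_ids)  # [1]
```

Because every join is an inner join, a row of table a survives only if each attribute has at least one value in common with row 15 of table b. DISTINCT then collapses the multiple matching combinations into a single id.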

Bill K.

Re: Advanced Select Problem - Please Help

You hit the nail on the head with the code vs performance gain.  The
reason this was not normalized is that there are more like 15 different
columns not just the 4.  The query time to do all the table joins was
