Speed comparison of SELECT statements

Do you have a question? Post it now! No Registration Necessary.  Now with pictures!

Threaded View

I've searched all over for this and not found that much useful
information for what I think to be a fairly obvious issue.

I am having to do some benchmark tests on a SELECT statement and
whether it will run faster as a massive JOIN or as a series of smaller

FROM tbl_1
LEFT OUTER JOIN tbl_2 ON tbl_1.ID=tbl_2.ID
LEFT OUTER JOIN tbl_3 ON tbl_1.ID=tbl_3.ID
LEFT OUTER JOIN tbl_4 ON tbl_1.ID=tbl_4.ID
LEFT OUTER JOIN tbl_5 ON tbl_1.ID=tbl_5.ID
 WHERE tbl_1.ID IN (1,2,3,4,5...)


SELECT * FROM tbl_1 WHERE ID IN (1,2,3,4,5...)
SELECT * FROM tbl_2 WHERE ID IN (1,2,3,4,5...)
SELECT * FROM tbl_3 WHERE ID IN (1,2,3,4,5...)
SELECT * FROM tbl_4 WHERE ID IN (1,2,3,4,5...)
SELECT * FROM tbl_5 WHERE ID IN (1,2,3,4,5...)

I've written a script that performs one or the other of the statements,
logs the time and outputs the results.

The problem is that the more you run the tests the faster they get

I've tried using FLUSH TABLES and setting the query cache to 0 but
nothing seems to work, the query times keep dropping.

Does any one have any advice?



Re: Speed comparison of SELECT statements

cedrik wrote:
Quoted text here. Click to load it

Try repeatedly running one of the tests until there is no
further improvement in the time.  Then repeatedly run the other
test until there is no improvement in time.

You will then know the optimum performance that you can expect
for the hardware, os, and database configuration you are using.
  If any of these change you will have to repeat the procedure
since your results may change from release to release.

The catch is that in a production environment you will probably
see less than optimum results.  To compensate you may find it
more realistic to take the initial test time of each of your



Re: Speed comparison of SELECT statements


there might be some other buffers enables, e.g. the key_buffer, which stores
the indexes in memory, or the join buffer or sort buffer a.s.o. This makes
it plausible that the second run will be faster than the first run, but
afterwards it shouldn't get any faster anymore.

However, the 2 queries (if we count the set of queries in the second example
as one) are not absolutely equivalent, so it would of course depend on, what
you really want to do. If the first query is what you want to get, and the
indexes are set correctly (probably primary keys on the ID column), the
first query will probably be the better choice (less roundtrips to the
server, and assuming that indexing is right, it shouldn't (or should hardly)
require more reads than the second example). If you also consider, that the
second query (the set of queries) will probably require some additional work
on the client side, I believe that the first query will give you better


Re: Speed comparison of SELECT statements

Thank you for both your suggestions.

You are right, the JOIN should be the faster but so far doesn't
actually seem to be. In terms of client side work both are equivalent.
The search itself is for a bookstore so all the joins are on the
author, subject, language, etc. tables, each one is liekly to bring
back a couple of rows per book from the primary table. Once the
information is retrieved storing it in arrays and displaying are
virtually identical.

I am getting results from the live server in the region of 5 - 10
seconds for the JOIN but am unable to replicate this under test
conditions (running on the same server I hasten to add).

It seems (after running several thousand iterations in a number of
orders) that the only reliable way I can get any results that are
slightly above average is to wait a couple of hours between tests (this
isn't realistic).

Oh well, the search continues.


Site Timeline