10 SQL tips to speed up your database

by Jean

On most websites, content is stored in a database and served to visitors upon request. Databases are very fast, but there are many things you can do to enhance their speed and make sure you won’t waste any server resources. In this article, I have compiled 10 very useful tips to optimize and speed up your website database.

Design your database with caution

This first tip may seem obvious, but the fact is that most database problems come from badly designed table structures.
For example, I have seen people store information such as client info and payment info in the same database column. For both the database system and the developers who will have to work with it, this is not a good thing.
When creating a database, always split information across separate tables, use clear naming conventions and make use of primary keys.
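
As a minimal sketch (the table and column names below are hypothetical, not taken from the article), splitting client and payment information into separate tables, each with its own primary key, could look like this:

CREATE TABLE clients (
    client_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name      VARCHAR(100) NOT NULL,
    email     VARCHAR(100) NOT NULL,
    PRIMARY KEY (client_id)
);

CREATE TABLE payments (
    payment_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    client_id  INT UNSIGNED NOT NULL, -- refers back to clients.client_id
    amount     DECIMAL(10,2) NOT NULL,
    paid_at    DATETIME NOT NULL,
    PRIMARY KEY (payment_id)
);
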
Source: http://www.simple-talk.com/sql/database-administration/ten-common-database-design-mistakes/

Know what you should optimize

If you want to optimize a specific query, it is extremely useful to be able to get an in-depth look at how the database runs it. Using the EXPLAIN statement, you will get lots of useful information about how MySQL executes a specific query, as shown in the example below:

EXPLAIN SELECT * FROM ref_table,other_table WHERE ref_table.key_column=other_table.column;
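
A rough guide to reading the output (this interpretation is mine, not part of the original article): the type, possible_keys, key and rows columns tell you whether MySQL can use an index and how many rows it expects to examine. A type of ALL with key set to NULL usually means a full table scan, and an index on the join column is the usual fix:

-- A hedged sketch (the index name is mine); the backticks are needed only because
-- "column" happens to be a reserved word in MySQL.
CREATE INDEX idx_key_column ON other_table (`column`);

Re-running the EXPLAIN above should then show the new index under key and a much smaller rows estimate.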

Source: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html

The fastest query… is the one you don’t send

Each time you send a query to the database, you use a bit of your server’s resources. This is why, on high-traffic sites, the best thing you can do to speed up your database is to cache queries.

There are many ways to implement a query cache on your server. Here are a few, followed by a small sketch of MySQL’s built-in query cache:

  • AdoDB: AdoDB is a database abstraction library for PHP. It allows you to use the database system of your choice (MySQL, PostgreSQL, Interbase, and many more) and it is designed for speed. AdoDB provides a simple, yet powerful caching system. Last but not least, AdoDB is released under a BSD license, which means you can use it freely in your projects. An LGPL license is also available for commercial projects.
  • Memcached: Memcached is a distributed memory caching system which is often used to speed up dynamic database-driven websites by alleviating database load.
  • CSQL Cache: CSQL Cache is an open-source data caching infrastructure. I have never tested it personally, but it seems to be a great tool.
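
MySQL itself also ships with a built-in query cache (readers mention it in the comments below). Here is a minimal sketch, assuming MySQL 5.x with query caching enabled in my.ini / my.cnf; the SQL_CACHE hint and the query_cache variables are standard MySQL features, but the exact settings are up to you:

-- Check whether the query cache is enabled and how much memory it has:
SHOW VARIABLES LIKE 'query_cache%';

-- Ask MySQL to keep this result in the query cache (use SQL_NO_CACHE to bypass it):
SELECT SQL_CACHE title, excerpt, author FROM wp_posts LIMIT 10;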

Don’t select what you don’t need

A very common way to get the desired data is to use the * symbol, which will fetch every field from the table:

SELECT * FROM wp_posts;

Instead, you should definitely select only the desired fields as shown in the example below. On a very small site with, let’s say, one visitor per minute, that wouldn’t make a difference. But on a site such as Cats Who Code, it saves a lot of work for the database.

SELECT title, excerpt, author FROM wp_posts;

Use LIMIT

It’s very common to need only a specific number of records from your database, for example a blog that shows ten entries per page. In that case, you should definitely use the LIMIT clause, which restricts the result set to the desired number of records.
Without LIMIT, if your table has 100,000 records, you’ll retrieve them all, which is unnecessary work for your server.

SELECT title, excerpt, author FROM wp_posts LIMIT 10;
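
For the ten-entries-per-page blog example, LIMIT can also paginate: a second value tells MySQL how many rows to skip before returning results. A small sketch, reusing the wp_posts query from above:

-- Page 2 of a blog showing ten entries per page: skip the first 10 rows, return the next 10
-- (LIMIT 10 OFFSET 10 is equivalent to MySQL's LIMIT 10, 10 syntax)
SELECT title, excerpt, author FROM wp_posts LIMIT 10 OFFSET 10;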

Avoid queries in loops

When using SQL along with a programming language such as PHP, it can be tempting to use SQL queries inside a loop. But doing so is like hammering your database with queries.
This example illustrates the whole “queries in loops” problem:

foreach ($display_order as $id => $ordinal) {
    // one UPDATE query per array entry: N rows means N round-trips to the database
    $sql = "UPDATE categories SET display_order = $ordinal WHERE id = $id";
    mysql_query($sql);
}

Here is what you should do instead:

UPDATE categories
    SET display_order = CASE id
        WHEN 1 THEN 3
        WHEN 2 THEN 4
        WHEN 3 THEN 5
    END
WHERE id IN (1,2,3)

Source: http://www.karlrixon.co.uk/articles/sql/update-multiple-rows-with-different-values-and-a-single-sql-query/

Use join instead of subqueries

As a programmer, you can be tempted to use and abuse subqueries. Subqueries, as shown below, can be very useful:

SELECT a.id,
    (SELECT MAX(created)
     FROM posts
     WHERE author_id = a.id) AS latest_post
FROM authors a

Although subqueries are useful, they can often be replaced by a join, which is usually faster to execute.

SELECT a.id, MAX(p.created) AS latest_post
FROM authors a
INNER JOIN posts p
    ON (a.id = p.author_id)
GROUP BY a.id

Source: http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/

Be careful when using wildcards

Wildcards are very useful because they can substitute for one or more characters when searching for data in a database. I’m not saying that you shouldn’t use them, but you should use them with caution and avoid the full wildcard when a postfix or prefix wildcard can do the same job; keep in mind that only the postfix form ('hello%') can take advantage of an index.
In fact, doing a full wildcard search on a million records will certainly kill your database.

#Full wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE '%hello%';
#Postfix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE 'hello%';
#Prefix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE '%hello';

Source: http://hungred.com/useful-information/ways-optimize-sql-queries/

Use UNION instead of OR

The following example uses the OR operator to get the result:

SELECT * FROM a, b WHERE a.p = b.q OR a.x = b.y;

The UNION statement allows you to combine the result sets of two or more SELECT queries. The following example will return the same result as the query above, but it may run faster:

SELECT * FROM a, b WHERE a.p = b.q
UNION
SELECT * FROM a, b WHERE a.x = b.y

Source: http://www.bcarter.com/optimsql.htm

Use indexes

Database indexes are similar to the ones you can find in libraries: they allow the database to find the requested information faster, just like a library index lets a reader find what they’re looking for without losing time.
An index can be created on a single column or a combination of columns in a database table. A table index is a database structure that arranges the values of one or more columns of a database table in a specific order.

The following query will create an index on the Model column from the Product table. The index is called idxModel:

CREATE INDEX idxModel ON Product (Model);
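
Since an index can also cover a combination of columns, here is a hedged sketch of a composite index, reusing the posts table from the join example earlier (the index name is mine):

-- Composite index on (author_id, created): useful for queries that filter on author_id
-- and then sort or aggregate on created, like the latest_post join shown earlier.
CREATE INDEX idxAuthorCreated ON posts (author_id, created);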

Source: http://www.sql-tutorial.com/sql-indexes-sql-tutorial/


Comments (60)

  1. Bart Jacobs said:

    This article is very useful for anyone working with mySQL (and with databases in general). For high traffic websites especially, database queries can be the bottleneck (and often are). Thanks for this great post!

  2. Metric Stormtrooper said:

    your table type can also make a lot of difference. MyISAM is better if you expect a lot of reads, InnoDB if you need more writes in a table (see also http://www.softwareprojects.com/resources/programming/t-mysql-storage-engines-1470.html ).

    And don’t forget to enable the query cache in your my.ini :)

      • Dirk Lang said:

        But as always, think before you add lots of indexes to your tables – put them only on columns where you intend to search. Using too many indexes can slow down the database, too. (I know, it is tricky)

        • Sebastiaan Stok said:

          And don’t create indexes on tables that are too small.
          Unless you have more than 30 records, this will only slow things down.

          And most important:
          Monitor changes, and optimize when things go wrong.

          “your table type can also make a lot of difference. MyISAM is better if you expect a lot of reads, InnoDB if you need more writes in a table”
          Choosing an engine type based on speed is a bad thing. I personally work with PostgreSQL so this is not an issue for me :)
          If you prefer speed over consistency of the data, you’re doing it wrong!

  3. Anonymous said:

    One thing anyone thinking about optimising their queries should bear in mind is not to do it straight away. Get your code working and then go back to it and improve if necessary.

    en.wikipedia.org/wiki/Program_optimization#When_to_optimize

    Also, you missed the word ‘may’ when you mentioned UNION. Your source states “may run faster than”, not “will be faster”. I’ve found that OR is faster than UNION and a quick google on the subject confirms this:

    mysqlperformanceblog.com/2007/10/05/union-vs-union-all-performance/

    Your source, bcarter.com/optimsql.htm was apparently last updated in 1996 and I’d like to think MySQL had changed quite a bit since then.

    If anyone’s looking into speeding up their MySQL stuff then mysqlperformanceblog.com is highly recommended, the English isn’t always great though and the navigation sucks but give it a try.

    To expand on what Dirk so rightly said, Indexes are great for tables where there will be a lot of read (SELECT) operations. But if you’re often writing to a table then the Indexes can slow it down because of course the index must be updated.

    Could you please consider an update Jean, which reflects what Dirk and I have mentioned?

    • Jean-Baptiste Jung said:

      You’re right about the bcarter source: I didn’t check the “last update” date when writing the article, sorry for this.

      I rarely update posts, but I’ll rewrite some parts this week, according to my busy schedule.

      Thanks a lot for your interest!

  4. Timothy said:

    Also, here are a few more details that could help:

    1) Be wise with column types (VarChar, Blob, Clob / Long Text, etc). When using something like int or varchar with a set length, the SQL server will know what to expect when running queries. If you use something like Clob / Long Text, which does not have a set limit, then the server will not know what to expect, so the queries have no way of being optimized. Basically, it means that every query that involves a column with such a type will be slow. Only use these types if you need to.

    2) Using trims or other string modifiers will greatly slow down your queries. The server will have to do this on each row. Pretty logical when you think about it. Any time you are modifying a string or other variable type within a query you are adding a great deal of overhead. If you can, put these kind of things on the server script side (PHP, Ruby, etc).

    3) If you have two tables, with 1 to 1 for every (key word there) row, then why not use just one table? Don’t make more tables than you need. Just because you’ll have more columns doesn’t mean you will have to SELECT them all.

    4) If you can, avoid putting certain things in a table. Like images. I often find out that people store certain images in a table as BLOBs and a column for the image mimetype. Then they query the image and use something like PHP to spit it out onto the page. The only time I’ve really seen this to be necessary is with CAPTCHA images. Where the image is stored as a BLOB alongside a hash representing the text in the image. Though, you could probably put the images in a folder with random filenames and have a table matching filenames to hash values. But that can open up some security risks (like black-hats creating their own look-up table so that they can produce scripts).

    • Daniel said:

      in reply to your tip 3:

      i disagree. sometimes it can make sense to split up a table into two, because searching in very wide (lots of columns) tables also has a performance impact. splitting up tables is pretty useful if you only search on a couple of fields in this table, and the majority of the other fields is just “additional info” such as timestamps, user_ip, file locations etc. you never consider when looking up records.
      this also means that it is easier to keep your “index-table” free from variable length data types as described in tip 1 :)

  5. Tomasz Kowalczyk said:

    Very good list, I would also add using ORMs such as Doctrine / Propel [included in frameworks such as symfony] that cache everything – no need to remember these things.

    • Sebastiaan Stok said:

      NEVER EVER USE ORM!!!
      That is performance killer number one.

      There are too many bad things about Object Relational Mapping.
      * You can’t, you just can’t optimize the query (no really)
      * Having an object for every row is heavy on memory usage
      * I even heard about an option to buffer query execution, to do it all in one go. That is not only stupid, but it is also very dangerous! I use transactions a lot and using an ORM system will break this.

      Someone I know decided on using an ORM, and is not very happy with it.
      But he can’t switch since that will break the code.

      And something else: ORM will make you break with almost every design principle and pattern there is.

      Just use plain old SQL, it is much better and more flexible.

  6. CoryMathews said:

    “Use join instead of sub queries”

    It depends on whether the subquery is in the SELECT, FROM or WHERE clause. So this advice, while normally true for the SELECT list, is not always true for the FROM or WHERE clause, and is therefore not universally correct.

    “Use UNION instead of OR”

    His benchmark says that, but every DB I have worked with says otherwise.

  7. James Edinburgh said:

    Thanks for this information.

    I’m just getting started in the world of SQL and databases and this is some solid advice. I’ve been playing around with WordPress and other PHP based CMS systems and I really really want to get started and build my own system. I’ll be sure to put these tips to work when I do.

  8. Harsh Athalye said:

    Good SQL tips. Optimizing queries is such a huge topic in itself, that a book can be written around it. But basically tips mentioned here are good enough. I would add some more:

    1. Choose columns wisely when creating indexes. Creating indexes on sizable data types such as char or varchar can kill performance
    2. If you have indexes on columns, try to avoid wrapping a column inside a function
    3. Convert IN clauses to joins
    4. Set-based queries definitely beat procedural approaches like cursors in terms of performance

  9. Brian said:

    Great tips there, I’m surprised you’ve not mentioned the MySQL query cache, qcache. This can dramatically speed up MySQL, especially on high traffic sites.

    I must admit I’d not given much thought to design. I lumped all the fields in one table, which was fine when the site had 100 entries, but now that it has over 3,000, things are really starting to suffer and I’m having to strip out the database and organise it.

    It goes to show that a little bit of haste can really turn into a lot of waste later on.

  10. LISA TORRES said:

    The server will have to do this on each row. Pretty logical when you think about it. Any time you are modifying a string or other variable type within a query you are adding a great deal of overhead.

  11. Jack colt said:

    Thanks for the tips. Most of my websites are really slow in loading. I tried using WP super cache without much success. It appears now that the culprit lies in my design and database.

  12. Bob said:

    “I tried using WP super cache without much success.”

    It is not working for me either. Actually, while it is supposed to speed up my blog, it is actually slowing the blog down :( I would try using the database optimization tips laid out in this article. :)

  13. Edward said:

    This post is very useful, thank you.
    I also heard about something called stored queries, so you just call it by its name when you do the query, for example “getnews()”, but I’m a complete newbie!

  14. Jamie said:

    This is indeed very helpful. I appreciate the fact that you are providing screenshots so we would know we are on the same track while following your tutorial. Thanks so much for sharing.

  15. Johnson said:

    Great tips. I’ve been wondering if I could customise my WordPress database, as it is getting too slow as posts accumulate. Is anyone else interested?

  16. Charlie said:

    All good tips, particularly on the use of UNION over OR.

    As far as I know, in most SQL databases, some uses of OR prevent the use of an index. So for large tables, it is faster to scan the indices multiple times rather than searching the entire table once.

  17. Sumanta Sinha said:

    Lately one of my PHP scripts is making way too many SQL queries when it could easily do the job with fewer. The reason: the script developer did not bother to optimize the scripts so they would use the least possible resources. He also stopped supporting the project a year ago. If not for my web host, I would never have known about this. I will try your tips and see if they help at all. Thanks :)

  18. Felipe said:

    Always refreshing to see someone pop open the hood and show how it is done in the age of “just-use-my-cuttie-pretty-web-2.0-framework-and-it-will-take-care-of-everything-for-you-in-a-pinch”. Congrats.

  19. Shawn said:

    I know it’s a bit of a late post but… “Avoid queries in loops”: while I agree with you that you shouldn’t keep sending SQL to a database while looping, I think your example is *way* too simple to be of use. You are essentially hard-coding the case values, which 1. shouldn’t be hard coded and 2. is only manageable for a very small set. I believe a better alternative is to go ahead and use the looping statement and GENERATE the SQL needed. Ex. Update value 1; Update value 2; update value 3, update value N. That is, aggregate the SQL into a string variable, AND THEN send it to the database only once, after it is generated.

  20. davidM said:

    The tip of using the limit on a ‘Select *’ query is good.

    How often do you find you need to get hold of table meta data? Limiting to a single row will speed this up.

    Even better is if you can use a special object / class for getting your meta data; it can then use pre-compiled queries (which are also quicker).

    These things all become more important as your tables grow in size, however only start to think about the need to optimise your queries and tables if you are getting a lot of traffic and it is starting to impact on your performance. A more easily readable, and immediately understandable
    select * from table
    query is better than listing a bunch of column names, and getting lost in a long query.

    This is especially true if you are selecting everything apart from those columns you know to be problematic (i.e. blob, clob columns). Personally I have a method in my code that takes the db meta data and creates a select with specified columns removed, it essentially does

    select * (but not columns ‘Blahone’ ‘Blahtwo’…) from table

    That and stored procedures.

  21. Todd Googins said:

    I have a site that uses a php/mySQL reservation system, and the associated calendars are severely slowing down the page loads. Unfortunately, I am not a coder in any way. Can anyone suggest where I might find someone to hire to optimize/fix the system? Any advice is very much appreciated.

    -Todd
