Database indexes are your friend

I guess I assumed that creating a foreign key constraint would automatically create a non-clustered index.  I mean, a primary key is a clustered index, why wouldn’t a foreign key also create an index?

A batch process today slowed from completing in about 4 hours to an estimated completion time of around 3 days.  Since this was a process that needed to complete daily, well obviously it should complete in less than a day.  First order of business was looking at the profiler to see what was going on.

It was exactly one SELECT statement per transaction, no joins, and one predicate in the WHERE clause, on a foreign key.  And it took an average of 23 seconds to execute.  Yikes.

Next up was checking the execution plan and statistics.  This was on a table that doubled in size in one day, to 7 million rows.  The culprit naturally was a table scan, and the 230K reads showed why the query took so long.  A coworker pointed out that the statistics plan also provided a hint for performance tuning, to add an index.

I really couldn’t believe the foreign key didn’t have an index, but sure enough, it didn’t.  Added the index on the foreign key, and the query time was reduced from 23 seconds to nearly instantaneous, 2-3 orders of magnitude better.

And that 4 hour batch process?  It finished in about 5 minutes.

Yes, database indexes are your friend.  Assuming database indexes are already in place, not your friend.

*sigh*

Related Articles:

Post Footer automatically generated by Add Post Footer Plugin for wordpress.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in Misc. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • Paul S

    Despite your bad experience here, I believe this is still the right way to go about doing things: Indexes should be added only when it’s observed that they are needed. Adding them before is just premature optimisation.

    Good post, thanks.

  • Steve

    Paul,

    Even on a foreign key? I mean, I believe in YAGNI as much as the next guy, but key’s should in all but the most extreme situation be indexed.

  • http://diegofrata.wordpress.com Diego F.

    “I mean, a primary key is a clustered index, why wouldn’t a foreign key also create an index?”

    Simply because they serve different purposes. A primary key defines the identity of a row, so it’s very likely it’ll be used for searches, updates and deletes.

    Foreign key is a normal field with a constraint that creates a reference to a foreign primary key. The foreign primary key is indexed. So if you have a select of A inner join B you will be searching B’s primary key for the value of the foreign key. Why would you need to search A? You don’t, by default.

    Indexes are costly, use it carefuly, they will become your enemies very fast if you don’t use it wisely. :)

  • Noel

    In SQL Server , by default the primary key is created as the clustered index. It doesn’t have to be however and you can cluster by any field you like, even non-unique fields. You can take advantage of this if you are often doing scans on a foreign key (ie all order items for an order). You can cluster dbo.OrderItem on [OrderItem].[OrderID] to keep all the order items together which will optimise this particular lookup.

  • Bill R.

    One can in fact imagine a scenario in which a foreign key might exist but no query will ever be executed against the table either selecting or sorting by that foreign key, thus excusing its absence.

    However, not to do so begs so many questions balanced ONLY against the “cost” or “storage” of building that index that one has to wonder about whether the database engine has been updated in any meaningful way since the 1980′s, when disk storage was thousands of dollars per Megabyte.

    Today, a 5 Terabyte RAID DASD server can be had for < $1,000. You have to be lazy, ignorant, severely thick-headed, or have some incredibly esoteric reason NOT to build the obvious indexes on a table.

  • Arnis L.

    I’ve been there too. Only I had little experience back then and it took awhile to figure it out. Query that couldn’t finish in minutes started to work in milliseconds.

  • Ray

    There is a simple reason why foreign keys should not get the index built automatically – they aren’t unique (usually). Imagine if there are only few possible values for the foreign key and one of them is used 90% of time on multi-million table. The index not only won’t improve the search, it will also create an overhead.