No this isn’t the same thing as pushing tables together in a restaurant (although you should be considerate there as well, since there may be those who do not wish to sit near you). I am thinking about SQL joins and how they can completely wreck your processing time if they are just added willy-nilly.
Keep in mind, I am not a DBA and there are others out there smarter than I when it comes to writing good query, but this is something I noticed when troubleshooting an SSRS report recently. My thought is that if it is happening here, then it is happening somewhere else as well.
The query that had been used to run a particular report was getting results, when asked for items within a date range of 1-1-2013 and 8-6-2013 it did return some 1900 records and the data was even correct. The data was being placed in Excel and manipulated to look usable to others which was not too bad. The problem was, to get 1900 records out was forever, or 30-45 minutes.
So of course I ran the query as is in SQL Server Management Studio and waited it out to see just how long it might take. Then I stared at the query. When the sailboat… wait that’s a Seinfeld episode. After several minutes of scrolling through the query content, searching the Internet, and more cups of coffee to caffeinate the problem, I decided that I needed a push in the right direction. I called a partner company just to get a pointer or two.
Why would a query take so long to return 1900 distinct rows?
In this case it was an unused join. However I didn’t notice until pointed in the direction of “you have a bad join”, then it was almost pointing and laughing it became so noticeable.
Since there were no records in the table being joined, there was no need to include it. I think the query was written and rewritten so many times to change the selected content and didn’t actually fail that nobody thought to remove the join. Once the line was removed, the 1900 results came back in 2 seconds. Much the improvement over 31 minutes from previous attempts.
How can you avoid these fun little issues?
Be very careful when joining tables, in a measure twice/cut once kind of way. If you are using a join and decide to go a different direction in the query, at least comment it out. Doing so will speed up processing because the database will not need to check that join to see if anything needs to be returned that would use it.
I am no SQL pro by any means, but I learn a bit more all the time and keeping things simple (or crammed full of comments when things don’t work quite right) is definitely a good place to start. This one was a head scratcher for me for a while today, but once fixed reminded me to think about the query and the results expected. Since datasets can quickly become large and often times those needing data only need a small amount, working with queries (and parts of queries) in their simplest form will be a huge time and stress saver. Especially for those of us who “also do SQL” and a fair amount of everything else.