Monday, April 21, 2014

SQL Development : Query Tips

Here is my checklist for efficient queries!


SELECT

Always minimize the data set being returned
  • Only SELECT the columns you need
  • Use a WHERE clause to only bring back the rows you need.

Never use SELECT * in production code.
  • This is lazy, NAME the columns you need
  • If you do not need all the columns, you are wasting I/O, network traffic and memory by returning them all.
  • Even if you do currently need them all, if columns to the queried tables are added at a later stage, you’ll return them too.
  • If your query joins tables, then multiple columns of the same name may be returned e.g. id columns.
  • SELECT * will tell the optimizer you need all columns, which may conclude that a table scan may be the most efficient way to return them.
  • The ordinal position of columns should not be relied upon for returning data.

FROM clauses.


Qualify object names with their Owner/Schema name.
  • This prevents SQL looking up the schema of the object (table/view/stored procedure/function)
  • The default schema is dbo, but it can be set on a user basis. 
  • Using fully qualified object names simplifies matters and prevents confusion at implementation.

Further Reading -

JOIN

JOIN tables according to the ANSI 92 syntax, using JOIN conditions named in the ON clause.

The ANSI SQL-92 standard provided more specific join syntax, with join conditions named in the ON clause.

SELECT [columnlist]  FROM Schema1.Table1
[ INNER | LEFT OUTER | RIGHT OUTER | FULL OUTER | CROSS ] JOIN Schema1.Table2
ON Table1.id = Table2.id

Avoid the old ANSI-89 syntax >

SELECT [columnlist] FROM Schema1.Table1, Schema1.Table2 WHERE Table1.id = Table2.id;

Ensure JOINed columns have matching data types


UNION

Use UNION and UNION ALL appropriately; use the latter in preference if possible i.e. you are not testing for uniqueness across the results set.


WHERE

  • Ensure WHERE clauses (and JOIN conditions) are Sargable.
  • Compare alike data types to prevent Implicit Conversions.
  • Use the least likely true AND expression first. This prevents other options being evaluated if it is FALSE.
  • Place the most likely value first if providing a list in an IN clause.
  • Adopt Boolean Logic if creating Conditional WHERE clauses
  • If doing extensive searching of text data, consider implementing Full Text Searching and CONTAINS.

GROUP BY

    Use DISTINCT instead if no aggregate function is required
    Use WHERE to restrict the size of the data set
    Use HAVING to manipulate group data.


ORDER BY


If the ordering of results is necessary,
  • Keep the width and/or number of sorted columns to the minimum
  • Keep the number of rows to be sorted to a minimum


Record Existence -

Evaluate how you Check for the existence of records -
  • Avoid using COUNT(*)
  • JOINs will return all matching rows, hence only use where a value is unique.
  • IN and EXISTS will only return a matching row once.
EXISTS vs IN (old argument)
  • EXISTS is traditionally preferred as it returned when it found the first match.
  • IN continues to evaluate all values.
  • It would appear the query optimizer now evaluates EXISTS and IN similarly.
Beware NULL values in tested columns.
NOT EXISTS ignores NULLs and therefore is preferable over using NOT IN or LEFT OUTER JOIN with a NULL condition check.

Ensure the column you are testing is indexed!

No comments: