Monday 21 April 2014

SQL Development : Query Tips

Here is my checklist for efficient queries!


SELECT

Always minimize the data set being returned
  • Only SELECT the columns you need
  • Use a WHERE clause to only bring back the rows you need.

Never use SELECT * in production code.
  • This is lazy, NAME the columns you need
  • If you do not need all the columns, you are wasting I/O, network traffic and memory by returning them all.
  • Even if you do currently need them all, if columns to the queried tables are added at a later stage, you’ll return them too.
  • If your query joins tables, then multiple columns of the same name may be returned e.g. id columns.
  • SELECT * will tell the optimizer you need all columns, which may conclude that a table scan may be the most efficient way to return them.
  • The ordinal position of columns should not be relied upon for returning data.

FROM clauses.


Qualify object names with their Owner/Schema name.
  • This prevents SQL looking up the schema of the object (table/view/stored procedure/function)
  • The default schema is dbo, but it can be set on a user basis. 
  • Using fully qualified object names simplifies matters and prevents confusion at implementation.

Further Reading -

JOIN

JOIN tables according to the ANSI 92 syntax, using JOIN conditions named in the ON clause.

The ANSI SQL-92 standard provided more specific join syntax, with join conditions named in the ON clause.

SELECT [columnlist]  FROM Schema1.Table1
[ INNER | LEFT OUTER | RIGHT OUTER | FULL OUTER | CROSS ] JOIN Schema1.Table2
ON Table1.id = Table2.id

Avoid the old ANSI-89 syntax >

SELECT [columnlist] FROM Schema1.Table1, Schema1.Table2 WHERE Table1.id = Table2.id;

Ensure JOINed columns have matching data types


UNION

Use UNION and UNION ALL appropriately; use the latter in preference if possible i.e. you are not testing for uniqueness across the results set.


WHERE

  • Ensure WHERE clauses (and JOIN conditions) are Sargable.
  • Compare alike data types to prevent Implicit Conversions.
  • Use the least likely true AND expression first. This prevents other options being evaluated if it is FALSE.
  • Place the most likely value first if providing a list in an IN clause.
  • Adopt Boolean Logic if creating Conditional WHERE clauses
  • If doing extensive searching of text data, consider implementing Full Text Searching and CONTAINS.

GROUP BY

    Use DISTINCT instead if no aggregate function is required
    Use WHERE to restrict the size of the data set
    Use HAVING to manipulate group data.


ORDER BY


If the ordering of results is necessary,
  • Keep the width and/or number of sorted columns to the minimum
  • Keep the number of rows to be sorted to a minimum


Record Existence -

Evaluate how you Check for the existence of records -
  • Avoid using COUNT(*)
  • JOINs will return all matching rows, hence only use where a value is unique.
  • IN and EXISTS will only return a matching row once.
EXISTS vs IN (old argument)
  • EXISTS is traditionally preferred as it returned when it found the first match.
  • IN continues to evaluate all values.
  • It would appear the query optimizer now evaluates EXISTS and IN similarly.
Beware NULL values in tested columns.
NOT EXISTS ignores NULLs and therefore is preferable over using NOT IN or LEFT OUTER JOIN with a NULL condition check.

Ensure the column you are testing is indexed!

Wednesday 9 April 2014

SQL Development : Implicit Conversions

Implicit Conversions happen when differing data types are compared.

They can occur in WHERE clauses and table JOINS and can be seen in the execution plan of a query as a CONVERT_IMPLICIT predicate. Data Type Precedence often hides these issues as there are no obvious errors when developing, queries still return results etc!




Further Readfing -


Friday 4 April 2014

SQL Development : Sargability

When developing SQL queries you ideally want your WHERE clauses and JOIN conditions to be sargable.

The definition from Wikipedia of Sargability is -
In relational databases, a condition (or predicate) in a query is said to be sargable if the DBMS engine can take advantage of an index to speed up the execution of the query (using index seeks, not covering indexes). The term is derived from a contraction of Search ARGument Able.
Sargable conditions (can use indexes) are ;
  • =
  • >
  • <
  • >=
  • <=
  • EXISTS
  • IN
  • BETWEEN
  • LIKE 'searchterm%'

NOT conditions are not Sargable ;
  • IS NULL
  • <>
  • !=
  • !>
  • !<
  • NOT
  • NOT EXISTS
  • NOT IN
  • NOT LIKE

Expressions not matching the left of an index will not use an index either
  • LIKE '%searchterm'
  • LIKE '%searchterm%'

Using Functions on columns or comparing columns to other columns is also not Sargable.
Common examples of these are SUBSTRING, DATEDIFF in the WHERE-clause.
These prevent index use and cause table scans.


From Conor vs. more SARGable predicates -
  1. Avoid table variables for large results sets (as at Dec 2010)
  2. Avoid ORs – Prevent queries using a single range.
  3. Use IS NULL NOT ISNULL(expression, replacement_value)

Further Reading -