Skip to main content

Parallax effect

Excursions in SQL Server: Recursive CTE explained

Common Table Expressions (CTEs) have been a feature of SQL Server since 2005.  It is an under utilized feature mostly because you can NOT use them and write efficient code.  However, understanding what they can do, what they're good at, and what they're not good at will add to any developer's tool belt.  CTEs are a lot like views, only less rigid.  They can also reference themselves and as will be shown, can be used for recursive calls.  



Syntax:

From the MSDN Website:

[ WITH <common_table_expression> [ ,...n ] ]
<common_table_expression>::=
expression_name [ ( column_name [ ,...n ] ) ]
AS
( CTE_query_definition )

All CTE’s begin with the WITH statement followed by the CTE’s name and column names.  This of this as similar to declaring a variable table only without the datatypes.

After the CTE is defined, inside the AS statement is where the query definition begins.  This is the select statement used to populate the table part of the Common Table Expression.  It is also where the recursion is defined.

Immediately after the definition of the CTE, a single SELECT, INSERT, UPDATE, or DELETE must follow.

Simple Example:

This example is showing the functionality of a CTE.  
WITH invoicesThisYear
(
      ivh_billto
    , ivh_invoicenumber
    , ivh_invoicestatus
    , ivh_billmonth
)
    AS
    (
        SELECT
           ivh_billto
         , ivh_invoicenumber
         , ivh_invoicestatus
         , MONTH(ivh_billdate) ivh_billmonth
         FROM
           invoiceheader WITH (NOLOCK)
         WHERE
           ivh_billdate > DATEADD(year , DATEDIFF(year , 0 , GETDATE()) , 0)
     )
     SELECT
       ivh_billto
     , ivh_billmonth
     , ivh_invoicestatus
     , COUNT(ivh_invoicenumber) AS ivh_totalInvoices
     FROM
       invoicesThisYear
     GROUP BY
       ivh_billto
     , ivh_invoicestatus
     , ivh_billmonth
     ORDER BY
       ivh_billto
     , ivh_billmonth
     , ivh_invoicestatus;

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 7 ms.

(xxx row(s) affected)
Table 'invoiceheader'. Scan count 6, logical reads 6506, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 654 ms,  elapsed time = 332 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

We declare the CTE (invoicesThisYear) after the WITH statement and define the columns of the CTE - data types are derived.  Inside the AS statement is where the magic happens; we are gathering all of the invoices where the ivh_billdate is this year and obtaining the MONTH part of the ivh_billdate.

As mentioned, you can NOT use a CTE and obtain the same results and the query will function similarly.
SELECT
  ivh_billto
, MONTH(ivh_billdate) AS      ivh_billmonth
, ivh_invoicestatus
, COUNT(ivh_invoicenumber) AS ivh_totalInvoices
FROM
  invoiceheader
WHERE
  ivh_billdate > DATEADD(year , DATEDIFF(year , 0 , GETDATE()) , 0)
GROUP BY
  ivh_billto
, MONTH(ivh_billdate)
, ivh_invoicestatus;
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 5 ms.

(273 row(s) affected)
Table 'invoiceheader'. Scan count 6, logical reads 6609, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 638 ms,  elapsed time = 342 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:

   CPU time = 0 ms,  elapsed time = 0 ms.


However, this is simple example isn't leveraging one of the main features of CTEs - Recursion!

Simple Recursive Example:

This example shows how to get every year from 2002 until today.
WITH yearsAgo
(
  myYear
)
AS
(
     
-- Base Case
     
SELECT DATEPART(year , GETDATE()) AS myYear
     
UNION ALL
     
-- Recursive Section
     
SELECT yearsAgo.myYear - 1
     
FROM yearsAgo
     
WHERE yearsAgo.myYear >= 2002
)
     
SELECT * FROM yearsAgo;

 We can reflect to one of our college class projects which in its scope, required the use of recursion.  It probably followed a night of drinking caffeinated beverages and debugging stack overflow exceptions.

We declare the CTE (yearsAgo) after the WITH statement and define the columns of the CTE - data types are derived.  Inside the AS statement is where the magic happens; we define the base case or starting point for the recursion.  In the recursive section, we tell what we want to do (we want the previous year from the last part recursive value) and define when we want the recursion to stop.  Finally, we're selecting all of the values from our CTE (yearsAgo).
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001

What happens if you have a runaway recursive call (remember the stack overflow exception)?  Fortunately SQL Server gives us a solution to this concern:

Using MAXRECURSION:

MAXRECURSION is a query option that will limit the number of recursive calls.

WITH yearsAgo
(
  myYear
)
AS
(
     
-- Base Case
     
SELECT DATEPART(year , GETDATE()) AS myYear
     
UNION ALL
     
-- Recursive Section
     
SELECT yearsAgo.myYear - 1
     
FROM yearsAgo
     
WHERE yearsAgo.myYear >= 2002
)
     
SELECT * FROM yearsAgo
     OPTION (MAXRECURSION 10);
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 1 ms.
Msg 530, Level 16, State 1, Line 2The statement terminated. The maximum recursion 10 has been exhausted before statement completion.

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.  SQL Server Execution Times:   CPU time = 0 ms,  elapsed time = 0 ms.

I hope this helps demystify Common Table Expressions and recursive CTEs.  It is a niche feature, but there are times when no other option exists to solve a problem.