A Beginner's Guide to the True Order of SQL Operations

Reading a blog post about the true order of SQL operations. I was having trouble with SQL earlier and found this article, so I am going to read it.

Date Created:
2 433

References



Notes


The lexical order of operations does not match the logical order of operations.

GROUP BY contributes the most confusion:

-- Wrong
SELECT first_name, count(*)
FROM customer
WHERE count(*) > 1
GROUP BY first_name

-- Correct
SELECT first_name, count(*)
FROM customer
GROUP BY first_name
HAVING count(*) > 1

-- Correct
SELECT first_name, count(*)
FROM customer
GROUP BY first_name
ORDER BY count(*) DESC

-- Wrong
SELECT first_name, last_name, count(*)
FROM customer
GROUP BY first_name

-- Correct
SELECT first_name, MAX(last_name), count(*)
FROM customer
GROUP BY first_name

-- Wrong
SELECT first_name || ' ' || last_name, count(*)
FROM customer
GROUP BY first_name

-- Correct
SELECT first_name || ' ' || MAX(last_name), count(*)
FROM customer
GROUP BY first_name

-- Correct
SELECT MAX(first_name || ' ' || last_name), count(*)
FROM customer
GROUP BY first_name

Even though SQL is written in plain English, all the different SQL clauses have extremely complex dependencies. To find out all the interdependencies, read the SQL standard documents. The lexical (syntactical) order of SQL operations (clauses) does not correspond at all to the logical order of operations (although, sometimes they do coincidentally). Thanks to modern optimizers, the order also doesn't correspond to the actual order of operations, so we really have: syntactical logical actual order. The logical order of operations:

  • FROM
    • This is actually the first thing that happens, logically. Before anything else, we're loading all the rows from all the tables and join them. This is what logically happens first, not actually (the optimizer will probably not do this operation first, but instead access some index based on the WHERE clause). JOIN clauses are actually part of the FROM clause.
  • WHERE
    • Once we have loaded all the rows from the tables above, we can now throw them away again using WHERE
  • GROUP BY
    • If you want, you can take the rows that remain after WHERE and put them in groups or buckets, where each group contains the same value for the GROUP BY expression (and all the other rows are put in a list for that group).
  • aggregations
    • Aggregate functions are calculated right after GROUP BY logically. This explains why we cannot put an aggregate function in the WHERE clause, because its value cannot be accessed yet.
  • HAVING
    • Now you can access aggregate function values. For instance, you can check that COUNT(*)>1 in the HAVING clause.
  • WINDOW
  • SELECT
    • We now use all the rows that are produced from the above clauses and create new rows / tuples from them using SELECT.
  • DISTINCT
    • DISTINCT happens after SELECT
  • UNION, INTERSECT, EXCEPT
    • A UNION is an operator that connects two subqueries. Everything we've talked about thus far was a subquery.
  • ORDER BY
    • It makes sense to postpone the ordering of a result until the end.
  • OFFSET
    • Don't use OFFSET
  • LIMIT, FETCH, TOP
    • Sane databases put the LIMIT (MySQL, PostgreSQL) or FETCH (DB2, Oracle 12c, SQL Server 2012) clause at the very end, syntactically.


If you ever want to do something that is not in the right order, the simplest track is always to resort to a derived table:

-- Doesn't work, cannot put window functions in GROUP BY
SELECT ntile(4) ORDER BY (age) AS bucket, MIN(age), MAX(age)
FROM customer
GROUP BY ntile(4) ORDER BY (age)

-- Works:
SELECT bucket, MIN(age), MAX(age)
FROM (
  SELECT age, ntile(4) ORDER BY (age) AS bucket
  FROM customer
) c
GROUP BY bucket

Always think about the logical order of operations


Further Reading


Comments

You have to be logged in to add a comment

User Comments

Insert Math Markup

ESC
About Inserting Math Content
Display Style:

Embed News Content

ESC
About Embedding News Content

Embed Youtube Video

ESC
Embedding Youtube Videos

Embed TikTok Video

ESC
Embedding TikTok Videos

Embed X Post

ESC
Embedding X Posts

Embed Instagram Post

ESC
Embedding Instagram Posts

Insert Details Element

ESC

Example Output:

Summary Title
You will be able to insert content here after confirming the title of the <details> element.

Insert Table

ESC
Customization
Align:
Preview:

Insert Horizontal Rule

#000000

Preview:


View Content At Different Sizes

ESC

Edit Style of Block Nodes

ESC

Edit the background color, default text color, margin, padding, and border of block nodes. Editable block nodes include paragraphs, headers, and lists.

#ffffff
#000000

Edit Selected Cells

Change the background color, vertical align, and borders of the cells in the current selection.

#ffffff
Vertical Align:
Border
#000000
Border Style:

Edit Table

ESC
Customization:
Align:

Upload Lexical State

ESC

Upload a .lexical file. If the file type matches the type of the current editor, then a preview will be shown below the file input.

Upload 3D Object

ESC

Upload Jupyter Notebook

ESC

Upload a Jupyter notebook and embed the resulting HTML in the text editor.

Insert Custom HTML

ESC

Edit Image Background Color

ESC
#ffffff

Insert Columns Layout

ESC
Column Type:

Select Code Language

ESC
Select Coding Language

Insert Chart

ESC

Use the search box below

Upload Previous Version of Article State

ESC