SQL Refresher Cheat Sheet

Apr 8 / Ly Nguyen
About This Cheat Sheet

This cheat sheet is designed to help you refresh your SQL knowledge ahead of your upcoming Breakout Academy course, so you can hit the ground running on day one.

All queries in this document can be run in your dedicated learning environment, which has been prepared and shared with you.

The examples are based on a fictional e-commerce company and use three tables: orders_de, customers, and deliveries.

The Dataset

Let's get familiar with the 3 tables you will be working with.

orders_de - one row per order
ColumnTypeDescription
idintegerUnique order ID
customer_idintegerLinks to customers.id
total_amountdecimalOrder value in dollars
statustext'delivered', 'ordered', 'pending', 'cancelled'
created_attimestampWhen the order was placed
customers - one row per customer
ColumnTypeDescription
idintegerUnique customer ID
first_nametextCustomer's first name
last_nametextCustomer's last name
emailtextEmail address
created_attimestampWhen they signed up
deliveries - one row per delivery attempt
ColumnTypeDescription
idintegerUnique delivery ID
order_idintegerLinks to orders_de.id
statustext'delivered', 'picked_up', 'cancelled'
picked_up_attimestampWhen it was picked up
delivered_attimestampWhen it was delivered
SELECT & FROM

The foundation of every query. Pick which columns you want and which table they come from.

Select all columns
SELECT *
FROM raw.ecomm.orders_de
Select specific columns
SELECT
       id,
       customer_id,
       total_amount,
       status
FROM raw.ecomm.orders_de
Best practice: Avoid SELECT * in production - name columns explicitly. Makes queries easier to read and less likely to break when tables change.
WHERE & Filtering

Filters rows before any grouping happens. Only rows matching your condition are returned.

Common filter examples
WHERE status = 'delivered'
WHERE total_amount > 100
WHERE status != 'cancelled'
WHERE total_amount BETWEEN 50 AND 200
WHERE first_name LIKE 'J%'
WHERE email IS NULL
WHERE status IN ('delivered','ordered','pending')
Combine conditions
WHERE status = 'delivered'
  AND total_amount > 100
OperatorMeaning
ANDBoth conditions must be true
OREither condition can be true
NOTReverses the condition
Operators reference
OperatorMeaning
=Equal to
!=Not equal to
> / <Greater / less than
>= / <=Greater or less than or equal
INMatches a list of values
BETWEENInclusive range
LIKEPattern match (used together with "%")
The % wildcard symbol matches any sequence of characters. e.g. LIKE 'J%' matches any value starting with J.
IS NULLMissing / empty value
ORDER BY & LIMIT
Sort results
-- Highest first (DESC = descending)
ORDER BY total_amount DESC

-- Oldest first (ASC = ascending, default)
ORDER BY created_at

-- Sort by multiple columns
ORDER BY created_at DESC,
         total_amount DESC
Limit rows returned
-- Top 10 most recent orders
SELECT id, customer_id, total_amount
FROM raw.ecomm.orders_de
ORDER BY created_at DESC
LIMIT 10
Pattern: ORDER BY + LIMIT = "show me the latest N records."
JOINs

Combines rows from two tables based on a shared column. e.g. orders_de.customer_id links to customers.id.

INNER JOIN - matched rows only
SELECT raw.ecomm.orders_de.id,
       raw.ecomm.orders_de.total_amount,
       raw.ecomm.customers.first_name,
       raw.ecomm.customers.last_name
FROM raw.ecomm.orders_de
INNER JOIN raw.ecomm.customers
    ON raw.ecomm.orders_de.customer_id = raw.ecomm.customers.id
WHERE raw.ecomm.orders_de.status = 'delivered'
ORDER BY raw.ecomm.orders_de.total_amount DESC
LEFT JOIN - all left rows, NULLs if no match
SELECT raw.ecomm.orders_de.id,
       raw.ecomm.orders_de.status,
       raw.ecomm.deliveries.status AS delivery_status,
       raw.ecomm.deliveries.picked_up_at,
       raw.ecomm.deliveries.delivered_at
FROM raw.ecomm.orders_de
LEFT JOIN raw.ecomm.deliveries
    ON raw.ecomm.deliveries.order_id = raw.ecomm.orders_de.id
JOIN types
TypeReturns
INNER JOINOnly matching rows in both tables
LEFT JOINAll left rows + matches from right
RIGHT JOINAll right rows + matches from left
FULL JOINAll rows from both tables
ON vs WHERE: ON defines the join condition. WHERE filters the result after joining.
Aggregations & GROUP BY

Summarize data - totals, counts, averages - instead of returning row-by-row detail.

Aggregate functions
FunctionDoes
COUNT(*)Count all rows
COUNT(col)Count non-NULL values
SUM(col)Add up values
AVG(col)Calculate average
MIN(col)Lowest value
MAX(col)Highest value
GROUP BY
SELECT status,
  COUNT(*)          AS order_count,
  SUM(total_amount) AS total_revenue
FROM raw.ecomm.orders_de
GROUP BY status
ORDER BY total_revenue DESC
Rule: Every non-aggregate column in SELECT must appear in GROUP BY.
What this query does

Groups all rows in orders_de by their status value. For each status group, it counts how many orders exist and adds up the total revenue. The result is sorted so the highest-revenue status appears first - giving you a quick breakdown of performance by order status.

CASE & COALESCE

CASE works like an if/then statement inside a query. It checks conditions in order and returns the first match. If nothing matches, the ELSE value is returned.

CASE - if/then logic
SELECT id,
  CASE
    WHEN total_amount >= 50
      THEN 'High'
    WHEN total_amount >= 30
      THEN 'Medium'
    ELSE 'Low'
  END AS order_tier
FROM raw.ecomm.orders_de
This query labels each order as High, Medium, or Low based on its total amount - orders of €50+ are High, €30-49 are Medium, and everything else is Low. The result is returned as a new column called order_tier.
COALESCE - shorthand for a common CASE pattern

Returns the first non-NULL value from a list. It is essentially a shorter way of writing a CASE WHEN that checks for NULL.

Without COALESCE
SELECT
    customers.last_name,
    customers.first_name,
    orders_de.status
FROM raw.ecomm.customers
LEFT JOIN raw.ecomm.orders_de
    ON customers.id = orders_de.customer_id
last_namefirst_namestatus
ScroytonLotteNULL
NevinsAvadelivered
BonhamMadapending
With COALESCE
SELECT
    customers.last_name,
    customers.first_name,
    COALESCE(orders_de.status, 'no order') AS order_status
FROM raw.ecomm.customers
LEFT JOIN raw.ecomm.orders_de
    ON customers.id = orders_de.customer_id
last_namefirst_nameorder_status
ScroytonLotteno order
NevinsAvadelivered
BonhamMadapending
If a customer has no matching order, orders_de.status would be NULL. COALESCE replaces that NULL with 'no order' so the result is always readable.
Equivalent CASE WHEN
CASE WHEN orders_de.status IS NOT NULL
     THEN orders_de.status
     ELSE 'no order'
END
Note: Use COALESCE when you just need a fallback for NULL values. Use CASE WHEN when you need more complex conditional logic.
CTEs - Common Table Expressions

A CTE lets you define a temporary named result set at the top of your query using WITH. It makes complex queries easier to read and reason about - think of it as giving a subquery a name and moving it out of the way.

Basic CTE
WITH delivered_orders AS (
  SELECT id, customer_id, total_amount
  FROM raw.ecomm.orders_de
  WHERE status = 'delivered'
)
SELECT *
FROM delivered_orders
WHERE total_amount > 20
Multiple CTEs
WITH delivered_orders AS (
  SELECT id, customer_id, total_amount
  FROM raw.ecomm.orders_de
  WHERE status = 'delivered'
),
top_customers AS (
  SELECT customer_id,
         SUM(total_amount) AS total_spent
  FROM delivered_orders
  GROUP BY customer_id
)
SELECT customers.first_name,
       customers.last_name,
       top_customers.total_spent
FROM top_customers
INNER JOIN raw.ecomm.customers
    ON top_customers.customer_id = customers.id
ORDER BY top_customers.total_spent DESC
Rules
• Always starts with WITH
• Each CTE is separated by a comma
• The final SELECT comes after all CTEs, WITHOUT a preceding comma
• CTEs can reference other CTEs defined before them
Full Query - Putting It All Together
WITH customer_orders AS (
  SELECT
      orders_de.customer_id,
      COUNT(orders_de.id)         AS order_count,
      SUM(orders_de.total_amount) AS total_spent,
      MAX(orders_de.created_at)   AS last_order_date
  FROM raw.ecomm.orders_de
  WHERE orders_de.status IN ('delivered', 'ordered', 'pending')
  GROUP BY orders_de.customer_id
),
delivery_summary AS (
  SELECT
      orders_de.customer_id,
      COUNT(deliveries.id)               AS total_deliveries,
      SUM(CASE WHEN deliveries.status = 'delivered'
               THEN 1 ELSE 0 END)        AS successful_deliveries
  FROM raw.ecomm.orders_de
  LEFT JOIN raw.ecomm.deliveries
      ON deliveries.order_id = orders_de.id
  GROUP BY orders_de.customer_id
)
SELECT
    customers.first_name,
    customers.last_name,
    COALESCE(customer_orders.order_count, 0)            AS order_count,
    COALESCE(customer_orders.total_spent, 0)            AS total_spent,
    customer_orders.last_order_date,
    COALESCE(delivery_summary.total_deliveries, 0)      AS total_deliveries,
    COALESCE(delivery_summary.successful_deliveries, 0) AS successful_deliveries
FROM raw.ecomm.customers
LEFT JOIN customer_orders
    ON customers.id = customer_orders.customer_id
LEFT JOIN delivery_summary
    ON customers.id = delivery_summary.customer_id
ORDER BY customer_orders.total_spent DESC
LIMIT 20
What this does
SectionWhat happens
CTE 1Filters active orders, groups by customer, calculates order count, total spend and last order date
CTE 2Counts total and successful deliveries per customer
LEFT JOINsJoins all customers to their order and delivery summaries
COALESCEReplaces NULL with 0 for customers with no orders or deliveries
ORDER BYHighest spenders first
LIMITTop 20 only
SQL Query Writing Order

This is the order you write a SQL query as a developer:

1. SELECT
->
2. FROM
->
3. JOIN
->
4. WHERE
->
5. GROUP BY
->
6. ORDER BY
->
7. LIMIT
What's Next

You now have the required SQL foundation for your Breakout Academy course. You're ready to jump in, happy learning!