Skip to main content

NOT IN

NOT IN filters rows by excluding values that match a list or subquery result. Use NOT IN in the Join tab to refine which rows are included in your Node.

Syntax

WHERE column NOT IN (value1, value2, ...)

Or with a subquery:

WHERE column NOT IN (subquery)

Using NOT IN in the Join Tab

In the Join tab, add NOT IN to your WHERE clause to exclude rows matching specific values.

Exclude a Static List

Filter orders with specific statuses:

FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE "ORDERS"."O_ORDERSTATUS" NOT IN ('F', 'O', 'P')
CASE 
WHEN `ORDERS`.`O_ORDERPRIORITY` NOT IN ('1-URGENT', '2-HIGH') THEN 'Standard'
ELSE `ORDERS`.`O_ORDERPRIORITY`
END

Exclude Using A Subquery

Filter orders that don't have any associated line items:

FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE "ORDERS"."O_ORDERKEY" NOT IN (
SELECT "L_ORDERKEY"
FROM {{ ref('WORK', 'LINEITEM') }}
WHERE "L_ORDERKEY" IS NOT NULL
)
NULL Values Can Return Unexpected Results

If the subquery contains NULL values, NOT IN may return zero rows. Always add WHERE column IS NOT NULL to your subquery to avoid this issue.

Platform Formatting

Use backticks for identifiers and AS for aliases:

FROM {{ ref('WORK', 'ORDERS') }} AS `ORDERS`
WHERE `ORDERS`.`O_CUSTKEY` NOT IN (
SELECT `S_SUPPKEY`
FROM {{ ref('WORK', 'SUPPLIER') }}
WHERE `S_SUPPKEY` IS NOT NULL
)

Alternatives

Using NOT EXISTS

NOT EXISTS handles NULL values more predictably and often performs better with large data sets:

FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE NOT EXISTS (
SELECT 1
FROM {{ ref('WORK', 'LINEITEM') }} "LINEITEM"
WHERE "LINEITEM"."L_ORDERKEY" = "ORDERS"."O_ORDERKEY"
)

Using LEFT JOIN with IS NULL

Use a LEFT JOIN in the Join tab to find non-matching rows:

FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
LEFT JOIN {{ ref('WORK', 'LINEITEM') }} "LINEITEM"
ON "ORDERS"."O_ORDERKEY" = "LINEITEM"."L_ORDERKEY"
WHERE "LINEITEM"."L_ORDERKEY" IS NULL

Choosing the Right Method

Use NOT IN when you have:

  • A small, static list of values.
  • Confidence that no NULL values exist in the comparison.

Use NOT EXISTS when you have:

  • Subqueries that might return NULL values.
  • Large data sets requiring optimization.

Use LEFT JOIN with IS NULL when you need:

  • Performance optimization on very large tables.
  • Explicit visibility into the anti-join logic.

What's Next?