NOT IN
NOT IN filters rows by excluding values that match a list or subquery result. Use NOT IN in the Join tab to refine which rows are included in your Node.
Syntax
WHERE column NOT IN (value1, value2, ...)
Or with a subquery:
WHERE column NOT IN (subquery)
Using NOT IN in the Join Tab
In the Join tab, add NOT IN to your WHERE clause to exclude rows matching specific values.
Exclude a Static List
Filter orders with specific statuses:
FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE "ORDERS"."O_ORDERSTATUS" NOT IN ('F', 'O', 'P')
CASE
WHEN `ORDERS`.`O_ORDERPRIORITY` NOT IN ('1-URGENT', '2-HIGH') THEN 'Standard'
ELSE `ORDERS`.`O_ORDERPRIORITY`
END
Exclude Using A Subquery
Filter orders that don't have any associated line items:
FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE "ORDERS"."O_ORDERKEY" NOT IN (
SELECT "L_ORDERKEY"
FROM {{ ref('WORK', 'LINEITEM') }}
WHERE "L_ORDERKEY" IS NOT NULL
)
If the subquery contains NULL values, NOT IN may return zero rows. Always add WHERE column IS NOT NULL to your subquery to avoid this issue.
Platform Formatting
- BigQuery
- Databricks
- Snowflake
Use backticks for identifiers and AS for aliases:
FROM {{ ref('WORK', 'ORDERS') }} AS `ORDERS`
WHERE `ORDERS`.`O_CUSTKEY` NOT IN (
SELECT `S_SUPPKEY`
FROM {{ ref('WORK', 'SUPPLIER') }}
WHERE `S_SUPPKEY` IS NOT NULL
)
Use backticks for identifiers:
FROM {{ ref('WORK', 'ORDERS') }} `ORDERS`
WHERE `ORDERS`.`O_CUSTKEY` NOT IN (
SELECT `S_SUPPKEY`
FROM {{ ref('WORK', 'SUPPLIER') }}
WHERE `S_SUPPKEY` IS NOT NULL
)
Use double quotes for identifiers:
FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE "ORDERS"."O_CUSTKEY" NOT IN (
SELECT "S_SUPPKEY"
FROM {{ ref('WORK', 'SUPPLIER') }}
WHERE "S_SUPPKEY" IS NOT NULL
)
Alternatives
Using NOT EXISTS
NOT EXISTS handles NULL values more predictably and often performs better with large data sets:
FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
WHERE NOT EXISTS (
SELECT 1
FROM {{ ref('WORK', 'LINEITEM') }} "LINEITEM"
WHERE "LINEITEM"."L_ORDERKEY" = "ORDERS"."O_ORDERKEY"
)
Using LEFT JOIN with IS NULL
Use a LEFT JOIN in the Join tab to find non-matching rows:
FROM {{ ref('WORK', 'ORDERS') }} "ORDERS"
LEFT JOIN {{ ref('WORK', 'LINEITEM') }} "LINEITEM"
ON "ORDERS"."O_ORDERKEY" = "LINEITEM"."L_ORDERKEY"
WHERE "LINEITEM"."L_ORDERKEY" IS NULL
Choosing the Right Method
Use NOT IN when you have:
- A small, static list of values.
- Confidence that no NULL values exist in the comparison.
Use NOT EXISTS when you have:
- Subqueries that might return NULL values.
- Large data sets requiring optimization.
Use LEFT JOIN with IS NULL when you need:
- Performance optimization on very large tables.
- Explicit visibility into the anti-join logic.