SCD Type 2 Change Detection and String Case
This page explains how Coalesce decides a tracked attribute changed for SCD Type 2 Dimension Nodes, how letter case affects that decision, and what you can do in SQL when you want values such as Monday and monday treated as the same.
How Type 2 Detects a Change
Coalesce uses your Dimension Node Advanced Deploy settings, including change tracking columns and business keys, to build MERGE or equivalent refresh logic for your warehouse. When Coalesce compares the incoming row with the current row, it relies on SQL equality for the columns that participate in that logic. If the database treats the old and new values as not equal, the pipeline can treat the row as changed, which may create a new historical version for Type 2.
Change Tracking Columns
The attributes you selected so that updates create a new historical version instead of only overwriting the current row. Changes to those columns drive Type 2 behavior, together with your business keys and any Last Modified Comparison options your package documents in Advanced Deploy. Coalesce does not apply a separate string comparison mode: detection follows the comparison rules of the warehouse for the data types and expressions in the generated SQL.
String Comparison by Warehouse
Equality for string-like types follows your platform defaults unless you change expressions or collation. The examples below use simple string literals. Always confirm behavior for your data types, collation, and session settings.
- BigQuery
- Databricks
- Snowflake
In BigQuery:
- STRING comparisons are case-sensitive by default.
- Example:
'Monday' != 'monday'. - BigQuery SQL has no
ILIKE. For case-insensitive pattern matching, useREGEXP_CONTAINSwith a case-insensitive regular expression, for example the(?i)flag in the pattern, or normalize withUPPER()orLOWER()and compare those results. UPPER()andLOWER()are available for case normalization in expressions.
In Databricks with Spark SQL:
- String comparisons are case-sensitive by default.
- Example:
'Monday' != 'monday'. ILIKEis available for case-insensitive pattern matching.- You can use
UPPER()orLOWER()for case-insensitive equality.
In Snowflake:
VARCHARequality is case-sensitive by default.- Example:
'Monday' != 'monday'. - You can make comparisons case-insensitive in supported contexts using a
COLLATEclause when your account and data type allow case-insensitive collation, or by comparingUPPER()orLOWER()results. - Default collation is case-sensitive.
When Letter Case Alone Appears as a Change
If a source sends Monday today and monday tomorrow on the same business key, and the changed attribute is change-tracked, the pipeline may open a new Type 2 version because the values are not equal under SQL rules. That is expected when you have not normalized strings upstream.
If your requirement is that only semantic value matters and case-only differences should not create history, you must make those values compare equal before they reach the Dimension Node. Typical approaches include standardizing case in a Stage, View, or column mapping.
What You Cannot Configure Away in the UI
Dimension Advanced Deploy does not expose a setting to turn on case-insensitive Type 2 behavior or to normalize strings inside Coalesce merge comparison for all tracked columns.
Your package may document Exclude Columns from Merge on Snowflake Base Dimension behavior. That option applies to SCD Type 1 merge options: it excludes columns from comparison and update phases for that merge style. It is not a general control to ignore letter case for Type 2.
The DDL Case Insensitive option on External Data Nodes in the External Data Package concerns how column identifiers and load DDL are handled for certain file-based loads. It does not change how a Dimension Node compares cell values during SCD Type 2 refresh. Don't use that toggle expecting different merge value semantics for dimensions.
Normalize Strings Upstream
The supported approach is to apply a consistent expression to any business key or change-tracked attribute where casing would otherwise cause false change signals. Typical patterns use warehouse functions such as UPPER, LOWER, or INITCAP in the layer before the Dimension Node so both the existing target row and the incoming batch share the same canonical form.
Example patterns in a Stage or View select list:
UPPER(customer_name) AS customer_name,
LOWER(email) AS email,
INITCAP(product_title) AS product_title
Pick one standard per column and keep it stable so historic rows and new loads stay aligned.
Optional: Tracked Column Versus Display Column
If you must keep exact source casing for reporting but still want stable Type 2 detection, you can sometimes maintain the following:
- Tracked column: Normalized expression used in mappings and change detection.
- Display column: Raw or lightly transformed value for presentation.
This adds modeling and mapping overhead. Verify that business keys, grain, and Advanced Deploy selections still match your Kimball or internal standards and that the generated SQL does what you expect.
Trade-Offs
Use the following table to compare normalizing case upstream with leaving source strings as loaded.
| Approach | What you gain | What to watch for |
|---|---|---|
| Normalize upstream | Fewer spurious Type 2 rows from benign case churn | The stored value in the tracked column may not preserve original letter case unless you keep a separate display field |
| Leave values as loaded | Preserves literal strings from the source | Case-only flips can inflate history when the warehouse treats them as different values |
Troubleshooting
If case behavior causes unexpected Type 2 rows or you relied on the wrong UI control, start with the following table.
| Issue | Resolution |
|---|---|
| New Type 2 rows appear when only letter case changed on a change-tracked string | Normalize that column upstream with UPPER, LOWER, or INITCAP, or use warehouse-specific case-insensitive comparison in the layer that feeds the Dimension Node so equality matches your intent. |
| You expected DDL Case Insensitive to fix dimension merges | That setting applies to External Data Node load DDL and naming, not to Dimension Node value comparison. See External Data Package. |
What's Next?
- Nodes and Node Types for how Dimension Nodes use change tracking for Type 2.
- Snowflake Base Node Types Advanced Deploy for Dimension MERGE, change tracking columns, and Exclude Columns from Merge scope.
- BigQuery Base Node Types Advanced Deploy for BigQuery Dimension SCD Type 2 options.
- Handling Duplicate Records in SCD Type 2 Dimension Nodes if merge errors point to non-unique business keys rather than value comparison.
- Coalesce Foundational Hands-On Guide for hands-on Type 2 practice.
- External Data Package for DDL Case Insensitive on external loads.