How To Build Your Data Pipelines

Use these how-to guides to help you build your Coalesce integration.

📄️ Include Multiple SQL Statements in the Override SQL Part of a View Node

Work around Coalesce View node limitations by executing multiple SQL statements with stage-based overrides. Learn to run statements sequentially in the override create SQL section, manage statement count restrictions, and deploy complex view logic with staging patterns. A guide to advanced View node customization and multi-statement SQL deployment.

📄️ Use Snowflake User Defined Functions (UDF)

Integrate Snowflake User Defined Functions (UDFs) into Coalesce data transformation workflows using ref_no_link references and storage location mapping. Learn to call custom SQL functions within Coalesce nodes, configure UDF arguments and parameters, set up storage location mappings, and apply custom business logic in transformations and calculations.

📄️ Filter Node Output with SELECT Statements

Learn how to filter data in Coalesce using SELECT statements and WHERE conditions. Understand how data flows through Coalesce pipelines, why filtering should be applied in consuming nodes rather than publishing nodes, and how to apply best practices for row-limiting logic. Use these filtering strategies to improve query performance and keep transformation results accurate.

📄️ Use the Coalesce Test Stage Macro

Create custom data validation tests in Coalesce using the test_stage() macro for data quality checks beyond the built-in testing capabilities. Learn to implement reusable custom test logic, configure failure handling behavior, author SQL template-based tests, and build data validation nodes for pipeline testing and quality assurance.

📄️ Fix Un-Linked References in Sub-Graphs in Coalesce

Troubleshoot and resolve un-linked reference issues in Coalesce Sub-Graphs. Learn how to use the ref_link() and ref_no_link() macros to create proper visual dependencies, add commented table references for metadata recognition, and ensure accurate Sub-Graph visualization. A guide to maintaining clear data lineage and dependency mapping in complex pipelines.

📄️ Load DISTINCT Values Into a Node

Load distinct values into Coalesce nodes using several techniques, including View node DISTINCT toggles, GROUP BY ALL clauses, and custom node type creation. Learn data deduplication strategies for Snowflake and Databricks, implement SELECT DISTINCT behavior across different node types, and eliminate duplicate records in your data transformation pipelines.

📄️ Preserving Column Lineage When Using SQL Statements

Preserve column lineage visibility in Coalesce when using complex SQL statements by qualifying column names. Learn best practices for maintaining lineage traceability across multi-column SQL operations, qualify columns with the node name to preserve lineage, and use autocomplete to enter qualified column references in transformations.

📄️ Self-Join a Coalesce Node to Itself

Implement self-joins in Coalesce data pipelines using ref_no_link functions to join a node to itself for hierarchical data structures and comparative analysis. Learn aliasing techniques, configure join relationships for recursive data patterns, and build self-referencing transformations for parent-child relationships, gap analysis, and data comparison within a single dataset.

📄️ How to Use GROUP BY

Use GROUP BY operations in Coalesce for data aggregation and fact table creation. Learn how to write SQL GROUP BY clauses in the Join tab, configure aggregation transforms in the mapping view, and create complex fact tables with proper dimension joins. Includes step-by-step examples for building aggregated data models and summary tables.

📄️ Migrate a CTE to Coalesce

Migrate Common Table Expressions (CTEs) to Coalesce's pipeline-based architecture to make data transformations easier to scale and maintain. Learn step-by-step CTE conversion techniques, identify data sources and dependencies, apply sequential migration patterns, and turn complex CTE logic into modular Coalesce nodes with lineage tracking and column-level visibility.