
Migrating SQL to Coalesce with Copilot

    Overview

    Learn how to migrate existing SQL code into Coalesce using Copilot. Whether you have stored procedures, views, ad-hoc queries, or legacy transformation logic, Copilot can analyze your SQL and generate a DAG of Nodes with transformations, columns, and relationships preserved. This guide walks you through the process from preparation through validation and refinement, with copy-paste examples you can use.

    What You Can Migrate

    Copilot handles the following sources:

    Source | What to Extract | Notes
    Stored procedures | The full procedure | Paste the whole procedure. Copilot creates a Node for each CTE and the final INSERT. Add any uncaptured setup or cleanup to Pre-SQL or Post-SQL.
    Views | The SELECT statement from CREATE VIEW x AS | Often the cleanest input; views are usually self-contained. Works best when the view has no external dependencies or when you already have matching source Nodes in your Workspace.
    Ad-hoc queries | The full query | Paste as-is if it uses standard SQL.
    CTE-based scripts | The full script | CTEs map well to Stage Nodes; each CTE can become a Node.
    dbt models | The compiled SQL or model query | Remove Jinja; paste the resulting SQL.
    Legacy ETL code | Transformation logic from WhereScape, SAP HANA, Informatica, or similar | Extract the SQL portions (SELECT, INSERT, MERGE, and similar statements; ignore control flow and scripting). Platform-specific syntax (date functions, string handling, and so on) may need adjustment for your target data platform.
    Best Fit

    Copilot excels at converting SQL with clear structure: CTEs, joins, aggregations, and standard transformations. The more modular your SQL, the cleaner the resulting DAG.

    Prerequisites

    Before you start, ensure you have:

    • Coalesce 7.27+ with Copilot — Enable Copilot in your Workspace.
    • Workspace permissions to create and modify Nodes — Copilot creates and edits Nodes on your behalf; you need write access.
    • Source Nodes already in your Workspace — Copilot cannot create source Nodes. It matches table names in your SQL to existing sources; add your source tables first.
    • Storage Locations configured — Each generated Node needs a target schema. Without Storage Locations, Copilot can't assign where tables or views are deployed.
    • A version control checkpoint — Create a checkpoint (commit or save point) before starting. If Copilot produces unexpected results, you can discard changes and revert.

    Step 1: Prepare Your SQL

    Preparation means getting your SQL into a form Copilot can process efficiently. A few minutes of cleanup before pasting reduces manual fixes later and improves the quality of the generated DAG.

    Things you can do:

    • Gather your SQL in one place — Copy the full procedure, view definition, or script into a single file or editor. Copilot handles hundreds of lines; very large procedures may need to be split into logical chunks.
    • Simplify where possible — Remove or comment out session variables, temporary setup, or environment-specific logic that doesn't belong in the transformation.
    • Use clear table references — Avoid aliases like t.col when you can; use table_name.column_name instead. Aliases can break column lineage and require manual re-mapping after migration.
    • Match table names to your Workspace — Ensure table names in your SQL match the names of your existing source Nodes in Coalesce.
    • Resolve templating first — For dbt models, run dbt compile and paste the compiled SQL instead of Jinja; Copilot needs plain SQL.
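    For example, the alias cleanup described above might look like this (table and column names are illustrative):

```sql
-- Before: aliases can obscure column lineage after migration
SELECT c.customer_id, o.order_amount
FROM CUSTOMERS c
JOIN ORDERS o ON c.customer_id = o.customer_id;

-- After: full table references map cleanly to source Nodes
SELECT CUSTOMERS.customer_id, ORDERS.order_amount
FROM CUSTOMERS
JOIN ORDERS ON CUSTOMERS.customer_id = ORDERS.customer_id;
```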
    SQL Preparation Examples

    The following are examples of preparing SQL for migration.

    Migrating a Stored Procedure

    You can paste the entire stored procedure into Copilot. Copilot creates a Node for each CTE and for the final INSERT, so a procedure with two CTEs and an INSERT becomes three Nodes.

    Example procedure:

    CREATE OR REPLACE PROCEDURE refresh_daily_sales()
    RETURNS VARCHAR
    LANGUAGE SQL
    AS
    $$
    DECLARE
    run_date DATE DEFAULT CURRENT_DATE();
    BEGIN
    CREATE OR REPLACE TEMP TABLE temp_run_params AS
    SELECT :run_date AS filter_date; -- Snowflake Scripting binds variables with a leading colon

    INSERT INTO target_sales
    WITH filtered_orders AS (
    SELECT * FROM orders WHERE order_date = (SELECT filter_date FROM temp_run_params)
    ),
    aggregated AS (
    SELECT customer_id, SUM(amount) AS total FROM filtered_orders GROUP BY customer_id
    )
    SELECT * FROM aggregated;

    RETURN 'Success';
    END;
    $$;

    Paste this whole procedure into Copilot. Copilot creates a Node for filtered_orders, a Node for aggregated, and a Node for the final INSERT (three Nodes total).

    Add Setup and Cleanup (If Needed)

    Copilot creates Nodes from the CTEs and INSERT, but some procedure logic may not be captured—for example, temp table creation that runs before the transformation, or logging that runs after. Add that logic using Pre-SQL and Post-SQL:

    Procedure logic Copilot may not capture | Where to add it in Coalesce
    Temp table creation, session setup | Pre-SQL on the first Node in the pipeline
    Post-insert logging, cleanup, or audit | Post-SQL on the final Node
    Dynamic SQL or cursor loops | Pre-SQL, Post-SQL, or a separate Node

    Example: The procedure above creates temp_run_params before the transformation. The first Node (filtered_orders) references it in its WHERE clause. Add Pre-SQL on that Node so the temp table exists when the Node runs:

    CREATE OR REPLACE TEMP TABLE temp_run_params AS
    SELECT CURRENT_DATE() AS filter_date;

    For logging or audit steps (such as the RETURN 'Success'), add them to Post-SQL on the final Node.
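    A logging step for the procedure above might look like the following as Post-SQL. The audit_log table and its columns are illustrative; adapt them to your own logging setup:

```sql
-- Hypothetical audit insert, added as Post-SQL on the final Node
INSERT INTO audit_log (pipeline_name, run_ts, status)
SELECT 'refresh_daily_sales', CURRENT_TIMESTAMP(), 'Success';
```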

    Migrating a View

    Step 1: Retrieve the View Definition

    Get the view's DDL from your data platform:

    • Snowflake: SELECT GET_DDL('VIEW', 'database.schema.view_name');
    • BigQuery: Query INFORMATION_SCHEMA.VIEWS for the view_definition column
    • Databricks: DESCRIBE EXTENDED catalog.schema.view_name (returns view text), or query INFORMATION_SCHEMA.VIEWS for the VIEW_DEFINITION column (Unity Catalog)
    • Microsoft Fabric: Use the view metadata in your workspace or script the view definition from the Fabric portal

    You'll get something like:

    CREATE OR REPLACE VIEW my_schema.customer_summary AS
    SELECT
    c.customer_id,
    c.customer_name,
    c.region,
    SUM(o.order_amount) AS total_spent
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.order_date >= '2024-01-01'
    GROUP BY c.customer_id, c.customer_name, c.region;

    Step 2: Paste the View Into Copilot

    Paste the full view definition into Copilot with instructions such as: "Convert this view into a Coalesce View Node. Use my existing source Nodes CUSTOMERS and ORDERS." Copilot creates a View Node with the transformation logic and column mappings. If your source Node names differ from the table names in the view (for example, the view uses customers but your Node is SOURCE.CUSTOMERS), update the SQL to match your Workspace names or specify the mapping in your instructions.
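    If you update the SQL rather than the instructions, the rewrite might look like this. The SOURCE prefix is an assumed Workspace naming convention; use whatever prefix your source Nodes actually carry:

```sql
-- View SELECT rewritten to match Workspace source Node names
SELECT
    c.customer_id,
    c.customer_name,
    c.region,
    SUM(o.order_amount) AS total_spent
FROM SOURCE.CUSTOMERS c
LEFT JOIN SOURCE.ORDERS o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.customer_id, c.customer_name, c.region;
```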

    Migrating a dbt Model

    Step 1: Get Plain SQL

    dbt models use Jinja templating. Copilot needs plain SQL, so compile first:

    1. Run dbt compile in your project directory.
    2. Open the compiled SQL in target/compiled/<project>/models/<model_name>.sql.
    3. Copy the contents. All {{ ref() }}, {{ config() }}, and other Jinja will be resolved.

    Example: A dbt model like this:

    {{ config(materialized='view') }}
    SELECT
    stg_customers.customer_id,
    stg_customers.customer_name,
    SUM(fct_orders.amount) AS total_spent
    FROM {{ ref('stg_customers') }} AS stg_customers
    LEFT JOIN {{ ref('fct_orders') }} AS fct_orders ON ...
    GROUP BY 1, 2;

    Becomes plain SQL after compile:

    SELECT
    "stg_customers".customer_id,
    "stg_customers".customer_name,
    SUM("fct_orders".amount) AS total_spent
    FROM "database"."schema"."stg_customers" "stg_customers"
    LEFT JOIN "database"."schema"."fct_orders" "fct_orders" ON ...
    GROUP BY 1, 2;
    Stored Procedures and Session Logic

    Complex stored procedures sometimes include logic that Copilot doesn't capture, such as custom session variables or setup and teardown statements. Teams migrating large stored procedure estates often add the missing logic to Pre-SQL and Post-SQL manually. Validate the generated Nodes against the behavior of your original procedure.

    Step 2: Create a Checkpoint and Open Copilot

    Before you start, create a safety net and open Copilot with the right settings.

    1. Create a version control checkpoint — In the Coalesce App, open the version control modal (Git icon) and create a checkpoint (commit or save point). If Copilot produces unexpected results, you can discard all changes and revert to this point.
    2. Open Copilot — Click the AI button Coalesce Copilot logo in the builder toolbar to open the Copilot chat panel.
    3. Enable Allow Edits — Ensure Allow Edits (Edit Mode) is turned on in the Copilot panel. If it's off, Copilot can only suggest changes; it won't create or modify Nodes. You must enable it for the migration workflow to work.
    [Screenshot: a new Copilot chat in the Coalesce app, with the Allow Edits toggle in the lower right turned on]

    Step 3: Paste Your Instructions and SQL

    Start a new chat and provide clear instructions before pasting your SQL. The instructions set the context so Copilot knows how to interpret your code.

    Example Instructions by Source Type

    For a stored procedure (CTE-based):

    Reverse engineer this SQL into Coalesce Nodes. Create a Stage Node for each CTE. Use my existing source Nodes INVENTORY_MOVEMENTS and PRODUCTS. Map to my WORK Storage Location.

    For a view definition:

    Convert this view definition into a Coalesce View Node. Use my existing source Node CUSTOMERS. Name the Node V_CUSTOMER_SUMMARY.

    For an ad-hoc query:

    Create Nodes from this query. Match CUSTOMERS and ORDERS to my existing sources. Create a Stage Node for the result.

    For a dbt-style model:

    Import this SQL as Coalesce Nodes. Each CTE should become a separate Stage Node. Use snake_case for Node names.

    What to Specify Upfront

    Include any of these in your first message:

    • Node Type preference: "Create Stage tables" (default; good for intermediate transformations), "Create Views only" (lighter weight; no materialization), or "Make the final Node a Fact table" (for final marts or reporting tables)
    • Naming convention: "Prefix all Nodes with STG_" or "Use my org's naming: dd_mm_yyyy format"
    • Target Subgraph: "Put all Nodes in the Inventory Subgraph"
    • Source mapping: "Use SOURCE.INVENTORY_MOVEMENTS and SOURCE.PRODUCTS as predecessors"

    Paste the SQL

    After your instructions, paste the SQL. Paste one script at a time; for multiple procedures or views, run separate Copilot sessions. Copilot will analyze it and create the corresponding Nodes. For a ~400-line stored procedure with multiple CTEs, Copilot typically generates a chain of Stage Nodes, each representing a logical step in your pipeline. Processing may take a minute or two for large inputs.

    Live Feedback

    Copilot shows progress as it works. You'll see Nodes appear in your Workspace and the graph update in real time.

    Step 4: Review the Generated Pipeline

    After Copilot finishes, validate the results before deploying. Work through each check systematically.

    Check the Graph and Lineage

    In the builder, the graph shows your pipeline as a DAG. If it doesn't update automatically after Copilot runs, reload it. Perform these checks:

    1. Reload the graph or lineage view if it doesn't update automatically.
    2. Verify your source Nodes connect to the new Nodes. There should be no orphaned Nodes (Nodes with no incoming or outgoing connections that don't belong).
    3. Trace the flow: each CTE or logical step in your SQL should map to a Node. The final Node should represent your final SELECT.

    Inspect Columns and Transforms

    Open each generated Node and review the columns. In the Node panel, switch to the column view (Columns dropdown). Perform these checks:

    1. Look for the transforms in the Transform column. Those show where Copilot placed transformation logic from your SQL.

    2. Compare column names and data types to your original SQL. Ensure business logic is preserved.

    3. Check that joins and aggregations are correct. Open the Join tab and Config tab as needed.

      [Screenshot: a Node with the Join tab open, showing the join SQL (check the Join tab)]
      [Screenshot: the Column Grid in Browser view with the Transform column highlighted (review the transforms)]

    Verify Column Lineage

    If you used table aliases in your SQL (for example, im.product_id instead of INVENTORY_MOVEMENTS.product_id), Copilot may have placed the expression in the Transform field instead of mapping directly to the Source column. That breaks column lineage.

    To fix: In the Coalesce UI, open the Node, find the affected columns, remove the transform, and map the column to the correct source column.

    Confirm Node Types and Storage

    • Node Types: By default, Copilot often creates Stage Nodes. Ensure they match your intent: Stage (intermediate tables), View (virtual, no materialization), Persistent Stage (materialized staging), Dimension (slowly changing dimensions), or Fact (fact tables for analytics).
    • Storage Locations: Verify each Node points to the correct Storage Location in the Node Config. Staging tables typically go to a WORK or STAGE schema; final tables may go to a different schema. Change the Storage Location in the Node Config if needed.

    Run a Quick Validation

    Before deploying, run these validation steps:

    1. Click Validate Select on a Node to ensure the SQL compiles.
    2. Run the Node (Build or Refresh) in a development environment to confirm the output matches expectations.
    3. Compare row counts or sample data to your original query if possible.
    [Screenshot: a Node's Mapping tab with the Validate Select button highlighted next to Create and Run]
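    A row-count comparison for step 3 might look like this. The schema and table names are illustrative; substitute the original object and the new Node's target table:

```sql
-- Compare the original object's row count with the new Node's output
SELECT
    (SELECT COUNT(*) FROM legacy_schema.daily_sales) AS original_rows,
    (SELECT COUNT(*) FROM work.stg_daily_sales)      AS node_rows;
```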
    Review Before Deployment

    Examine all generated SQL, Node configurations, and transformations before deploying to production. Test changes in a development environment first. See Copilot Best Practices and Limitations for a full validation checklist.

    Step 5: Refine and Enhance

    Once the base pipeline is in place, you can use Copilot for follow-up changes. Ask in the Copilot chat; Copilot applies changes across the pipeline.

    Convert Node Types. Ask Copilot to change Stage tables to Views, or to convert the final Node into a Fact table.

    Standardize transformations. If you have inconsistent null handling (COALESCE, NVL, and so on) across columns, ask Copilot to apply a consistent pattern. For example: "For all VARCHAR columns, apply NVL for null handling and make it consistent."

    Rename columns. Improve clarity for people who consume your data downstream: "Rename any column containing 'sales AMT' to 'sales_amount'." Copilot propagates renames through downstream Nodes.

    Insert Nodes in the middle. To add a Persistent Stage between two existing Nodes, create the new Node, then ask Copilot to update the reference. For example: "Modify inventory_status so it references pstage_movements_filtered instead of movements_filtered." Copilot updates references and lineage.

    Adjust data types. In the Copilot chat, request conversions such as VARCHAR to STRING, or add validation to incoming columns. Verify changes in the Node panel afterward.

    Discard and retry. If the result isn't what you want, open the version control modal and discard all changes. Create a new checkpoint, then start a new Copilot conversation with revised instructions.

    Copilot and UI Together

    Use Copilot for bulk changes and structural updates, then switch to the Coalesce UI for fine-tuning individual column mappings and advanced Node configuration.

    Example: Stored Procedure (Inventory Pipeline)

    This example walks through migrating a CTE-based inventory pipeline. You'll paste instructions and SQL into Copilot, then validate the generated Nodes. Use it as a template for your own stored procedures.

    Before You Start

    • Checkpoint and Copilot — Complete Step 2: Create a Checkpoint and Open Copilot first. Create a version control checkpoint, open Copilot, and ensure Allow Edits is on.
    • Source Nodes — Create or add source Nodes for INVENTORY_MOVEMENTS and PRODUCTS in your Workspace. Copilot cannot create source Nodes; it matches table names in your SQL to existing sources.
    • Table name alignment — If your source Node names differ (for example, SOURCE.INVENTORY_MOVEMENTS vs INVENTORY_MOVEMENTS), update the SQL to match your Workspace or specify the mapping in your instructions.
    • Storage Location — Ensure your Workspace has a Storage Location configured (for example, WORK or STAGE) so Copilot can assign where tables are deployed.

    Step 1: Paste Instructions and SQL

    Copy the following instructions into a new Copilot chat, then paste the SQL below it:

    Reverse engineer this SQL into Coalesce Nodes. Create a Stage Node for each CTE. Use my existing source Nodes INVENTORY_MOVEMENTS and PRODUCTS. Map to my WORK Storage Location.

    -- Step 1: Filter to recent movements and join with product details
    WITH movements_filtered AS (
    SELECT
    im.movement_id,
    im.product_id,
    im.warehouse_id,
    im.quantity,
    im.movement_date,
    im.movement_type,
    p.product_name,
    p.category
    FROM INVENTORY_MOVEMENTS im
    JOIN PRODUCTS p ON im.product_id = p.product_id
    WHERE im.movement_date >= DATEADD(day, -90, CURRENT_DATE())
    ),

    -- Step 2: Aggregate movements by product and warehouse
    movements_by_product AS (
    SELECT
    product_id,
    warehouse_id,
    product_name,
    category,
    SUM(CASE WHEN movement_type = 'IN' THEN quantity ELSE -quantity END) AS net_quantity,
    COUNT(*) AS movement_count
    FROM movements_filtered
    GROUP BY product_id, warehouse_id, product_name, category
    ),

    -- Step 3: Calculate inventory status and flag low stock
    inventory_status AS (
    SELECT
    product_id,
    warehouse_id,
    product_name,
    category,
    net_quantity,
    movement_count,
    CASE
    WHEN net_quantity <= 0 THEN 'OUT_OF_STOCK'
    WHEN net_quantity < 10 THEN 'LOW_STOCK'
    ELSE 'IN_STOCK'
    END AS stock_status
    FROM movements_by_product
    )

    SELECT * FROM inventory_status
    ORDER BY stock_status, net_quantity ASC;

    Step 2: Review the Generated Pipeline

    After Copilot finishes, you should see three Stage Nodes (one per CTE) plus a final Node for the ordered result. Each Node connects to the previous one in the pipeline. Work through the checks in Step 4: Review the Generated Pipeline:

    1. Verify source Nodes connect to the new Nodes and the graph shows a clear flow.
    2. Open each Node and check the Transform column for correct column mappings; fix any alias-related transforms if needed.
    3. Click Validate Select on the final Node to ensure the SQL compiles.
    4. Run the Node in a development environment and compare output to your original procedure if possible.
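    A quick sanity check for step 4 is to compare the stock-status distribution of the new final Node against the original procedure's output. The target table name below is an assumption based on the instructions given to Copilot:

```sql
-- Distribution check on the generated final Node; run the same query
-- against the original output and compare the counts
SELECT stock_status, COUNT(*) AS row_count
FROM WORK.STG_INVENTORY_STATUS
GROUP BY stock_status
ORDER BY stock_status;
```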

    From there, you can convert the final Node to a View or Fact table, or add more transformations using Copilot or the UI.

    Example: Ad-Hoc Query (Customer Spending)

    This example shows how to migrate a standalone query with a join and aggregation. It's simpler than a stored procedure: one query becomes one (or a few) Nodes.

    What You Need

    • Checkpoint and Copilot — Complete Step 2: Create a Checkpoint and Open Copilot first. Create a version control checkpoint, open Copilot, and ensure Allow Edits is on.
    • Source Nodes — Create or add source Nodes for CUSTOMERS and ORDERS. Copilot cannot create source Nodes; it matches table names in your SQL to existing sources.
    • Table name alignment — Ensure table names in your SQL match your Workspace (for example, CUSTOMERS and ORDERS). If your sources use different names, adjust the SQL or specify the mapping in your instructions.

    Paste the Query

    Copy the following instructions into a new Copilot chat, then paste the SQL below it:

    Create Nodes from this query. Match CUSTOMERS and ORDERS to my existing sources. Create a Stage Node for the result.

    SELECT
    c.customer_id,
    c.customer_name,
    c.region,
    SUM(o.order_amount) AS total_spent,
    COUNT(o.order_id) AS order_count
    FROM CUSTOMERS c
    LEFT JOIN ORDERS o ON c.customer_id = o.customer_id
    WHERE o.order_date >= '2024-01-01'
    GROUP BY c.customer_id, c.customer_name, c.region
    ORDER BY total_spent DESC;

    Review the Generated Pipeline

    Copilot matches CUSTOMERS and ORDERS to your existing source Nodes and builds a transformation Node with the join and aggregation. You should see a single Stage Node (or a small chain) representing the query result. Work through the checks in Step 4: Review the Generated Pipeline:

    1. Verify the source Nodes connect to the new Node and the graph shows the correct lineage.
    2. Open the Node and check the Transform column; if you used aliases (c.customer_id, o.order_amount), Copilot may have placed expressions in the Transform field instead of mapping to source columns. Fix any broken lineage by re-mapping to the correct source column.
    3. Click Validate Select to ensure the SQL compiles.
    4. Run the Node and compare row counts or sample data to your original query.
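    For step 4, an EXCEPT comparison can spot-check that the Node reproduces the original query. The Node's target table name below is an assumption; substitute the table your Stage Node actually writes to:

```sql
-- Rows produced by the original query that are missing from the new Node's table
SELECT c.customer_id, c.customer_name, c.region,
       SUM(o.order_amount) AS total_spent,
       COUNT(o.order_id) AS order_count
FROM CUSTOMERS c
LEFT JOIN ORDERS o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.customer_id, c.customer_name, c.region
EXCEPT
SELECT customer_id, customer_name, region, total_spent, order_count
FROM WORK.STG_CUSTOMER_SPENDING;
```

    An empty result means the Node's output matches the original query exactly; run the comparison in the other direction as well to catch extra rows.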

    Common Pitfalls and Troubleshooting

    The following table lists common issues and how to resolve them:

    Issue | Resolution
    Table aliases break column lineage or lineage looks wrong | After import, open the Node and use the column grid view to see all columns. Check for alias-related transforms; for each affected column, remove the transform (clear the Transform field) and re-map the column to the correct source column. You can fix multiple columns in bulk when using the column grid.
    Stored procedure session logic missing | Add custom session variables or setup to Pre-SQL and Post-SQL manually on the appropriate Node.
    Copilot doesn't capture everything in complex stored procedures | Migrate in smaller chunks (split the procedure into logical sections), or plan to supplement with manual edits after migration.
    Very large Nodes hit token limits | Break the migration into smaller SQL segments. Alternatively, migrate incrementally and propagate only new columns to downstream Nodes instead of re-reading entire Nodes.
    Data type changes don't apply | Copilot may report a change but not apply it; verify in the Node panel and fix manually if needed.
    Copilot stops responding | Wait 1–2 minutes, then start a new conversation. A fresh context often helps. If it persists, check your connection or try again later.
    Results don't match expectations | Rephrase your instructions with more specificity. Include Node Type preferences ("Create Stage tables" or "Create Views only"), naming conventions ("Prefix all Nodes with STG_"), and target Subgraph ("Put all Nodes in the Inventory Subgraph").
    Need to start over | Open the version control modal (Git icon), discard all changes, and create a new checkpoint before trying again.

    What's Next?