Skip to main content

Coalesce Copilot

Preview

This feature is currently available in a private preview and may not be accessible to all customers.

Coalesce Copilot is an AI assistant integrated into the Coalesce platform. It helps you build and modify data transformation pipelines using natural language. Instead of manually configuring Nodes through the Coalesce UI, you can describe your requirements in plain language. Copilot translates them into working DAG Nodes with transformations, columns, and relationships.

Think of it as a proficient Coalesce teammate. You describe your data needs in plain English or paste SQL, and Copilot translates that into actual DAG Nodes with proper transformations, columns, and relationships.

Core Capabilities

Copilot can perform the following operations:

Build and Modify Nodes

  • Create Nodes from natural language descriptions.
  • Add, remove, or update columns in existing Nodes.
  • Configure joins and relationships between Nodes.
  • Set business keys and system columns.

Import SQL

  • Convert existing SQL queries into Coalesce DAG Nodes.
  • Automatically create predecessor Nodes and dependencies.
  • Parse complex joins and aggregations into proper Node structures.

Answer Questions

  • Explain Workspace structure and Node configurations.
  • Provide guidance on Coalesce best practices.
  • Describe Node dependencies and column lineage.

Multi-Step Operations

  • Execute complex workflows requiring multiple Nodes.
  • Build complete pipelines from high-level requirements.

Platform Awareness

  • Adapts SQL syntax for Snowflake or Databricks.
  • Understands Coalesce-specific Node Types such as Stage, Dimension, Fact, View, and Persistent Stage.
  • Respects your Workspace context and existing Nodes.

Prerequisites

Before using Copilot, ensure you have:

  • Access to Coalesce version 7.27 or later.
  • Active Workspace permissions to create and modify Nodes.
  • Version control checkpoint created before starting work with Copilot. This is recommended for easy rollback.
  • Available Storage Locations, databases and schemas, configured in your Environment.

Enable Copilot

Organization administrators can go to Org Settings > Preferences and turn the features on or off for the whole organizations. These features are not enabled by default, reach out to your Coalesce account manager to sign up.

A screenshot of the Coalesce Org Settings page open to the Preferences section. The page shows a Parser Sample Size setting at the top with a disabled Save button. Below is an AI Features section with explanatory text and three feature toggles: Node and Column Descriptions, Auto Generated Git Commit Messages, and Coalesce Copilot. All toggles are shown in the disabled state.

On your build tab, click the AI button .

A screenshot of the Coalesce DAG Exploration page showing a visual graph of data nodes. The left panel lists nodes and data stores in an expandable tree. The main canvas displays connected blocks such as LAND_OUTFIT, DS_OUTFIT, VWA_CAMPAIGN, and others, arranged in a flow layout. At the top are filters for including or excluding nodes, view options, and a Run All button. A red arrow points to the top right corner highlighting the blue sparkle AI Assist button.

Operating Modes

Copilot has two operating modes that determine whether it modifies your Workspace or just provides information.

Edit Mode

Copilot makes actual changes to your Workspace. Use this mode when you want to build or modify Nodes.

What it does:

  • Creates new Nodes and updates existing ones.
  • Imports SQL and executes multi-step operations.
  • Makes actual changes to your Workspace.

Example prompts:

  • "Create a customer dimension with these columns..."
  • "Add a lifetime_value column to dim_customer"
  • [Pastes SQL query]
  • "Build a complete customer 360 pipeline..."

Non-Edit Mode

Copilot provides answers, explanations, and recommendations without modifying your Workspace. This mode is safe for exploration and learning.

What it does:

  • Answers questions about your Workspace.
  • Explains configurations and best practices.
  • Provides guidance without making changes.

Example prompts:

  • "What columns are in the sales_fact table?"
  • "How should I structure a slowly changing dimension?"
  • "Which Nodes depend on customer_dim?"
  • "What's the difference between Persistent Stage and View?"

Basic Workflow

  1. Create a version control checkpoint before starting with Copilot.
  2. Open the Copilot chat in the Coalesce builder.
  3. Type what you want to build or ask a question.
  4. Review Copilot's actions in your Workspace.
  5. Refine through follow-up messages if needed.
  6. Deploy when ready.

How to Communicate With Copilot

Copilot appears as a chat window in the Coalesce builder. Type your request naturally with no special syntax needed. The conversation is contextual, so you can have back-and-forth exchanges:

You: "Create a customer dimension"

Copilot: Creates dim_customer Node "I've created a customer dimension table. What columns would you like?"

You: "Add customer_id, name, email, and signup_date"

Copilot: Adds the columns "I've added those columns. Would you like me to set customer_id as the business key?"

Writing Effective Prompts

  • Be specific about requirements

    • ❌ "Create a table"
    • ✅ "Create a dimension table for customers with customer_id as the business key"
  • Break complex tasks into smaller steps

    • Start with basic structure, then add complexity iteratively. For example, first create the base dimension, then add calculated columns, then configure slowly changing dimension logic.
  • Provide context and examples

    • ❌ "Add columns"
    • ✅ "Add columns for first_name, last_name, and email to the dim_customer Node"
    • Reference existing Nodes: "Create a Node like dim_product but for stores"
  • Use domain-specific terminology

    • Specify Node Types when relevant such as Dimension, Fact, or Persistent Stage.
    • Reference Coalesce concepts like business keys, join conditions, and Subgraphs.
    • Include your data platform specifics when needed, such as Snowflake-specific functions or Databricks syntax.
  • Use iterative refinement Start broad, then narrow with follow-up prompts. Example workflow:

    1. "Create a customer dimension"
    2. Review the generated Node
    3. "Add loyalty_tier and lifetime_value columns"
    4. "Set customer_id as the business key"

Example Prompt Workflows in Copilot

Copilot adapts to different working styles. This section includes prompt examples showing how to build incrementally, start from templates, describe requirements, or import existing code. Choose the approach that fits your workflow.

  • Start simple and add details - This approach works well when you're exploring or aren't sure of all requirements upfront.

    • "Create a basic customer dimension".
    • Review the Node Copilot creates.
    • "Add loyalty_tier and lifetime_value columns".
    • "Set customer_id as the business key".
  • Copy existing structures - Use this when you want consistency across similar Nodes or need to replicate a pattern.

    • "Create a Node like dim_product but for stores"
    • Copilot copies the structure.
    • "Update the columns to match store attributes"
  • Describe your end goal - This is fastest when you know exactly what you need and want Copilot to figure out the implementation.

    • "I need to calculate monthly recurring revenue by customer segment"
    • Copilot builds the entire pipeline.
    • Review and approve or request changes.
  • Import existing SQL - Use this to migrate existing queries or logic into Coalesce.

    • Paste your existing SQL query into the chat.
    • Copilot analyzes and creates necessary Nodes.
    • Review the generated DAG structure.
    • Adjust join logic or transformations as needed.

Example 1: Creating a Dimension Node

Prompt:

"Create a dimension table for customers with customer_id as the business key. Include columns for customer_id, first_name, last_name, email, and signup_date."

What Copilot does:

  • Creates a Dimension Node Type.
  • Adds specified columns with appropriate data types.
  • Sets customer_id as the business key.
  • Configures necessary predecessor relationships.

Example 2: Importing SQL

You can add SQL directly into Copilot.

Prompt:

SELECT
c.customer_id,
c.name,
SUM(o.amount) as total_spent
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.customer_id, c.name

What Copilot does:

  • Analyzes the SQL query.
  • Creates predecessor Nodes for customers and orders.
  • Builds a transformation Node with the join.
  • Sets up the aggregation logic.
  • Configures column mappings.
A screenshot of a Coalesce Copilot panel titled Customer Spending Analysis with SQL Query. A blue code box on the right displays a SQL query that selects customer information and total spending. Below, Copilot explains how it will convert the SQL into Coalesce nodes and shows two actions: Read Nodes with two matched nodes and Create Node for TARGET.CUSTOMER_SPENDING. A summary describes creating the CUSTOMER_SPENDING node by joining customers and orders. At the bottom is a chat input labeled Chat with Copilot and an Allow Edits toggle set to on.

Example 3: Multi-Step Pipeline

Prompt:

"Create a complete customer 360 view starting with raw customer data, add their order history, calculate lifetime value, and join with marketing campaign responses."

What Copilot does:

  • Creates staging Nodes for raw data sources.
  • Builds dimension tables for customers.
  • Creates fact tables for orders and campaign responses.
  • Adds calculated columns for lifetime value.
  • Wires all Nodes together with proper relationships.

Example 4: Modifying Existing Nodes

Prompt:

"Add a column called lifetime_value to dim_customer that calculates the sum of all order amounts for each customer."

What Copilot does:

  • Finds the dim_customer Node.
  • Adds the new column.
  • Writes the aggregation logic.
  • References the appropriate predecessor Node for order data.

What Copilot Knows About Your Workspace

Copilot has awareness of:

  • Your current Workspace: All existing Nodes, their columns, and relationships.
  • Your data platform: Whether you're using Snowflake or Databricks and adapts SQL syntax accordingly.
  • Available Node Types: How to build Dimension versus Fact versus View versus Persistent Stage.
  • Your Storage Locations: Which databases and schemas are available.
  • Your viewing context: Understands what you're currently looking at.

What Copilot doesn't know:

  • The actual data in your tables. Copilot only sees metadata like Node names, columns, and schemas.
  • Data quality issues unless you describe them.
  • Query performance metrics or optimization opportunities based on runtime data.

To get the best results:

  • Keep relevant Nodes visible in your Workspace.
  • Reference specific Node names using clear identifiers.
  • Mention target Subgraphs when organizing Nodes.
  • Ask questions about existing Nodes: "What columns are in sales_fact?"
  • Request explanations: "Why did you choose a Persistent Stage instead of a View?"
  • Get recommendations: "What's the best Node Type for aggregated metrics?"

When to Use Copilot Versus the Coalesce UI

Use Copilot When

✅ Building new Nodes from scratch

✅ Converting existing SQL to Coalesce

✅ Making bulk updates to multiple Nodes

✅ Learning Coalesce or exploring new features

✅ Rapid prototyping and iteration

✅ You have clear requirements and want to save time

✅ Working with standard Coalesce patterns

Use the Coalesce UI When

✅ Fine-tuning individual column transformations with autocomplete

✅ Visual exploration of complex DAGs

✅ Precise drag-and-drop column mapping

✅ Detailed configuration of advanced Node properties

✅ You prefer visual interfaces over conversation

✅ Making small tweaks where clicking is faster than typing

Best Approach

Use both. Let Copilot build the structure quickly, then use the Coalesce UI for fine-tuning and visual validation. Many users describe it as "Copilot for speed, Coalesce UI for polish."


What's Next?