Transform Onboarding Guide

This guide walks you through onboarding to Coalesce Transform, from account creation through team rollout. Whether you're an admin configuring integrations or a developer building your first pipeline, you'll find the steps you need here. For complete setup instructions and step-by-step configuration guides, see Setup and Configuration.

Who This Guide Is For

Transform onboarding involves two main roles: admins, who configure integrations and add team members, and developers, who build pipelines.

Prerequisites

Before starting your Transform onboarding, ensure you have:

  • Access to your cloud data warehouse (Snowflake, Databricks, BigQuery, or Fabric)
  • A Git repository for version control (or plan to create one)
  • Basic understanding of SQL and data transformation concepts
  • Admin access to configure integrations and add team members

Browser Support

Google Chrome is the only supported browser for Coalesce. Update your browser before you begin.

Phase 1: Initial Setup

Phase 1 covers account creation, data platform connections, Project and Workspace setup, and version control. For detailed step-by-step guides, see Setup and Configuration.

Account and Access

New Transform accounts can be created through a trial or by contacting Coalesce.

Network requirements: Coalesce connects to your data platform from specific IP addresses. You must allow inbound traffic from Coalesce and outbound traffic to Coalesce domains. See Network Requirements for IP addresses and platform-specific setup (Snowflake network policies, Databricks egress policies, and so on).
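As a rough pre-flight check, the sketch below tests whether an address falls inside an allowlist of CIDR ranges, which is the kind of rule a network policy encodes. The ranges are placeholder documentation addresses (RFC 5737), not Coalesce's real IPs, and the function name is hypothetical; take the actual values from Network Requirements.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks.
# They are NOT Coalesce's addresses; copy the real ones from Network Requirements.
ALLOWED_RANGES = ["192.0.2.0/24", "203.0.113.10/32"]


def is_allowed(ip: str, allowed_ranges: list[str] = ALLOWED_RANGES) -> bool:
    """Return True if `ip` falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


if __name__ == "__main__":
    for candidate in ("192.0.2.45", "203.0.113.10", "198.51.100.7"):
        print(candidate, "allowed" if is_allowed(candidate) else "blocked")
```

Running a check like this against the list you plan to allow can catch a mistyped range before you apply the policy on your data platform.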

User management: Organization Admins add team members and assign roles at the organization, project, and environment levels. For SSO and automated provisioning, see Authentication and User and Team Provisioning.

Connect Your Data Platform

Coalesce supports Snowflake, Databricks, BigQuery, and Fabric. Each platform has its own connection flow and credential options.

Credentials are configured per Workspace and per Environment. You'll enter account URLs, usernames, passwords, or keys, and optionally roles and warehouses. See Connection guides for step-by-step setup.

Create Your Project

A Project is the top-level container for your pipelines. Each Project has one Git repository and can contain multiple Workspaces and Environments.

  1. Create a new Project from the Coalesce dashboard.
  2. Configure Git integration: choose your provider (GitHub, GitLab, Bitbucket, or Azure DevOps), add your repository URL, and create a personal access token for authentication.
  3. Use one Git repository per Project. See Create Your Project and Set Up Version Control for details.

Create Your Workspace

A Workspace is your sandbox for development. You build Nodes, run previews, and validate changes before merging to the main branch and deploying.

  1. From your Project, click Create Workspace.
  2. Follow the Onboarding Wizard to configure storage locations and mappings.
  3. Connect the Workspace to your data platform using the credentials from Connect Your Data Platform.

See Create a Workspace and Storage Locations and Storage Mappings for full details. For Workspace strategy (one per branch versus one per developer), see Workspaces and Coalesce Best Practices.

Set Up Version Control

Version control is required before you can commit and deploy. You need your own Git provider account and a personal access token. Configure the following:

  • Git provider: Supported providers are listed in Git Integration.
  • Access token: Create a token for Coalesce (each person needs their own). See Set Up Your Git Integration.
  • Branch strategy: Decide how you'll use branches (for example, feature branches, main for deployment). See Git Branches.

Git and Workspace Pitfalls

Avoid these common issues:

  • Never develop directly in the main development Workspace.
  • Never merge into a Workspace that has uncommitted changes.
  • Only one person should work on a given branch at a time.
  • Designate a single developer to commit to the main branch to avoid conflicts.
  • Avoid making breaking commits directly to the main Git branch.
  • Only open one instance of the Git modal per Workspace at a time.
  • Do not change Git repository settings while there are uncommitted changes.

Phase 2: Build Your First Pipeline

Learn the Interface

The Build interface is where you design your pipeline. Key areas:

  • Projects dashboard: View and switch between Projects and Workspaces.
  • Build vs Deploy: Build is for designing and validating; Deploy pushes changes to your data platform.
  • Node graph: Visual representation of your pipeline. Nodes are connected by dependencies.
  • Sidebar: Edit Node properties, columns, and transforms.
  • Problem Scanner: Surfaces errors and warnings before you deploy.

See The Build Interface and What is Transform? for a tour.
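Because Nodes are connected by dependencies, upstream Nodes have to be processed before the Nodes that read from them. A minimal sketch of that ordering, using Python's standard graphlib module; the Node names are hypothetical, and this illustrates the idea rather than how Coalesce itself schedules work:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each Node maps to the set of Nodes it depends on.
DEPENDENCIES = {
    "STG_ORDERS": {"SRC_ORDERS"},
    "STG_CUSTOMERS": {"SRC_CUSTOMERS"},
    "DIM_CUSTOMER": {"STG_CUSTOMERS"},
    "FCT_ORDERS": {"STG_ORDERS", "DIM_CUSTOMER"},
}


def run_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the Nodes in an order where every dependency comes first."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for position, node in enumerate(run_order(DEPENDENCIES), start=1):
        print(position, node)
```

Several valid orders usually exist, so the only guarantee is that each Node appears after everything it depends on.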

Add Sources

Source Nodes represent tables that already exist in your data platform. You add them before building transformations.

  1. From the Build screen, click the plus sign (+), then Add Sources.
  2. Select the tables you want to add, then click Add Sources. You can preview each source before adding.
  3. Source Nodes appear on the graph. They cannot be modified; they represent raw data.

See Add a Data Source for details. For a guided walkthrough with sample data, see Snowflake Quick Start or Databricks Build Weather Analytics.

Build Transformations

Transformations are built using Nodes. Coalesce includes core Node types (Stage, Dimension, Fact, View, Custom), and the Marketplace offers many more for specific use cases such as incremental loading, data quality, and platform-specific patterns.

Before building from scratch, check the Marketplace for a Node that fits your needs. You may find a pre-built pattern that accelerates development. Core Node types give you a starting point:

  • Stage Nodes: Clean and prepare data. Use column transforms to rename columns, change data types, and apply filters.
  • Dimension Nodes: Store slowly changing dimensions. Choose Type 1 (overwrite) or Type 2 (historical) based on your needs.
  • Fact Nodes: Store transactional or aggregated facts. Use joins and aggregations.
  • View Nodes: Virtual tables; no physical storage.
  • Custom Nodes: Reusable patterns for advanced use cases.

See Transforms, Nodes, and Stage Nodes for details. For hands-on practice, see Coalesce Foundational Hands-On Guide.
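To make the Type 1 versus Type 2 choice for Dimension Nodes concrete, here is a small in-memory sketch. The row layout (customer_id, city, valid_from, valid_to) and the function names are assumptions made for illustration; a Dimension Node does this work in SQL on your data platform.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimRow:
    customer_id: int                  # business key
    city: str                         # tracked attribute
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_type1(rows: list[DimRow], customer_id: int, city: str) -> None:
    """Type 1: overwrite the attribute in place; no history is kept."""
    for row in rows:
        if row.customer_id == customer_id:
            row.city = city


def apply_type2(rows: list[DimRow], customer_id: int, city: str, as_of: date) -> None:
    """Type 2: close the current version and append a new one."""
    for row in rows:
        if row.customer_id == customer_id and row.valid_to is None:
            if row.city == city:
                return  # nothing changed, keep the current version
            row.valid_to = as_of
    rows.append(DimRow(customer_id, city, valid_from=as_of))
```

With Type 1 the table always holds one row per customer. With Type 2 each change adds a row, so you can ask what the value was on a given date.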

Create, Run, and Preview

After building your pipeline, validate it before deploying.

  1. Create: Coalesce generates the SQL for each Node based on your configuration.
  2. Run: Execute the pipeline (or a subset) to populate tables in your development schema.
  3. Preview: Inspect data in each Node to verify transformations.

Use the Problem Scanner to catch errors early.
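The idea of generating SQL from Node configuration can be sketched as follows. The StageNode and Column classes and the render_select function are hypothetical, and the statement shape is illustrative only; it is not the SQL that Coalesce emits.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str          # output column name
    expression: str    # source column or transform expression


@dataclass
class StageNode:
    name: str
    source: str
    columns: list[Column]
    where: str = ""    # optional filter condition


def render_select(node: StageNode) -> str:
    """Render the SELECT that a configured Stage Node describes."""
    select_list = ",\n".join(f"    {c.expression} AS {c.name}" for c in node.columns)
    sql = f"SELECT\n{select_list}\nFROM {node.source}"
    if node.where:
        sql += f"\nWHERE {node.where}"
    return sql


if __name__ == "__main__":
    node = StageNode(
        name="STG_ORDERS",
        source="RAW.ORDERS",
        columns=[Column("ORDER_ID", "CAST(id AS INTEGER)"), Column("STATUS", "UPPER(status)")],
        where="status IS NOT NULL",
    )
    print(render_select(node))
```

The point is that renames, type changes, and filters live in configuration, and the SQL is derived from it rather than written by hand.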

Phase 3: Deploy and Operate

Create Environments

Environments define where you deploy: development, QA, and production. Each Environment has its own Storage Mappings, authentication, and parameters.

  1. In your Workspace, go to Build Settings > Environments.
  2. Create Environments for DEV, QA, and Production (or your naming convention).
  3. For each Environment, configure authentication (username and password, OAuth, or key pair) and storage mappings (database and schema).
  4. Set parameters if needed (for example, runtime parameters).

See Create Your Environments and Environments for full setup. Each Environment should map to a distinct database and schema.
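One way to sanity-check that rule before rollout is to list your planned mappings and look for collisions. The sketch below assumes a simplified model with a single database and schema per Environment (real Environments can hold several storage mappings), and it compares names case-insensitively, which is itself an assumption about your platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    name: str
    database: str
    schema: str


def find_mapping_collisions(environments: list[Environment]) -> list[tuple[str, str]]:
    """Return pairs of Environment names that share a database and schema."""
    seen: dict[tuple[str, str], str] = {}
    collisions = []
    for env in environments:
        # Assumption: identifiers are case-insensitive on the target platform.
        target = (env.database.upper(), env.schema.upper())
        if target in seen:
            collisions.append((seen[target], env.name))
        else:
            seen[target] = env.name
    return collisions


if __name__ == "__main__":
    planned = [
        Environment("DEV", "ANALYTICS_DEV", "CORE"),
        Environment("QA", "ANALYTICS_QA", "CORE"),
        Environment("PROD", "analytics_dev", "core"),  # mistake: points at DEV
    ]
    print(find_mapping_collisions(planned))
```

An empty result means no two Environments in the plan write to the same place.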

Deploy

Deployment creates or updates the physical objects (tables, views) in your data platform. Coalesce compares your current Environment state to your desired configuration and executes the necessary DDL.
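Conceptually, this comparison is a diff between two snapshots. A toy sketch, assuming each snapshot is a mapping of table name to column list; the function name and the action strings are hypothetical, and Coalesce's actual planning logic is more involved:

```python
def plan_deployment(current: dict[str, list[str]], desired: dict[str, list[str]]) -> list[str]:
    """Diff two {table: [columns]} snapshots into a list of DDL-style actions."""
    actions = []
    # Objects that exist only in the desired configuration must be created.
    for table in sorted(desired.keys() - current.keys()):
        actions.append(f"CREATE TABLE {table}")
    # Objects present in both are altered only when their columns differ.
    for table in sorted(desired.keys() & current.keys()):
        if desired[table] != current[table]:
            actions.append(f"ALTER TABLE {table}")
    # Objects that are no longer desired are dropped.
    for table in sorted(current.keys() - desired.keys()):
        actions.append(f"DROP TABLE {table}")
    return actions


if __name__ == "__main__":
    current = {"STG_ORDERS": ["ID"], "DIM_CUSTOMER": ["ID", "CITY"], "OLD_TABLE": ["X"]}
    desired = {"STG_ORDERS": ["ID", "STATUS"], "DIM_CUSTOMER": ["ID", "CITY"], "FCT_ORDERS": ["ID"]}
    for action in plan_deployment(current, desired):
        print(action)
```

Unchanged objects produce no action, which is why a deployment with no configuration changes has nothing to execute.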

  1. Ensure your Workspace is on the main branch and has no uncommitted changes when deploying to production.
  2. Go to Deploy and select your target Environment.
  3. Review the deployment plan, then deploy. You can deploy using the Coalesce App, CLI, or third-party tools.

See Deployment Overview and Deploying to an Environment for details. For troubleshooting (storage mapping errors, plan failures, timeouts), see Troubleshooting Deployments and Refreshes.
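If you deploy with the CLI from a local or CI clone of the Project's repository, the first step (main branch, no uncommitted changes) can be checked before the deploy starts. This is a sketch under that assumption; the function names are hypothetical, and only the two standard git commands shown are used.

```python
import subprocess


def deploy_blockers(branch: str, porcelain_status: str, main_branch: str = "main") -> list[str]:
    """Return the reasons a checkout is not ready for a production deploy."""
    blockers = []
    if branch.strip() != main_branch:
        blockers.append(f"on branch '{branch.strip()}', expected '{main_branch}'")
    # `git status --porcelain` prints one line per changed or untracked path.
    changed = [line for line in porcelain_status.splitlines() if line.strip()]
    if changed:
        blockers.append(f"{len(changed)} uncommitted change(s)")
    return blockers


def read_git_state(repo_dir: str = ".") -> tuple[str, str]:
    """Read the current branch and porcelain status from a local clone."""
    def git(*args: str) -> str:
        result = subprocess.run(
            ["git", *args], cwd=repo_dir, check=True, capture_output=True, text=True
        )
        return result.stdout

    return git("rev-parse", "--abbrev-ref", "HEAD"), git("status", "--porcelain")
```

A CI step could call deploy_blockers(*read_git_state()) and stop the job when the returned list is not empty. Changes that exist only in a Workspace inside the Coalesce App are not visible to this check.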

Refresh and Jobs

Refresh runs the data transformations (DML) defined in your pipeline. Jobs let you run subsets of Nodes on a schedule.

  1. Create Jobs on the Build page by selecting the Nodes you want to include.
  2. Deploy before refreshing; Jobs can only run on deployed Nodes.
  3. Schedule refreshes using the Coalesce Scheduler, CLI, Jobs API, or external tools such as Airflow and GitHub Actions.

See Refresh Your Pipeline and Parameters for runtime parameters (RTP) and other options. For third-party orchestration, see Third-Party DevOps Tools.
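Because Jobs can only run on deployed Nodes, it can help to compare a Job's Node selection with what has been deployed before you schedule it. A minimal sketch with hypothetical Node names and a hypothetical helper; it works on plain sets and does not call any Coalesce API:

```python
def split_job_nodes(job_nodes: set[str], deployed_nodes: set[str]) -> tuple[set[str], set[str]]:
    """Split a Job's Nodes into those that can run and those not yet deployed."""
    runnable = job_nodes & deployed_nodes
    missing = job_nodes - deployed_nodes
    return runnable, missing


if __name__ == "__main__":
    job = {"STG_ORDERS", "FCT_ORDERS", "DIM_NEW"}
    deployed = {"STG_ORDERS", "FCT_ORDERS", "DIM_CUSTOMER"}
    runnable, missing = split_job_nodes(job, deployed)
    print("runnable:", sorted(runnable))
    print("deploy first:", sorted(missing))
```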

Phase 4: Team Rollout and Best Practices

Workspace Strategy

Choose a strategy that fits your team size and workflow:

  • One Workspace per branch: Create a new Workspace for each feature branch; delete when merged. Keeps work isolated.
  • One Workspace per user: Each developer has their own Workspace; they manage branches from it. Good for smaller teams.
  • One Workspace per feature: Separate Workspaces for distinct features. Reduces conflicts.

Deploy to production only from the main branch in the Main Workspace. See Workspaces and Coalesce Best Practices.

Git Practices

  • Commit frequently: Use meaningful commit descriptions. Each commit should cover a single unit of work.
  • Branch for development: Keep in-progress work on feature branches.
  • Communicate before committing: Ensure the Workspace is in a commit-ready state. Only one person per branch at a time.
  • Single committer for main: Designate one developer to commit to the main branch to avoid merge conflicts.

See Git Commits and Git Branches for details. For merge conflicts and sync issues, see Git Integration and related troubleshooting docs.

Phased Rollout

Roll out Transform in stages:

  1. Core data engineers: Complete setup, build the first pipeline, and deploy to DEV and Production.
  2. Broader data team: Add developers, establish workflows and standards.
  3. Analysts and consumers: If applicable, enable read-only or limited access for downstream users. See RBAC Roles and Permissions for role options.

Optional: Advanced Paths

AI Features

  • Copilot: Use natural language to generate transformations. Paste SQL or describe what you want; Copilot creates Nodes. See Copilot and Migrating SQL to Coalesce with Copilot.
  • AI-generated descriptions: Add descriptions to Nodes and columns for documentation and lineage.

See Coalesce AI for the full set of AI capabilities.

Programmatic Setup

  • Project APIs: Automate project, Workspace, and Environment creation. See API documentation.
  • CLI: Deploy and refresh from the command line. See CLI.
  • Automation: Use APIs and CLI for CI/CD, workspace provisioning, and environment management.

Integrations

Best Practices Summary

Before and during onboarding, follow these guidelines:

Get Help

Support Channels

  • Shared Slack or Teams channel: Dedicated channel for your team and Coalesce Customer Success.
  • Email: support@coalesce.io for quick assistance.
  • In-app support: Click the question mark icon for the AI Assistant, or Get Help to open an email to support.

When contacting support, include your Environment ID, run ID, and error details. Use Copy All to Clipboard in the app to capture system information. See Contacting Support for full details.

Self-Service Resources

What's Next?