Coalesce Best Practices

Before Starting With Coalesce

Before working with Coalesce, ensure you meet all prerequisites.

Databricks Setup

Snowflake Setup

  • Ensure Coalesce has READ access to source tables (the data Coalesce reads).
  • Ensure Coalesce has READ/WRITE access to target locations (where Coalesce writes transformed data).
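
The read-only versus read/write split above could be expressed as Snowflake GRANT statements. The sketch below only builds the statements as strings; it does not run them. The role name COALESCE_ROLE and the schema names are hypothetical, and the exact privileges you need depend on your Node types and account setup, so treat this as an assumption-laden starting point rather than the required configuration.

```python
def snowflake_grants(role, source_schemas, target_schemas):
    """Build GRANT statements: read-only on sources, create rights on targets.

    Schemas are given as "DATABASE.SCHEMA". All names are illustrative.
    """
    statements = []
    for schema in source_schemas:
        database = schema.split(".")[0]
        statements += [
            f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
            f"GRANT USAGE ON SCHEMA {schema} TO ROLE {role};",
            # Sources are read-only: SELECT, and nothing that creates objects.
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO ROLE {role};",
        ]
    for schema in target_schemas:
        database = schema.split(".")[0]
        statements += [
            f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
            f"GRANT USAGE ON SCHEMA {schema} TO ROLE {role};",
            # Targets are writable: the role may create tables and views here.
            f"GRANT CREATE TABLE, CREATE VIEW ON SCHEMA {schema} TO ROLE {role};",
        ]
    return statements


if __name__ == "__main__":
    # Hypothetical role and schema names, for illustration only.
    for line in snowflake_grants("COALESCE_ROLE", ["RAW.SALES"], ["ANALYTICS.DEV"]):
        print(line)
```

Review the generated statements with your Snowflake administrator before running them.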

Organization Setup

Version Control

Make sure you’ve configured version control before working in Coalesce:

  • Choose a provider: Supported Git providers are listed here.
  • Each user must:
    • Have their own provider account.
    • Create a personal access token for Coalesce.
    • Belong to the organization’s Git account.
  • We recommend creating a new Git repository for each Coalesce project.

Git Commits

  • Make frequent commits using meaningful commit descriptions.
  • Keep your commits focused: each one should cover a single unit of work.
  • Communicate with your team before committing to ensure the Workspace is in a commit-ready state.
  • Designate a single developer to commit to the Main branch to avoid conflicts.
  • Avoid making breaking commits directly to the main Git branch.

Having Trouble Making Commits

If you cannot commit your changes in a Workspace to Git, contact Coalesce support for help. Overwriting your live metadata with a previous commit will cause you to lose all recent uncommitted development.

Git Branches

  • Follow a branching strategy.
  • Use branches for development to keep in-progress work separate from the main branch.
  • Only deploy to target Environments from the Main branch.
  • Only one person should work on a given branch at a time.

Workspaces and Git

  • Only open one instance of the Git modal per Workspace at a time.
  • Never develop directly in your main development Workspace.
  • Never merge into a Workspace that has uncommitted changes.
  • Do not change Git repository settings while there are uncommitted changes in any Workspace.
  • Workspaces should map to different schemas, except for source Nodes.

Project Setup

Setting up your project correctly from the start helps avoid long-term issues:

  • Create a New Project:
    • An Organization Administrator must create a new Project inside Coalesce.
    • During Project creation, Git integration should be configured.
    • Have each user join the Project they'll be working on and connect to their Git accounts.
  • Repository Setup:
    • One Git repository per Project is highly recommended for clean separation.
  • Create a Workspace:
    • After the Project is created, create a Workspace for development work.
    • Create a Workspace that holds the final project. This Workspace will track your main branch.
  • Schema Planning:
    • Define your source and target schemas early.

Workspaces

  • Workspace Strategies:
    • One Workspace per branch: Create a new Workspace for each new branch and delete old ones.
    • One Workspace per user: Each user works in their own Workspace and manages branches from there.
    • Separate Workspace for each feature branch: Ensures developers work within distinct locations, preventing conflicts.
  • Authentication:
    • Decide how to connect your Workspace to Coalesce.
  • Storage Location and Mappings:
    • Create target schemas in Snowflake for DEV, QA, and Production. Map them in Coalesce.
  • Feature Development:
    • When a feature is complete in a Workspace, merge it into the main branch.
    • Check out the main branch in the Main Workspace when preparing for deployment into higher Environments.
    • Deploy to higher Environments (Production, UAT, Testing) from the main Git branch in the Main Workspace.

Environment Management

  • Environments should map to different schemas.
  • A Workspace and an Environment must not share the same schemas unless the Workspace is used read-only. (This cannot be enforced automatically.)
    • Consider creating Environments for DEV, QA, and Production that match the Storage Mappings.
  • Each Environment should have its own database and schema. As a feature moves from a main development Workspace to Test, QA, and finally Production, the storage mappings in each Environment should point to the Snowflake locations where you want the pipeline's objects to be created.
  • Each Environment and Workspace will need authentication.
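
Because the schema-sharing rule above cannot be enforced automatically, a small check run outside Coalesce can help catch accidental overlaps. The sketch below is not a Coalesce feature: the LOCATIONS dictionary and every name in it are hypothetical, and you would need to maintain such a list by hand from your own storage mappings.

```python
from collections import defaultdict

# Hypothetical storage mappings: name -> (database, schema).
# These names are illustrative and are not Coalesce defaults.
LOCATIONS = {
    "Workspace: dev_alice": ("ANALYTICS_DEV", "ALICE"),
    "Environment: QA": ("ANALYTICS_QA", "CORE"),
    "Environment: Production": ("ANALYTICS_PROD", "CORE"),
}


def find_shared_schemas(locations):
    """Return {(database, schema): [names]} for locations used more than once."""
    users = defaultdict(list)
    for name, (database, schema) in locations.items():
        # Unquoted Snowflake identifiers are case-insensitive, so compare in upper case.
        users[(database.upper(), schema.upper())].append(name)
    return {loc: names for loc, names in users.items() if len(names) > 1}


if __name__ == "__main__":
    clashes = find_shared_schemas(LOCATIONS)
    if clashes:
        for (database, schema), names in clashes.items():
            print(f"{database}.{schema} is shared by: {', '.join(names)}")
    else:
        print("No shared schemas found.")
```

A clash reported between a Workspace and an Environment is only acceptable when that Workspace is used read-only.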

Deploy Your Pipeline

Refresh Your Pipeline

CTEs or Pipelines for Complex Data Processing

Coalesce recommends pipelines rather than Common Table Expressions (CTEs) for building complex SQL processing in an analytical environment.

Our article, CTEs vs Pipelines for Complex Data Processing, explains the differences between the two approaches, compares them in a contextual example, and details why Coalesce recommends pipelines.
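
As a rough illustration of the distinction (not the example from the article), the sketch below computes the same result both ways. It uses Python's built-in sqlite3 module as a stand-in for Snowflake, and the table names orders, stg_completed, and customer_totals are hypothetical. In Coalesce, each pipeline step would typically be its own Node rather than a hand-written CREATE TABLE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL, status TEXT);
    INSERT INTO orders VALUES
        ('a', 10.0, 'complete'), ('a', 5.0, 'cancelled'),
        ('b', 7.5, 'complete'), ('b', 2.5, 'complete');
""")

# CTE approach: every step lives inside a single statement,
# so the intermediate results cannot be queried on their own.
cte_sql = """
    WITH completed AS (
        SELECT customer, amount FROM orders WHERE status = 'complete'
    ),
    totals AS (
        SELECT customer, SUM(amount) AS total FROM completed GROUP BY customer
    )
    SELECT customer, total FROM totals ORDER BY customer
"""
cte_rows = conn.execute(cte_sql).fetchall()

# Pipeline approach: each step is materialized as its own object,
# which can be inspected, tested, and reused independently.
conn.executescript("""
    CREATE TABLE stg_completed AS
        SELECT customer, amount FROM orders WHERE status = 'complete';
    CREATE TABLE customer_totals AS
        SELECT customer, SUM(amount) AS total FROM stg_completed GROUP BY customer;
""")
pipeline_rows = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()

print(cte_rows)
print(pipeline_rows)
```

Both approaches return the same rows; the difference is that the pipeline version leaves stg_completed available to query when a downstream total looks wrong.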

However, CTEs can be used in Coalesce if you have a use case that requires them. The video below showcases how to leverage CTEs within your Coalesce data projects.