Skip to main content

Coalesce Best Practices

Before Starting With Coalesce

  • You'll need access to Snowflake.
    • The data in Snowflake needs to be set to READ for sources. A source is the data Coalesce will read for data transformations.
    • Targets in Snowflake need to be READ/WRITE. A target is where Coalesce will write data transformations.
  • Google Chrome is the only supported browser.
  • Sign up for Coalesce.
  • To follow tutorials, install the Snowflake sample data. It's included if you're on a Snowflake trial.

Organization Setup

Version Control

  • Configure Version Control
    • Choose a provider.
    • Each user will need their own provider account and create a personal access token for Coalesce.
    • Each user will need to be in the organization's account.
    • It's recommended to create a new repository for Coalesce.

Git Commits

  • Make frequent commits using meaningful commit descriptions.
  • Keep your commits simple. They should be for solving a single unit of work or single task. This makes it easier to revert changes.
  • Ensure that all members of the development team are aware that a commit is planned, so that all code for the Workspace is in a ‘commit ready' state.
  • Consider allocating a single developer the task of performing commits to the Main branch, to avoid any possible conflicts.
  • Avoid making breaking commits to your main Git branch.
Having Trouble Making Commits

If you cannot commit your changes in a Workspace to Git, contact Coalesce support for help. Potentially overwriting your live metadata with a previous commit will cause you to lose all recent uncommitted development.

Git Branches

  • Use a branching strategy.
  • Use branches for development to keep work in progress away from the main branch.
  • Only deploy to target Environments from a ‘Main' branch, rather than from feature branches.
  • One person at a time should work on a branch.

Workspaces and Git

  • Do not have more than one instance of the Git modal open for a single Workspace, at the same time.
  • Never develop directly in your main development Workspace.
  • Never merge into a Workspace which has uncommitted changes.
  • Never change the Git repo settings when having uncommitted changes in ANY Workspace.
  • All development Workspaces should be ideally mapped to different schemas (except perhaps source nodes). You can't have a Workspace AND Environment on the same set of Snowflake schemas, unless the Workspace is used as read-only (which cannot be enforced).
  • All Environments should be mapped to different schemas, except for source nodes in some cases.

Project Setup

  • An Organization Administrator will need to create a new Project and configure Git during the process.
    • One repository per project is recommended.
  • Create a newWorkspace in the Project. This will be the main branch.
  • Create your Environments.
    • Each environment should have its own Database and Schema. As a feature flows from a main development Workspace to Test, QA, and finally to the Prod environment, the storage mappings should point to the Snowflake locations where you want to create the objects in question in your pipeline.
    • A good idea is to create environments for DEV, QA, and Production matching the Storage Mappings.
    • You'll have to connect each environment to Snowflake Authentication. .
  • Have each user join the Project they'll be working on and connect to their Git accounts.

Workspaces

  • Create a Workspace strategy. Some approached you can take are:
    • Use one Workspace per branch. Each time you create a new branch, create a new Workspace and delete old ones.
    • Use one Workspace per user. Each user works in their own Workspace. They manage their branches from the Workspace.
    • Create a separate Workspace for each feature branch. This ensures that developers work within distinct Snowflake locations, preventing conflicts and streamlining the development process.
  • Decide how you want to connect your Workspace to Snowflake Authentication.
  • Configure your Storage Location and Mappings. A good rule is to create target schemas in Snowflake for DEV, QA, and Production. Then map them in Coalesce.
  • When feature development in a specific Workspace is complete, it should be merged into the main branch and checked out by the Main Workspace when preparing for deployment into higher Environments.
  • It is best practice to deploy to higher (Production, UAT, Testing) Environments from the main Git branch in the Main Workspace.
Project list with main workspace and feature workspaces

CTEs or Pipelines for Complex Data Processing

Coalesce recommends the use of pipelines vs. Common Table Expressions, or CTEs, for building complex processing with SQL in an analytical environment.

The differences in the CTE vs. pipelines approaches, a contextual example comparing the two, and details on why Coalesce recommends the pipeline approach vs. the CTE approach can be found in our article, CTEs vs Pipelines for Complex Data Processing.

CTEs can be used in Coalesce, however, should you have a use case that requires them. The video below showcases how to leverage CTEs within your Coalesce data projects.

Deploy Your Pipeline

Refresh Your Pipeline