Quick Start Guide

This page will help you quickly get started with Coalesce, connecting your database and setting up a staging node with a simple data transformation. Optionally, we will also walk you through the creation of Dimension and Fact tables.

🚧

Before You Start

  • If you haven't yet signed up for a Coalesce account, you can do so here.
  • Make sure you're using Google's Chrome browser, as other browsers are not officially supported.
  • Have your Snowflake login information handy.

Interface Overview

When you first sign in you'll be presented with the Build interface. This is where you'll spend most of your time creating nodes, building a graph with them, and transforming your data. The other interface is the Deploy interface, which will be used to export your transformation data pipeline, including your graph and other metadata.

📘

The Problem Scanner

Note that on a brand new account, the Problem Scanner will show a few action items. You can disregard those as most will be resolved by the end of this guide.

The Problem ScannerThe Problem Scanner

The Problem Scanner

Setup

Connect to Snowflake

  1. Click on Build Settings, which is represented by a cogwheel icon toward the bottom left
Location of Build SettingsLocation of Build Settings

Location of Build Settings

  1. Edit an Environment by clicking the pencil icon to the right of it.

  2. From the Edit Environment screen Settings Account Add your Snowflake URL

The Edit Environment ScreenThe Edit Environment Screen

The Edit Environment Screen

  1. From the Edit Environment screen User Credentials and fill out the form with your Snowflake login
  2. Click Test Connection to ensure your credentials work as expected
  3. Click Save

You've now connected Coalesce to your Snowflake instance, and will notice that the Problem Scanner only shows one item - no git connection information. This is something you can set up in Git Settings now or after this guide.

Configure Storage Locations & Environments

A storage location is a logical name you provide to represent a database and schema (in Snowflake) and you will need them to make use of the Environment you configured earlier.

  1. Go to Build Settings and click on New Storage Location to create a new location
  2. You can name this anything you'd like, but as this will be your data source, naming it SRC or SOURCE makes sense
  3. Make another Storage Location that will be your target, so feel free to name it TARGET or DESTINATION
  4. Edit your Environment Storage Mappings
  5. From here enter the actual databases and schemas that will correspond to each Storage Location. For this guide we will be using Snowflake's sample schema TPCH_SF1 as our data source. Feel free to use any schema you'd like as the target for the transformed data.
Example Configuration of Storage MappingsExample Configuration of Storage Mappings

Example Configuration of Storage Mappings

Build an Example Transformation Pipeline

Add Graph Sources

Now it's time to add Sources to the graph. The graph node view is where you'll configure Nodes that will transform your data. Below is an example of a graph with several nodes -

Example GraphExample Graph

Example Graph

To add Source Nodes take the following steps:

  1. Expand Nodes from the Left Sidebar (if not already open)
Nodes on Left SidebarNodes on Left Sidebar

Nodes on Left Sidebar

  1. Click on the + sign Add Sources
Adding SourcesAdding Sources

Adding Sources

  1. Choose your source tables on the left and click Add Sources on the bottom right to add the sources to your pipeline
Adding Sources to PipelineAdding Sources to Pipeline

Adding Sources to Pipeline

  1. You'll now see your graph populated with some Source Nodes
Graph Populated with 8 Source NodesGraph Populated with 8 Source Nodes

Graph Populated with 8 Source Nodes

Make a Stage Node to Transform Your Data

Now that you have a few Source Nodes on your graph, it's time to add a Stage Node

  1. Add one or more Stage Nodes by right clicking your Nation Source Node Add Node Stage Node. Note that you can select multiple Source Nodes by Shift+clicking them and then add multiple Stage Nodes simultaneously.
  2. Double click on the Stage Node or right-click Edit to open up the Node Editor
  3. Open Node Properties on the right and change the Location to the Target you configured earlier
Node PropertiesNode Properties

Node Properties

  1. Click Create to create a table in Snowflake
  2. Click Run to populate the table. Note that you haven't transformed the data yet!
  3. Edit the Transform field in the Mapping grid by double clicking in the transform field of the N_NAME column. Try a simple transform like LOWER() and the name of your column, or you can use the syntax LOWER({{SRC}})
  4. Click Run again to transform the data

📘

On Transforms

Any SnowSQL transform can be used to transform your data.

Example of Transformed Data.  The entries in N_NAME are now lowercase.Example of Transformed Data.  The entries in N_NAME are now lowercase.

Example of Transformed Data. The entries in N_NAME are now lowercase.

  1. You'll see a preview of your transformed data in the lower half of the screen, but feel free to take a peek in your Snowflake database to confirm.

Congratulations! You've connected your database and applied a basic transformation to your data. Feel free to continue experimenting with some of the other node types below.

Create a Dimension Table

Now let's create a dimension table.

  1. Create a new Stage Node from the CUSTOMER Source Node
  2. Go into this new node to Create and Run
  3. Return to the main graph and create a Dimension node from the STG_CUSTOMER node. By default, Coalesce creates a (Slowly Changing)Type 2 Dimension. In this guide we will be making a Type 1 Dimension.
Your graph should look like this nowYour graph should look like this now

Your graph should look like this now

  1. Go into the new DIM_CUSTOMER node
  2. Open up Options on the right side
  3. Choose C_CUSTKEY as a business key by selecting it and clicking the arrow to move it to the right
Choosing `C_CUSTKEY` as the **Business Key**Choosing `C_CUSTKEY` as the **Business Key**

Choosing C_CUSTKEY as the Business Key

  1. Now Create and Run the DIM_CUSTOMER node

📘

Type 1 vs Type 2

In the Dimension Node, if no Change Tracking columns are selected, the node will act as a Type 1 Dimension. If Change Tracking columns are selected, it will act as a Type 2.

You have now finished creating a Type 1 Dimension table. Next we will make a Fact Table.

Create a Fact Table

Now let's create a fact table.

  1. Create a new Stage Node from the ORDERS Source Node
  2. Open the new STG_ORDERS node and delete all the columns except for O_ORDERKEY, O_CUSTKEY, and O_TOTALPRICE
  3. Select the DIM_CUSTOMER node on the left side, then select DIM_CUSTOMER_KEY and drag it into your STG_ORDERS mapping grid
`DIM_CUSTOMER_KEY` added to the mapping grid`DIM_CUSTOMER_KEY` added to the mapping grid

DIM_CUSTOMER_KEY added to the mapping grid

  1. Go to Join in STG_ORDERS
  2. Delete the existing text
  3. Click Generate Join and then Copy to Editor
  4. Replace the /*COLUMN*/ text with O_CUSTKEY
  5. Create and Run the STG_ORDERS node
  6. Create a Fact Node from STG_ORDERS
  7. Open the new FCT_ORDERS node
  8. Open Options Business Key add O_ORDERKEY
  9. Create and Run the FCT_ORDERS node

You have now made a fact table! You can run this query in Snowflake (adjusting your schema and databases from MY_DB and MY_SCHEMA to the ones in your environment) to confirm.

select DIM.C_NAME CUSTOMER_NAME,
sum(FCT.O_TOTALPRICE) TOTAL_PRICE
from "MY_DB"."MY_SCHEMA"."FCT_ORDERS" FCT
inner join "MY_DB"."MY_SCHEMA"."DIM_CUSTOMER" DIM
on FCT.DIM_CUSTOMER_KEY = DIM.DIM_CUSTOMER_KEY
group by DIM.C_NAME;

Did this page help you?