Skip to main content

Security

The Coalesce platform has been engineered to satisfy the security and governance requirements of enterprise customers.

These topics are intended primarily for the Coalesce administrators, IT teams and other Coalesce users interested in the architecture of the Coalesce platform.

Authentication Methods

Coalesce supports different methods of authentication that satisfy varying security and governance requirements. The authentication method used for an organization using Coalesce will vary depending on ease of use, separation of design and production environments as well as customer-specific data and security compliance requirements.

Snowflake OAuth

Coalesce supports the use of a Snowflake OAuth integration. This is the preferred method of authentication because it uses tokens instead of username / password to authenticate users to Snowflake.

When using Snowflake OAuth, Coalesce uses a periodically rotating access token to access Snowflake on behalf of users when using the UI.

Snowflake Basic (Cloud Storage)

Coalesce's basic authentication method uses a per-user username / password combination to authenticate to Snowflake. This method stores the username / password pair encrypted in an industry standard credential manager. After initial configuration, the password is never sent to the browser, and can not be retrieved by any APIs.

Snowflake KeyPair

The coa CLI supports using a key-pair to authenticate to Coalesce. One can read more about this system in Snowflake's documentation here.

Snowflake Basic (Local Storage)

For Coalesce users looking for an easy to use option that authenticates users to Snowflake without storing any credentials in Coalesce, the local storage option can be used to store Snowflake username / password in the browser.

Data Flows

At design time, the Coalesce UI is an in-browser editor that allows Coalesce users to design the directed acyclic graphs for their data transformation pipelines.

In order to support CORS proxying behavior and access for users interactively to their Snowflake instances and git provider, Coalesce uses industry standard direct proxying over HTTPS/TLS (minimum TLS 1.2, up to TLS 1.3 supported) encryption for data flows involving Snowflake and git.

Browser Git Integration

To support git interactions within a browser environment, a direct proxy is used by Coalesce that forwards git operations on behalf of customer actions submitted through the UI to the customer’s git provider. All data is encrypted via HTTPS/TLS and is never stored at rest.

When a user submits a git request from the browser, an API call is sent from the browser to the Coalesce backend (1), where it is authenticated. Upon successful authentication (2), the backend retrieves the git personal access token (PAT) for the user from the industry standard credential manager (3) in preparation for the git provider request. The backend then communicates directly over HTTPS/TLS with the git provider (4) (GitHub, BitBucket, Azure DevOps, GitLab) proxying requests (for CORS purposes) over HTTPS/TLS back to the browser (5). The communication in part 5 uses native git http protocol over HTTPS/TLS (this is the same protocol used when performing git clone with a https git repository URL).

Browser Git Integration Data Flow

All communication uses HTTPS/TLS encryption.

Browser Snowflake SQL Execution

In order to support Snowflake SQL execution from within a browser environment, a direct proxy is used by Coalesce that forwards SQL text on behalf of customer actions submitted through the Coalesce UI to the customer’s Snowflake instance. All data is encrypted via HTTPS/TLS and data results from SQL executions are never stored by Coalesce.

When submitting a Snowflake SQL execution, the SQL text is sent by the browser to the Coalesce backend (1), the user is successfully authenticated (2), the credentials are then retrieved from the industry standard credential manager (3) and then used to establish a connection to Snowflake (4). Results are then sent back to the browser (5) over HTTPS/TLS.

When a user requests to fetch data, the same Snowflake SQL execution path is executed, and the first 100 rows of data are returned to the browser to be displayed. The data is never stored by Coalesce.

Snowflake SQL Integration Data Flow

All communication uses HTTPS/TLS encryption.

CI/CD time (Deployments and Refresh)

Browser Deployments

When using the in-browser deployment workflow, the application uses the browser git integration and the browser Snowflake SQL execution to prepare a deployment plan.

Upon creating the plan, the deployment request is sent to the backend and the deployment is executed in the Coalesce cloud. This deployment API request establishes a connection to Snowflake and a connection to Coalesce metadata.

The connection to Snowflake is used to send the SQL commands to be executed by Snowflake. The Coalesce metadata connection is used to store run results (and other execution metadata) in Coalesce's metadata store.

Command Line Interface (CLI)

The coa CLI tool can be used on-premise for teams looking to run data transformations on their own infrastructure.

The coa CLI supports basic and Snowflake key-pair authentication. The credentials are stored locally on the machine running coa.

The coa client sends Snowflake commands directly to the customer’s Snowflake instance. Because there are no CORS policies when executing a command-line, a direct connection to Snowflake can be established.

The metadata connection to Coalesce is solely used for storing run results such as execution time, success/failure and other operational data used for reporting.

Application Programming Interface (API)

Coalesce's REST API can be used to trigger deployment and refresh runs. The Coalesce Cloud will execute runs in an isolated environment.

The API supports Snowflake OAuth, Basic auth with password in API request, or Snowflake Basic with Cloud Storage with only username specified.

A Snowflake connection and metadata connection is established in the same manner as the CLI, originating from the Coalesce cloud.

Multi-Tenancy

Frontend Multi-tenancy

When editing Coalesce metadata directly from the browser, all requests are protected by Metadata Repository Security Rules to prevent unauthorized access to metadata being authored in the GUI.

Each potential access (read, write, edit) to metadata requires an associated authenticated user. This user information is then used to determine if the user is a valid member of an organization. Upon validly matching a user to an organization to which they belong, access to metadata is permitted.

All accesses without a valid user, or a user not belonging to the organization are rejected.

Backend Multi-tenancy

When submitting requests to the backend, each request comes with a token that is then used to authenticate the user. Upon successful authentication of the user, the user's organization is determined and rules enforced such that each user can only access, edit and update their own organization's information in the industry standard credential manager.

All requests without a valid user, or a user not belonging to the organization are rejected.