Refresh Your Pipeline
Refresh operations update your data by executing the transformations defined in your pipeline. A refresh can update your entire pipeline or specific Jobs that you've created to manage subsets of your data.
What Is Refresh?
Refresh runs the data transformations defined in your data warehouse metadata. This typically involves Data Manipulation Language (DML) SQL statements such as MERGE, INSERT, UPDATE, and TRUNCATE, which transform the actual data. Use refresh when you want to update your pipeline with new changes from your data warehouse.
What Are Jobs?
A Job is a subset of Nodes, defined by a selector query, that runs during a refresh. To refresh only specific parts of your pipeline, create and use Jobs.
Jobs can only be run if they have been deployed first. Review our Deployment Overview to learn different ways to deploy your pipeline.
Refresh Methods
The Coalesce Scheduler lets you automate refresh operations directly in the application, making it easier to maintain regular data updates without external tools. You can refresh your pipeline using any of the following methods:
- The Coalesce Scheduler
- The Coalesce App. Only existing, deployed Jobs can be run from the Coalesce App.
- CLI
- Jobs API
- Third-Party Scheduling Tools
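As an illustration of the Jobs API option, the following is a minimal sketch of how a refresh request for a deployed Job might be assembled. The host, the `/scheduler/startRun` path, and the `runDetails`, `environmentID`, and `jobID` field names are assumptions made for this example, and a real call may need additional fields such as warehouse credentials. Check the Jobs API reference before relying on any of them. The sketch only builds the request and does not send it.

```python
import json
from urllib.request import Request

# Assumptions: the host and endpoint path below are placeholders for
# illustration. Confirm the real values in the Jobs API reference.
BASE_URL = "https://app.example.invalid"  # hypothetical host
START_RUN_PATH = "/scheduler/startRun"    # assumed endpoint path


def build_refresh_request(token: str, environment_id: str, job_id: str) -> Request:
    """Build (but do not send) a POST request asking to refresh one Job."""
    payload = {
        "runDetails": {  # assumed field names
            "environmentID": environment_id,
            "jobID": job_id,
        }
    }
    return Request(
        url=BASE_URL + START_RUN_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token and IDs; nothing is sent over the network here.
request = build_refresh_request("MY_TOKEN", "3", "12")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send the request, you could pass it to `urllib.request.urlopen` once the endpoint and payload have been verified for your account.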
Types of Jobs
Coalesce offers three ways to refresh your data pipeline: pre-configured Jobs with specific IDs, ad-hoc Jobs for manual execution, and full pipeline refreshes.
Name | Job ID | Method | Description |
---|---|---|---|
Jobs | Yes | API, CLI, Coalesce Scheduler, Coalesce App | Any Jobs you created in the Coalesce App on the Build page. They have a Job ID and are started using the Coalesce Scheduler, API, Coalesce App, or CLI. |
Ad-Hoc | None | API or CLI | Jobs that you run manually using the API or CLI. They use include and exclude syntax. They aren't created in the app and can be run in addition to existing Jobs. These are standard within Coalesce and can't be removed from the Deploy page. |
Refreshed All Jobs | None | API or CLI | Runs that refresh all the Nodes in your pipeline. They don't use include or exclude syntax. They aren't created in the app and can be run in addition to existing Jobs. These are standard within Coalesce and can't be removed from the Deploy page. |
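The table above can be read as a simple decision rule based on what you supply when starting a refresh. The sketch below expresses that rule in Python. The function name, parameter names, and return labels are hypothetical, and letting a Job ID take precedence over selectors is an assumption of this sketch, not documented behavior.

```python
from typing import Optional


def classify_refresh(job_id: Optional[str] = None,
                     include: Optional[str] = None,
                     exclude: Optional[str] = None) -> str:
    """Return which row of the table a refresh request corresponds to."""
    if job_id is not None:
        # A Job created on the Build page is identified by its Job ID.
        # Assumption: selectors are ignored here when a Job ID is given.
        return "Job"
    if include or exclude:
        # No Job ID, but include/exclude syntax is present.
        return "Ad-Hoc"
    # No Job ID and no selectors: every Node in the pipeline is refreshed.
    return "Refresh All"


print(classify_refresh(job_id="12"))
print(classify_refresh(include="<include selector>"))
print(classify_refresh())
```

The selector string is left as a placeholder because the include and exclude syntax is described elsewhere in the documentation.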
Steps to Refresh or Run Jobs
Deploy and refresh jobs are triggered for individual environments. Once an environment has been deployed, it can be refreshed using the Scheduler, API, or CLI.
- Create your Jobs
- Configure your Environment
- Configure your Git Integration
- Set your Parameters (optional). You can set them at the environment level or during the deploy process.
- Refresh your pipeline.
You can only refresh if you've deployed your pipeline.
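If you trigger refreshes from your own scripts, the deployment prerequisite can be enforced with a pre-flight check. The following is a minimal sketch under stated assumptions: the `Environment` class and its `deployed` flag are hypothetical stand-ins for however your tooling tracks deployment state, and `refresh` only simulates starting a run.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical record of an environment and its deployment state."""
    name: str
    deployed: bool = False


def refresh(env: Environment) -> str:
    """Simulate starting a refresh, refusing if the environment is not deployed."""
    if not env.deployed:
        raise RuntimeError(
            f"Environment {env.name!r} must be deployed before it can be refreshed"
        )
    return f"refresh started for {env.name}"


qa = Environment(name="QA")
try:
    refresh(qa)
except RuntimeError as err:
    print(err)  # the guard rejects the undeployed environment

qa.deployed = True  # stands in for a completed deployment
print(refresh(qa))
```

Failing early in this way keeps a scheduled script from attempting a refresh that cannot succeed.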
📄️ Creating and Running Jobs
Execute Coalesce data pipeline jobs comprehensively using API, CLI, and web interface. Learn job creation, environment configuration, deployment prerequisites, and refresh execution methods for maintaining enterprise data warehouse refreshes and automated pipeline operations.
📄️ Refreshing Your Pipeline Using the Coalesce Scheduler
Schedule data pipeline refreshes directly within the Coalesce platform using the built-in scheduler. Learn job scheduling configuration, cron expressions, retry policies, notification setup, and automated refresh management for enterprise data transformation workflows without external dependencies.
📄️ Managing Jobs
Manage Coalesce data pipeline jobs including editing, monitoring, scheduling, and rerunning failed jobs. Master job lifecycle management, status tracking, failure recovery, and job scheduling for efficient enterprise data transformation workflows and pipeline maintenance.