Understanding the Presync Process
When you deploy tables and views in Coalesce, a background process called presync runs automatically. Presync helps prevent deployment failures caused by changes made outside of Coalesce. This guide explains what presync is, why it matters, and what you’ll see when it runs.
The Problem Presync Solves
Deployments can fail if objects are changed or deleted directly in your data platform. For example, you might deploy a table called CUSTOMER_DATA through Coalesce. Later, someone drops that table directly in the warehouse. When you try to redeploy changes, Coalesce attempts to modify a table that no longer exists, causing the deployment to fail.
Presync solves this by keeping Coalesce synchronized with the current state of your data platform. Common off-platform changes include:
- Tables or views being dropped directly in the warehouse
- Schema changes made through other tools
- Objects renamed or moved to different databases
- Entire databases or schemas deleted
How Presync Works
Presync compares three sources of truth:
- What Coalesce has stored from your last successful deployment.
- What you’re trying to deploy.
- What actually exists in your data platform.
When differences are found, presync decides how to reconcile them to prevent deployment failures.
Phase 1: Discovery and Validation
Presync categorizes your changes:
- New tables or views being added
- Existing tables or views being modified
- Tables or views being removed
It then checks that all referenced databases and schemas exist. If any are missing, presync filters out the affected nodes.
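The discovery step above can be pictured as set comparisons followed by a location check. The following is a minimal sketch, not Coalesce's implementation: the Node class, the categorize and filter_missing_locations functions, and the idea of comparing a definition fingerprint are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a deployable table or view."""
    database: str
    schema: str
    name: str
    definition: str  # any comparable fingerprint of the node's definition


def _key(node):
    return (node.database, node.schema, node.name)


def categorize(last_deployed, planned):
    """Split planned changes into added, modified, and removed node keys."""
    old = {_key(n): n for n in last_deployed}
    new = {_key(n): n for n in planned}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(k for k in new.keys() & old.keys() if new[k] != old[k])
    return added, modified, removed


def filter_missing_locations(nodes, existing_locations):
    """Keep nodes whose (database, schema) exists; return the rest separately."""
    kept = [n for n in nodes if (n.database, n.schema) in existing_locations]
    skipped = [n for n in nodes if (n.database, n.schema) not in existing_locations]
    return kept, skipped
```

In this sketch, existing_locations would be a set of (database, schema) pairs read from the data platform, and the skipped list corresponds to the nodes that are filtered out because their location is missing.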
Phase 2: Conflict Resolution
For each object, presync compares intent with reality:
- New objects
  - If the object doesn’t exist: deployment continues normally.
  - If it already exists with the correct structure: presync adopts it.
  - If it exists with differences: presync updates Coalesce’s understanding to match.
- Modified objects
  - If it exists as expected: presync compares columns and updates details.
  - If it’s missing: presync creates it.
  - If it was moved or changed type: presync updates Coalesce to reflect the actual state.
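The rules in this phase amount to a small decision table. The sketch below encodes them as a function, under the simplifying assumption that "exists" and "matches what Coalesce expects" can each be reduced to a single boolean. The Action enum and the resolve function are hypothetical names, not part of Coalesce.

```python
from enum import Enum


class Action(Enum):
    DEPLOY_NORMALLY = "deployment continues normally"
    ADOPT = "presync adopts the existing object"
    UPDATE_METADATA = "presync updates Coalesce's stored state to match the platform"
    COMPARE_COLUMNS = "presync compares columns and updates details"
    CREATE = "presync creates the missing object"


def resolve(change, exists, matches_expected):
    """Return the action for one object.

    change: "new" or "modified"
    exists: whether the object is present in the data platform
    matches_expected: whether its location, type, and structure are as expected
    """
    if change == "new":
        if not exists:
            return Action.DEPLOY_NORMALLY
        return Action.ADOPT if matches_expected else Action.UPDATE_METADATA
    if change == "modified":
        if not exists:
            return Action.CREATE
        return Action.COMPARE_COLUMNS if matches_expected else Action.UPDATE_METADATA
    raise ValueError(f"unsupported change type: {change!r}")
```

For example, resolve("new", True, True) returns Action.ADOPT, which corresponds to the case where a new node points at a table that already exists with the correct structure.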
Phase 3: Column Synchronization
Presync reviews detailed table and view structures, including:
- Column names, types, and properties
- Missing columns to be added
- Unexpected columns that are preserved
- Changed column properties to be updated
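As an illustration of the column comparison, the sketch below diffs two column mappings and sorts the result into the three outcomes listed above. It assumes each side can be represented as a {column_name: data_type} dictionary. The diff_columns function is a hypothetical helper, and real column properties would include more than a type.

```python
def diff_columns(expected, actual):
    """Compare {column_name: data_type} mappings for one table or view.

    expected: columns defined in the node being deployed
    actual: columns found in the data platform
    """
    shared = expected.keys() & actual.keys()
    return {
        # Defined in the node but absent from the platform: to be added.
        "add": sorted(expected.keys() - actual.keys()),
        # Present in the platform but not in the node: preserved, not dropped.
        "preserve": sorted(actual.keys() - expected.keys()),
        # Present on both sides with different properties: to be updated.
        "update": sorted(c for c in shared if expected[c] != actual[c]),
    }
```

Note that unexpected columns land in the "preserve" bucket rather than a "drop" bucket, mirroring the behavior described above.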
Platform-Specific Validations
On Databricks, presync validates that Delta Lake tables include required properties for schema evolution. If the property delta.columnMapping.mode=name is missing, presync halts deployment and reports which tables must be fixed.
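To find affected tables before deploying, you could run a check along the following lines. This is a sketch under stated assumptions: table_properties is a dictionary you build yourself (for example from SHOW TBLPROPERTIES output), and both function names are hypothetical. The generated statement uses the standard ALTER TABLE ... SET TBLPROPERTIES syntax. Depending on your Databricks runtime, enabling column mapping may also require upgrading the table's Delta protocol version, so check the Databricks documentation before running it.

```python
REQUIRED_PROPERTY = "delta.columnMapping.mode"
REQUIRED_VALUE = "name"


def tables_missing_column_mapping(table_properties):
    """Return names of tables that lack delta.columnMapping.mode=name.

    table_properties maps a table name to its properties dictionary.
    """
    return sorted(
        table
        for table, props in table_properties.items()
        if props.get(REQUIRED_PROPERTY) != REQUIRED_VALUE
    )


def fix_statement(table):
    """Build the ALTER TABLE statement that sets the property on one table."""
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('{REQUIRED_PROPERTY}' = '{REQUIRED_VALUE}')"
    )
```

Review each generated statement before executing it, since changing the column mapping mode affects which readers and writers can access the table.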
What You’ll See During Deployment
Normal Operation
You’ll see a “Making metadata updates” step when presync runs. In most cases, this step completes quietly and deployment continues.
When Presync Adjusts Objects
If presync detects differences, you’ll see log messages such as:
- Object unexpectedly exists at location: Presync adopted an object it didn’t expect.
- Object missing from expected location: Presync recreated an object.
- Column differences detected: Presync found unexpected column properties.
- Location no longer exists: A database or schema was deleted.
When Presync Stops Deployment
Deployment is halted if:
- Required platform properties are missing
- Structural conflicts can’t be resolved automatically
Example Scenarios
Adopting an Existing Table
You add a new node for SALES_SUMMARY, but the same table already exists with the correct structure. Presync adopts the table instead of recreating it, preserving your data.
Handling a Moved Table
You move CUSTOMER_DATA from staging to prod, but another CUSTOMER_DATA already exists in prod. Presync compares structures and either adopts the existing table or reports conflicts.
Database Cleanup
If a referenced database has been deleted, presync detects the missing location and treats those nodes as new objects, preventing deployment errors.
Benefits of Presync
- Prevents deployment failures caused by off-platform changes
- Preserves data whenever possible
- Provides detailed visibility into external changes
- Reconciles most differences automatically, without manual intervention
- Validates platform requirements before deployment continues
Best Practices
- Review presync logs if a deployment behaves unexpectedly
- Limit off-platform changes when possible
- Remember that presync adopts existing objects rather than overwriting them
- On Databricks, ensure Delta Lake tables include delta.columnMapping.mode=name
When To Investigate Further
Contact your Coalesce administrator if you see:
- Repeated warnings about unexpected object states
- Deployment failures tied to missing platform requirements
- Presync decisions that don’t align with your expectations
Presync is designed to make deployments more reliable by automatically reconciling differences between Coalesce and your data platform.