Quick Set Up
Get Catalog up and running in three steps: sign up, connect your warehouse, and connect your BI tool.
Sign Up
After your Catalog space is set up, sign up using the link below. Your Catalog admin or Coalesce contact will confirm when your space is ready.
Catalog
Connecting Your Warehouse
Catalog connects to your warehouse to extract metadata (table and column names, types, lineage, and so on). For full details and supported technologies, see Warehouses.
Warehouse onboarding requires a dedicated Catalog user in your warehouse, whose credentials you share with Catalog. This account has very limited access: it reads metadata only, never your data.
The dedicated Catalog user can access your metadata (schema, column names, query logs) but cannot read or modify your data.
Step 1: Find the Right Person
Identify who can create a user in your warehouse. If you're not sure, ask: "Who is responsible for onboarding new analysts?" That person typically has the right permissions.
Step 2: Create a User With the Necessary Rights
Have them create a warehouse user with the rights Catalog needs. Technology-specific instructions are in Warehouses.
Step 3: Share Credentials With Catalog
Share the credentials through the Catalog app. Go to Settings > Integrations and add your warehouse using the credentials you created.
What Happens Next
Catalog tests the connection, applies settings, and runs the first sync, which usually takes a couple of hours. You'll be notified when it finishes. After that, Catalog syncs with your warehouse once per day, so table changes appear in Catalog the next day.
Connecting Your BI Tool
Catalog ingests metadata from your BI tools (dashboards, reports, data sources) to power discovery and lineage. For full details and supported technologies, see BI Tools.
BI tool onboarding follows one of two paths:
- Catalog managed: You share admin credentials; Catalog handles extraction and daily syncs. No further action on your part.
- Client managed: You run the castor-extractor package locally to extract metadata and send the output files to Catalog. Catalog never accesses your BI tool directly.
Client-managed integration requires some setup on your side, but Catalog never gets access to your BI tool. Catalog only receives the metadata files you send.
Catalog Managed
Securely share credentials with access to your tool's API. Catalog uses those credentials to extract metadata once per day. During onboarding, you'll be notified when the first sync completes.
Client Managed: During Your Trial
Catalog performs a one-time load of your BI tool metadata. You run the extractor and send the output.
- Find someone in your organization with access to your tool's API.
- Have them run the package locally or duplicate the Colab Notebook to perform extraction.
- Share the output files through Slack or email.
Catalog uploads the files and notifies you when it's done.
Client Managed: After Your Trial
To keep your BI tool metadata in sync, you'll schedule extraction and push the output files to a GCP bucket that Catalog provides. You choose the sync frequency, up to once per day. Catalog provides credentials, Catalog IDs, and Python scripts to push to GCP (in the castor-extractor package).
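For illustration only, the push step could look like the Python sketch below. It assumes the `google-cloud-storage` client library and a service account key file. The function names, the `source_id` argument, and the object naming layout are hypothetical placeholders, not Catalog's actual conventions. The castor-extractor package ships its own scripts for this step, and those should be preferred.

```python
from pathlib import Path


def destination_name(source_id: str, local_file: Path) -> str:
    """Build the object name for one extracted file.

    The "<source_id>/<file name>" layout is a hypothetical example,
    not a documented Catalog convention.
    """
    return f"{source_id}/{local_file.name}"


def push_files(output_dir: str, bucket_name: str, source_id: str, key_file: str) -> list[str]:
    """Upload every regular file in output_dir and return the object names."""
    # Imported lazily so destination_name() works without the dependency.
    from google.cloud import storage  # pip install google-cloud-storage

    client = storage.Client.from_service_account_json(key_file)
    bucket = client.bucket(bucket_name)

    uploaded = []
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        name = destination_name(source_id, path)
        bucket.blob(name).upload_from_filename(str(path))
        uploaded.append(name)
    return uploaded
```

A scheduler of your choice (cron, Airflow, and so on) would run the extraction and then call something like `push_files(...)` with the bucket name and credentials that Catalog gives you.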
Other Integrations
See Upload existing descriptions to load table and column descriptions from a CSV.
dbt
Catalog extracts existing documentation and tags from dbt. If you use dbt but have no documentation yet, you can skip this. Otherwise:
- During trial: Send your dbt manifest through Slack or email.
- After trial: Schedule a push of your manifest to a GCP bucket that Catalog provides.
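Before sending a manifest, it can help to check whether it carries any documentation or tags at all. The sketch below is one way to do that, assuming the default dbt location `target/manifest.json` and the standard `nodes` section, where each node has `description` and `tags` fields. The function names are hypothetical, and the check ignores sources and column-level descriptions.

```python
import json
from pathlib import Path


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Read a dbt manifest from disk (dbt writes it to target/ by default)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_nodes(manifest: dict) -> dict:
    """Count nodes with a non-empty description or at least one tag."""
    nodes = manifest.get("nodes", {})
    documented = sum(
        1 for node in nodes.values() if (node.get("description") or "").strip()
    )
    tagged = sum(1 for node in nodes.values() if node.get("tags"))
    return {"nodes": len(nodes), "documented": documented, "tagged": tagged}


if __name__ == "__main__":
    print(summarize_nodes(load_manifest()))
```

If both `documented` and `tagged` come back as zero, there is probably nothing for Catalog to ingest from dbt yet.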
Slack App
Follow the steps in the Coalesce Catalog app and have your Slack admin approve the integration.
Microsoft Teams App
Activate the connector from the Coalesce Catalog Integrations page and follow the steps. A Microsoft 365 admin is required to grant the necessary permissions.
What If a Technology Isn't Covered?
If your warehouse or BI tool isn't listed as a supported integration, you can use the Warehouse API or Dashboard API. Catalog provides templates for the metadata format it needs; you explore what metadata you can access and format it to match.
If the scope feels large, start by filling in a few examples by hand. That lets you test Catalog's main features before scaling up. Contact the sales team for more details.
Settings
Catalog manages these settings. You can view them or request changes at any time.
Account Settings
Account Domains
This setting controls who can create an account linked to your Catalog space. It's configured based on the email domains of people from your organization who have been in contact with Coalesce. Multiple domains are supported. Contact Coalesce to modify it.
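As a rough mental model of how a domain allowlist behaves, consider the sketch below. It is not Catalog's implementation. The function name and the case-insensitive exact match on the domain are assumptions made for illustration.

```python
def can_sign_up(email: str, allowed_domains: set[str]) -> bool:
    """Return True if the email's domain is in the allowed set.

    Hypothetical helper: matches the part after the last "@" exactly,
    ignoring case. Subdomains are not treated as matches.
    """
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return False  # not a usable address
    return domain.lower() in {d.lower() for d in allowed_domains}
```

With `{"example.com", "example.org"}` as the allowed set, an address at either domain would pass, while one at any other domain would be rejected.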
Sign-In Strategies
By default, everyone with the right domain can sign in with Google SSO, or sign up and sign in directly in the app.
Other strategies: Okta SSO can be set up (outside of trial). See Okta SSO setup instructions.
Limiting strategies: Sign-in can be restricted to a single strategy. Contact Coalesce to change this.
Default User Role
By default, everyone who signs up to Catalog gets the ADMINS role. Other roles are CONTRIBUTOR and VIEWER. Contact Coalesce to change this.
Warehouse Settings
Table Allow or Block
By default, Catalog includes all databases and schemas in your warehouse. The only constraint is that Catalog can handle approximately 300,000 columns. Allowlisting or blocklisting at the database or schema level is possible. Contact Coalesce to configure it.
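To judge whether you need a blocklist, you can total the columns per schema and compare the result with the approximate limit. The sketch below assumes you have already gathered per-schema column counts (for example, from your warehouse's information schema). The function names and the data shape are hypothetical.

```python
from collections.abc import Iterable

COLUMN_LIMIT = 300_000  # approximate figure quoted above

Schema = tuple[str, str]  # (database, schema)


def columns_in_scope(
    column_counts: dict[Schema, int], blocked: Iterable[Schema] = ()
) -> int:
    """Sum column counts for every schema that is not blocked."""
    blocked_set = set(blocked)
    return sum(n for schema, n in column_counts.items() if schema not in blocked_set)


def fits_limit(column_counts: dict[Schema, int], blocked: Iterable[Schema] = ()) -> bool:
    """True if the remaining columns stay within the approximate limit."""
    return columns_in_scope(column_counts, blocked) <= COLUMN_LIMIT


if __name__ == "__main__":
    counts = {
        ("analytics", "core"): 120_000,
        ("analytics", "staging"): 150_000,
        ("raw", "events"): 90_000,
    }
    print(columns_in_scope(counts), fits_limit(counts))
    print(fits_limit(counts, blocked=[("raw", "events")]))
```

Schemas that push the total over the limit and that few people query are natural candidates to propose to Coalesce for blocklisting.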
User Classifications
Catalog classifies accounts into three types: human, BI tool, and other non-human. This affects two features:
- Popularity: Computed from the number of READ queries performed by human and BI tool accounts.
- Read Queries: Only human read queries appear in each table's "queries" tab.
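The two rules above can be sketched as follows. This is a simplified model, not Catalog's code. The `Query` record, the account type labels, and the use of a raw count as the popularity score are all assumptions; Catalog may derive its score differently from the same counts.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    table: str
    account_type: str  # "human", "bi_tool", or "other" (hypothetical labels)
    is_read: bool


def popularity(queries: list[Query]) -> Counter:
    """Count READ queries per table from human and BI tool accounts."""
    counts: Counter = Counter()
    for q in queries:
        if q.is_read and q.account_type in {"human", "bi_tool"}:
            counts[q.table] += 1
    return counts


def queries_tab(queries: list[Query], table: str) -> list[Query]:
    """Return only the human read queries for one table."""
    return [
        q for q in queries
        if q.table == table and q.is_read and q.account_type == "human"
    ]
```

In this model, a read query from a service account labeled "other" contributes to neither popularity nor the "queries" tab, which is why a misclassified account can skew both.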
What's Next?
- Upload existing descriptions if you have table or column docs elsewhere
- Explore Catalog to search, discover lineage, and document your data