Sigma
Connect Catalog to Sigma to sync workbooks, data sets, Data Models, and related metadata. You can use Catalog-managed credentials or run the extraction package yourself. Sigma's APIs control which objects and lineage details Catalog can return, so some graphs look complete while others stay object-level only.
Requirements
Before you connect Sigma to Catalog, confirm that you meet the Sigma access requirements and warehouse prerequisites in this section.
Configure a warehouse integration before you connect Sigma. The first Sigma ingestion cannot complete without one.
You also need the following:
- You must be an Org Admin to create a Sigma API token and Client ID.
- Anyone with access to an active Sigma API token and Client ID can authenticate with Sigma's API at the access level associated with the token.
Catalog-Managed
To get started with Sigma in Catalog, you need to be a Sigma admin and provide:
- A Sigma API token from Sigma. Treat it like any other secret credential.
- Your Client ID
- The Host URL of your instance
For more information on how to retrieve the API token and Client ID, see Get an API Token and Client ID.
To get the correct Host URL of your instance, see Identify your API request URL.
Input your credentials directly into your Catalog account in the following format:
{
"apiToken": "*****",
"clientId": "*****",
"host": "https://<your_provider_url>.sigmacomputing.com"
}
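Before pasting the credentials, it can help to check that the JSON is well formed and contains the three expected keys. The following is a minimal sketch, not part of Catalog or Sigma: the function name `check_sigma_credentials` is hypothetical, and the rule that the host must be a full `https://` URL is an assumption based on the example format above.

```python
import json
from urllib.parse import urlparse

# Keys taken from the credential format shown above.
REQUIRED_KEYS = ("apiToken", "clientId", "host")


def check_sigma_credentials(raw: str) -> list[str]:
    """Return a list of problems found in the credentials JSON (empty if none).

    Hypothetical helper for illustration only; Catalog performs its own checks.
    """
    try:
        creds = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(creds, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for key in REQUIRED_KEYS:
        value = creds.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty key: {key}")

    # Assumption: the host is a full https:// URL, as in the example format.
    host = creds.get("host")
    if isinstance(host, str) and host.strip():
        parsed = urlparse(host)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("host must be a full https:// URL")
    return problems
```

An empty list means the block has the expected shape. It does not confirm that the token or Client ID is valid in Sigma.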
For further details on the Sigma API, see Get Started with Sigma's API.
The first sync can take up to 48 hours. We will let you know when it is complete.
If you are not comfortable giving us access to your credentials, continue to Client-Managed.
Client-Managed
Doing a One-Time Extract
For your trial, you can give us a one-time view of your BI tool.
To get things working quickly, use this Google Colab to run the package.
Running the Extraction Package
Install the PyPI Package
For further details on the Catalog Extractor PyPI package, see castor-extractor on PyPI.
Running the PyPI Package
Once the package has been installed, run the following command in your terminal:
castor-extract-sigma [arguments]
The script will run and display logs similar to:
INFO - Extracting users from Sigma API
INFO - POST(https://cloud.sigma.com/api/4.0/login)
INFO - GET(https://cloud.sigma.com/api/4.0/users/search)
INFO - Fetched page 1 / 7 results
INFO - GET(https://catalog.cloud.sigma.com/api/4.0/users/search)
INFO - Fetched page 2 / 0 results
...
INFO - Wrote output file: /tmp/catalog/1649079699-projects.json
INFO - Wrote output file: /tmp/catalog/1649079699-summary.json
Extraction Arguments
-H, --host: Sigma host
-c, --client-id: Sigma client ID
-a, --api-token: Generated API key
-o, --output: Directory to write to
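If you drive the extraction from a Python script, the arguments above can be assembled as follows. This is a sketch under assumptions: the helper names `build_extract_command` and `run_extract` are hypothetical, and only the command name and the four long flags come from this page.

```python
import subprocess


def build_extract_command(
    host: str, client_id: str, api_token: str, output_dir: str
) -> list[str]:
    """Assemble the castor-extract-sigma call from the documented flags.

    Hypothetical helper; it only builds the argument list and runs nothing.
    """
    return [
        "castor-extract-sigma",
        "--host", host,
        "--client-id", client_id,
        "--api-token", api_token,
        "--output", output_dir,
    ]


def run_extract(host: str, client_id: str, api_token: str, output_dir: str) -> None:
    """Run the extraction and raise CalledProcessError if the command fails."""
    command = build_extract_command(host, client_id, api_token, output_dir)
    # Passing a list (no shell=True) keeps the token out of shell parsing.
    subprocess.run(command, check=True)
```

Load the API token from a secret store or environment variable rather than hard-coding it in the script.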
Scheduling and Push to Catalog
After your trial, you will want to refresh your Sigma content in Catalog on a regular schedule. To set this up, the Catalog team will provide you with:
- A Catalog Identifier we use to match your Sigma files with your Catalog instance.
- A Catalog Token, an API token you use when uploading to Catalog.
You can then use the castor-upload command:
castor-upload [arguments]
Upload Arguments
-k, --token: Token provided by Catalog
-s, --source_id: Account ID provided by Catalog
-t, --file_type: Source type to upload. Currently supported: DBT, VIZ, or WAREHOUSE
Target Files
To specify the target files, provide exactly one of the following:
-f, --file_path: To push a single file
or
-d, --directory_path: To push several files at once
The tool will upload every file included in the given directory. Make sure it contains only the extracted files before pushing.
Then schedule the script run and the push to Catalog using your preferred scheduler.
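The upload rules described above (a supported file type, and exactly one of a file path or a directory path) can be checked before the command runs. This is a minimal sketch: `build_upload_command` is a hypothetical helper, and only the command name, the flags, and the supported file types come from this page. Which file type applies to Sigma is not stated here, so confirm it with the Catalog team.

```python
from typing import Optional

# File types listed under Upload Arguments.
SUPPORTED_FILE_TYPES = {"DBT", "VIZ", "WAREHOUSE"}


def build_upload_command(
    token: str,
    source_id: str,
    file_type: str,
    file_path: Optional[str] = None,
    directory_path: Optional[str] = None,
) -> list[str]:
    """Assemble the castor-upload call, enforcing the documented constraints.

    Hypothetical helper; it only builds the argument list and runs nothing.
    """
    if file_type not in SUPPORTED_FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    # Exactly one target must be given: a single file or a directory.
    if (file_path is None) == (directory_path is None):
        raise ValueError("provide exactly one of file_path or directory_path")

    command = [
        "castor-upload",
        "--token", token,
        "--source_id", source_id,
        "--file_type", file_type,
    ]
    if file_path is not None:
        command += ["--file_path", file_path]
    else:
        command += ["--directory_path", directory_path]
    return command
```

A scheduled job could pass the resulting list to `subprocess.run(command, check=True)` right after the extraction step, so that a failed extraction stops the upload.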
Known Limitations
Catalog reads Sigma metadata and lineage through Sigma's public APIs. What you see in Catalog therefore depends on what those APIs expose, their rate limits, and how Sigma models dashboards, fields, and dependencies for your tenant.
Use this section to interpret gaps that are tied to Sigma's API rather than Catalog configuration alone:
- Dashboard coverage - Some dashboards or nested dashboard content might not extract with the same depth as Sigma workbooks and data sets you see elsewhere in Catalog.
- Field-level lineage - Column-to-field lineage between warehouse columns and Sigma fields can be incomplete or absent when Sigma does not expose the relationships Catalog needs at field granularity. Asset-level lineage is still the most reliable view for many tenants. It reflects which Sigma objects connect to which warehouse tables or other Sigma assets.
- API limits - Aggressive extraction schedules or very large Sigma estates can encounter throttling responses from Sigma. Catalog retries within normal extraction runs, but a sync can take longer or need another run before every object appears.
If extraction finishes successfully and credentials are correct, missing pieces in lineage or dashboards are often consistent with Sigma's API surface rather than an error in Catalog. For how asset-level lineage differs from field-level lineage inside Catalog, see Lineage.
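To illustrate why throttling under API limits lengthens a sync rather than failing it outright, here is a generic retry loop with exponential backoff. It is a sketch of the general technique, not Catalog's actual retry logic: the `Throttled` exception, the `call_with_backoff` function, and the default attempt count and delay are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Throttled(Exception):
    """Hypothetical error standing in for a throttling response from an API."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), waiting 1x, 2x, 4x... base_delay between throttled tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except Throttled:
            # Give up after the last allowed attempt and surface the error.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Each throttled call adds waiting time, so a large Sigma estate can take noticeably longer to sync, and a run that exhausts its attempts leaves some objects for the next run.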
Sigma Data Models
Data Models are a Sigma asset type in Sigma's semantic modeling layer that lets teams reuse and govern modeled data. Catalog's Sigma connector treats Data Models as first-class Sigma assets alongside workbooks and data sets during extraction.
You'll see Data Models reflected in Catalog's Sigma visualization content the same way as other synced Sigma assets, including when you search Catalog or inspect lineage graphs that involve those models.
Setup stays the same: configure your warehouse integration, supply a valid Sigma API token, Client ID, and Host URL using Catalog-managed or client-managed extraction, then run sync on your usual cadence. You do not use a separate integration or credential block just for Data Models.
For lineage specifically:
- Expect asset-level relationships wherever Sigma's APIs describe how a Data Model connects to upstream data sets, modeled tables, or warehouse objects Catalog already knows about through your integrations.
- Field-level lineage builds only from details Sigma publishes for each metric and dimension. Dimensions and metrics on a Data Model can still lack column-to-field edges when those details are unavailable, consistent with Known Limitations. Catalog records what Sigma returns.
Together, workbook, data set, and Data Model metadata give you governance and reuse context inside Catalog even when finer-grained field edges are sparse.
What's Next?
- Browse other integrations on the BI Tools overview.
- Learn how Catalog represents asset-level versus field-level lineage in Lineage.