Tableau
Integrate Tableau with the Catalog to sync dashboards, data sources, and metadata.
Requirements
A warehouse-type integration must already be configured before the first ingestion of this integration can complete.
For all Tableau clients:
- Tableau admin credentials
For Tableau Server clients only:
- Have the Metadata API enabled. View how to enable the Tableau Metadata API.
- Know your API gateway URL. By default, it is the URL you use to access Tableau.
- Use Tableau Server version 2019.3 or later. Older versions won't have lineage to tables available.
For Tableau on-premises, both Catalog managed and client managed work.
If you have MFA enabled for your account, you'll need to use the Tableau personal access token (PAT) option.
Allowlist Catalog IP
Add the following IPs to your allowlist:
- For instances on app.us.castordoc.com: 34.42.92.72
- For instances on app.castordoc.com: 35.246.176.138
Self-hosted repositories must be accessible from our IP over the public internet for Catalog managed integrations.
Catalog Managed
You can upload your credentials directly in the Coalesce App when creating your Tableau integration.
We need a Tableau PAT:
{
"serverUrl": "https://something.tableau.com/",
"siteId": "something",
"tokenName": "catalog",
"token": "abcdefgh"
}
serverUrl: Tableau base URL, your API endpoint, usually your Tableau URL homepage. For example: <https://eu-west-1a.online.tableau.com>
siteId: The Tableau Server site you're authenticating with. For example, in the site URL http://MyServer/#/site/MarketingTeam/projects, the site name is MarketingTeam. To connect with the Default site on the server, set siteId to "Default".
Your first sync can take up to 48 hours; we'll let you know when it's complete.
If you aren't comfortable giving us access to your credentials, continue to Client managed below.
Client Managed
If you prefer to manage extraction yourself, you can run the Catalog extraction package on your own infrastructure.
Doing a One-Shot Extract
For your trial, you can give us a one-shot view of your BI tool.
To get things working quickly, here's a Google Colab to run our package.
Running the Extraction Package
Install the extractor, run it against your Tableau instance, and then upload the results to Catalog.
Install the PyPI Package
pip install castor-extractor[tableau]
For further details, see the castor-extractor PyPI page.
Run the PyPI Package
Once the package has been installed, you can run the following command in your terminal:
castor-extract-tableau [arguments]
The script will run and display logs as follows:
INFO - Logging in using user and password authentication
INFO - Signed into https://eu-west-1a.online.tableau.com as user with id ****
INFO - Extracting USER from Tableau API
INFO - Fetching USER
INFO - Querying all users on site
...
INFO - Wrote output file: /tmp/catalog/1649078755-custom_sql_queries.json
INFO - Wrote output file: /tmp/catalog/1649078755-summary.json
Credentials
You can sign in using the following method:
- Tableau personal access token (PAT)
  - -n, --token-name: Tableau token name
  - -t, --token: Tableau token
Other Arguments
- -b, --server-url: Tableau base URL, your API endpoint, usually your Tableau URL homepage. For example: <https://eu-west-1a.online.tableau.com>
- -i, --site-id: Tableau site ID; leave empty if your site is the default one
- -o, --output: target folder to store the extracted files
You can also get help with --help.
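Putting the arguments above together, a typical invocation might look like the following sketch. The server URL, site ID, token name, and token value are placeholders; substitute your own.

```shell
# Illustrative only - replace every value with your own PAT and server details.
castor-extract-tableau \
  --server-url "https://eu-west-1a.online.tableau.com" \
  --site-id "MarketingTeam" \
  --token-name "catalog" \
  --token "$TABLEAU_PAT" \
  --output /tmp/catalog
```

Keeping the token in an environment variable such as `TABLEAU_PAT` avoids leaving credentials in your shell history.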
Scheduling and Push to Catalog
When moving out of trial, you'll want to refresh your Tableau content in the Catalog on a schedule. The Catalog team will provide you with:
- Catalog Identifier: an ID to match your Tableau files with your Catalog instance, referred to as source_id in the code examples
- Catalog Token: an API token

You can then use the castor-upload command:
castor-upload [arguments]
Arguments
- -k, --token: token provided by Catalog
- -s, --source_id: account ID provided by Catalog
- -t, --file_type: source type to upload. Currently supported are 0
Target Files
To specify the target files, provide one of the following:
- -f, --file_path: to push a single file
- -d, --directory_path: to push several files at once
The tool uploads all files in the given directory. Make sure it contains only the extracted files before pushing.
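As a sketch, an upload of a full extraction directory could look like this. The environment variables are placeholders for the identifier and token the Catalog team gives you, and the file type value is whatever they specify for Tableau.

```shell
# Illustrative only - CATALOG_TOKEN, SOURCE_ID, and FILE_TYPE are
# placeholders for the values provided by the Catalog team.
castor-upload \
  --token "$CATALOG_TOKEN" \
  --source_id "$SOURCE_ID" \
  --file_type "$FILE_TYPE" \
  --directory_path /tmp/catalog
```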
Then you'll have to schedule the script run and the push to the Catalog. Use your preferred scheduler to create this job.
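For example, with cron the two steps could be chained like this. The schedule, paths, and credential variables are illustrative; any scheduler (Airflow, a CI job, etc.) works the same way.

```shell
# Illustrative crontab entries: extract at 02:00, push at 03:00 daily.
# TABLEAU_PAT, CATALOG_TOKEN, SOURCE_ID, and FILE_TYPE are placeholders.
0 2 * * * castor-extract-tableau --server-url "https://eu-west-1a.online.tableau.com" --token-name "catalog" --token "$TABLEAU_PAT" --output /tmp/catalog
0 3 * * * castor-upload --token "$CATALOG_TOKEN" --source_id "$SOURCE_ID" --file_type "$FILE_TYPE" --directory_path /tmp/catalog
```

Leave enough time between the two jobs for the extraction to finish, and remember the upload pushes every file in the directory.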
Troubleshooting
These sections cover common issues with Tableau lineage and extraction.
Lineage Is Missing or Inconsistent After Metadata Refresh
After a Tableau metadata refresh, dashboard metadata and field metadata may update correctly, while lineage is incomplete or inconsistent for assets in the same workbook or site. Metadata extraction and lineage processing happen in separate steps, so they don't always finish at the same time.
If lineage is missing for some assets after a refresh:
- Wait for the next scheduled sync cycle to complete. Lineage processing can lag behind metadata extraction.
- Make sure the Metadata API is enabled on your Tableau Server. If you haven't set it up, follow the steps to enable the Tableau Metadata API.
- Confirm that your Tableau Server version is 2019.3 or later. Older versions don't support table-level lineage.
- Check that the serverUrl and siteId in your credentials match the site where the affected assets are published.
- If the issue persists after a full sync cycle, reach out to your Catalog point of contact with the affected workbook URLs and the time stamp of the last successful sync.
User-Managed Extraction Timeouts
User-managed Tableau extractions can time out when processing a large number of assets or when the Tableau server responds slowly. Symptoms include DAG failures, incomplete extraction logs, or missing metadata in Catalog after a scheduled run.
To resolve extraction timeouts:
- Configure separate extraction runs for different databases or schemas instead of extracting everything in a single run.
- Use --skip-columns and --skip-fields to reduce the extraction payload if you don't need column-level or field-level metadata.
- Stagger extraction schedules so multiple integrations don't run at the same time.
- Check the extractor logs for specific error messages and time stamps.
- Confirm that your PAT is valid and that the Metadata API is healthy on your Tableau Server.
- If timeouts continue, reach out to your Catalog point of contact with the failure time stamps and error messages from the extractor logs.