Tableau
Integrate Tableau with the Catalog to sync dashboards, data sources, and metadata.
Requirements
A warehouse-type integration must already be configured before the first ingestion of this integration can complete.
For all Tableau clients:
- Tableau admin credentials
For Tableau Server clients only:
- Have the Metadata API enabled. View how to enable the Tableau Metadata API.
- Know your API gateway URL. By default, it is the URL you use to access Tableau.
- Use Tableau Server version 2019.3 or later. Older versions won't have lineage to tables available.
For Tableau on-premises, both Catalog managed and client managed work.
If you have MFA enabled for your account, you'll need to use the Tableau personal access token (PAT) option.
Allowlist Catalog IP
Add the following IPs to your allowlist:
- For instances on app.us.castordoc.com: 34.42.92.72
- For instances on app.castordoc.com: 35.246.176.138
Self-hosted repositories must be accessible from our IP over the public internet for Catalog managed integrations.
Catalog Managed
You can upload your credentials directly in the Coalesce App when creating your Tableau integration.
We need a Tableau PAT:
{
"serverUrl": "https://something.tableau.com/",
"siteId": "something",
"tokenName": "catalog",
"token": "abcdefgh"
}
serverUrl: Tableau base URL, your API endpoint, usually your Tableau URL homepage. For example: <https://eu-west-1a.online.tableau.com>
siteId: The Tableau Server site you're authenticating with. For example, in the site URL http://MyServer/#/site/MarketingTeam/projects, the site name is MarketingTeam. To connect with the Default site on the server, set siteId to "Default".
Your first sync can take up to 48 hours; we'll let you know when it's complete.
If you aren't comfortable giving us access to your credentials, continue to Client managed below.
Client Managed
If you prefer to manage extraction yourself, you can run the Catalog extraction package on your own infrastructure.
Doing a One-Shot Extract
For your trial, you can give us a one-shot view of your BI tool.
To get things working quickly, here's a Google Colab to run our package.
Running the Extraction Package
Install the extractor, run it against your Tableau instance, and then upload the results to Catalog.
Install the PyPI Package
pip install castor-extractor[tableau]
For further details, see the castor-extractor PyPI page.
Run the PyPI Package
Once the package has been installed, you can run the following command in your terminal:
castor-extract-tableau [arguments]
The script will run and display logs as follows:
INFO - Logging in using user and password authentication
INFO - Signed into https://eu-west-1a.online.tableau.com as user with id ****
INFO - Extracting USER from Tableau API
INFO - Fetching USER
INFO - Querying all users on site
...
INFO - Wrote output file: /tmp/catalog/1649078755-custom_sql_queries.json
INFO - Wrote output file: /tmp/catalog/1649078755-summary.json
Credentials
You can sign in using the following method:
- Tableau personal access token (PAT)
  - -n, --token-name: Tableau token name
  - -t, --token: Tableau token
Other Arguments
- -b, --server-url: Tableau base URL, your API endpoint, usually your Tableau URL homepage. For example: <https://eu-west-1a.online.tableau.com>
- -i, --site-id: Tableau site ID; leave empty if your site is the default one
- -o, --output: target folder to store the extracted files
You can also get help with --help.
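Putting the arguments above together, a typical invocation might look like the following sketch. The server URL, site ID, token name, and token value are placeholders; substitute your own.

```shell
# Illustrative only - replace every value with your own PAT and server details.
castor-extract-tableau \
  --server-url "https://eu-west-1a.online.tableau.com" \
  --site-id "MarketingTeam" \
  --token-name "catalog" \
  --token "$TABLEAU_PAT" \
  --output /tmp/catalog
```

Keeping the token in an environment variable such as `TABLEAU_PAT` avoids leaving credentials in your shell history.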
Scheduling and Push to Catalog
When moving out of trial, you'll want to refresh your Tableau content in the Catalog on a schedule. The Catalog team will provide you with:
- Catalog Identifier: an ID to match your Tableau files with your Catalog instance, referred to as source_id in the code examples
- Catalog Token: an API token

You can then use the castor-upload command:
castor-upload [arguments]
Arguments
- -k, --token: token provided by Catalog
- -s, --source_id: account ID provided by Catalog
- -t, --file_type: source type to upload. Currently supported are 0
Target Files
To specify the target files, provide one of the following:
- -f, --file_path: to push a single file
- -d, --directory_path: to push several files at once
The tool uploads all files in the given directory. Make sure it contains only the extracted files before pushing.
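As a sketch, an upload of a full extraction directory could look like this. The environment variables are placeholders for the identifier and token the Catalog team gives you, and the file type value is whatever they specify for Tableau.

```shell
# Illustrative only - CATALOG_TOKEN, SOURCE_ID, and FILE_TYPE are
# placeholders for the values provided by the Catalog team.
castor-upload \
  --token "$CATALOG_TOKEN" \
  --source_id "$SOURCE_ID" \
  --file_type "$FILE_TYPE" \
  --directory_path /tmp/catalog
```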
Then you'll have to schedule the script run and the push to the Catalog. Use your preferred scheduler to create this job.
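For example, with cron the two steps could be chained like this. The schedule, paths, and credential variables are illustrative; any scheduler (Airflow, a CI job, etc.) works the same way.

```shell
# Illustrative crontab entries: extract at 02:00, push at 03:00 daily.
# TABLEAU_PAT, CATALOG_TOKEN, SOURCE_ID, and FILE_TYPE are placeholders.
0 2 * * * castor-extract-tableau --server-url "https://eu-west-1a.online.tableau.com" --token-name "catalog" --token "$TABLEAU_PAT" --output /tmp/catalog
0 3 * * * castor-upload --token "$CATALOG_TOKEN" --source_id "$SOURCE_ID" --file_type "$FILE_TYPE" --directory_path /tmp/catalog
```

Leave enough time between the two jobs for the extraction to finish, and remember the upload pushes every file in the directory.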
Troubleshooting
These sections cover common issues with Tableau lineage and extraction.
Lineage Is Missing or Inconsistent After Metadata Refresh
After a Tableau metadata refresh, dashboard metadata and field metadata may update correctly, while lineage is incomplete or inconsistent for assets in the same workbook or site. Metadata extraction and lineage processing happen in separate steps, so they don't always finish at the same time.
If lineage is missing for some assets after a refresh:
- Wait for the next scheduled sync cycle to complete. Lineage processing can lag behind metadata extraction.
- Make sure the Metadata API is enabled on your Tableau Server. If you haven't set it up, follow the steps to enable the Tableau Metadata API.
- Confirm that your Tableau Server version is 2019.3 or later. Older versions don't support table-level lineage.
- Check that the serverUrl and siteId in your credentials match the site where the affected assets are published.
- If the issue persists after a full sync cycle, reach out to your Catalog point of contact with the affected workbook URLs and the time stamp of the last successful sync.
User-Managed Extraction Timeouts
User-managed Tableau extractions can time out when processing a large number of assets or when the Tableau server responds slowly. Symptoms include DAG failures, incomplete extraction logs, or missing metadata in Catalog after a scheduled run.
To resolve extraction timeouts:
- Configure separate extraction runs for different databases or schemas instead of extracting everything in a single run.
- Use --skip-columns and --skip-fields to reduce the extraction payload if you don't need column-level or field-level metadata.
- Stagger extraction schedules so multiple integrations don't run at the same time.
- Check the extractor logs for specific error messages and time stamps.
- Confirm that your PAT is valid and that the Metadata API is healthy on your Tableau Server.
- If timeouts continue, reach out to your Catalog point of contact with the failure time stamps and error messages from the extractor logs.