Databricks

Unity Catalog and Hive

This integration covers Databricks workspaces both with Unity Catalog and without it (Hive).

Per Workspace

Repeat this integration for each workspace you want to integrate with Catalog.

Requirements

  • You must be a Databricks administrator of the workspace you want to integrate.

1. Create a Personal Access Token

Token Requirements

A Personal Access Token is tied to a user account.

Create a Token on an account with Databricks SQL access.

If you already have a generic account with Databricks SQL access, skip ahead to Create the Token.

If you want to create a dedicated user for Catalog, see the Databricks documentation on managing users.

Create the Token

You can create a token as follows:

  1. Open the menu in the top-right corner, which shows the account email address, and select User settings.
  2. In the default Access tokens tab, click Generate new token.
  3. Name your token and leave the Lifetime (days) field empty.
  4. Create a JSON file in the format shown under 5. Credentials.

2. Retrieve Your host

Your host or instance name can be found in your Databricks URL: https://<instance-name>.cloud.databricks.com or https://<instance-name>/.
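As a minimal sketch of this step, the host can be pulled out of a workspace URL copied from the browser using only the Python standard library. The function name extract_host and the sample URL are illustrative assumptions, not part of Catalog or Databricks.

```python
from urllib.parse import urlparse


def extract_host(workspace_url: str) -> str:
    """Return the hostname portion of a Databricks workspace URL."""
    url = workspace_url.strip()
    # urlparse only fills in the network location when a scheme is present,
    # so add one if the user pasted a bare hostname.
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"Could not find a host in {workspace_url!r}")
    return host


# Hypothetical workspace URL, used only for illustration.
print(extract_host("https://my-workspace.cloud.databricks.com/sql/editor"))
```

The returned value is what goes in the "host" field of the credentials: the hostname only, without https:// or any path.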

3. Retrieve Your http_path

Your http_path can be found with the following steps. For more details, see the Databricks JDBC connection documentation.

  1. Log in to your Databricks workspace.
  2. In the sidebar, click Compute.
  3. Choose a cluster to connect to.
  4. Navigate to Advanced Options.
  5. Click on the JDBC/ODBC tab.
  6. Copy the http_path.
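The JDBC/ODBC tab typically also shows a full JDBC URL. If you copied that URL rather than the HTTP path field, the sketch below extracts the path from it. It assumes the URL carries its properties as semicolon-separated key=value pairs with an httpPath key; the function name and the sample URL are hypothetical.

```python
def http_path_from_jdbc_url(jdbc_url: str) -> str:
    """Extract the httpPath property from a Databricks-style JDBC URL."""
    # Assumption: everything after the first ';' is a list of key=value pairs.
    for part in jdbc_url.split(";")[1:]:
        key, sep, value = part.partition("=")
        if sep and key.strip().lower() == "httppath":
            return value.strip()
    raise ValueError("No httpPath property found in the JDBC URL")


# Hypothetical JDBC URL, used only for illustration.
example = (
    "jdbc:databricks://my-workspace.cloud.databricks.com:443/default;"
    "transportMode=http;ssl=1;httpPath=sql/protocolv1/o/xxxxxx;AuthMech=3"
)
print(http_path_from_jdbc_url(example))
```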

4. Enable the system_tables

To enable system tables, follow the Databricks system tables documentation.
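For orientation, system schemas can be enabled through the Unity Catalog REST API with a PUT request per schema. The sketch below only builds such a request and does not send it. The helper name, the metastore ID, and the schema name "access" are illustrative assumptions; check the Databricks documentation for the schemas you need and for the exact endpoint in your environment.

```python
import urllib.request


def build_enable_system_schema_request(
    host: str, token: str, metastore_id: str, schema_name: str
) -> urllib.request.Request:
    """Build (but do not send) a request that enables one system schema."""
    # Endpoint shape as described in the Databricks system tables documentation;
    # treat it as an assumption and verify it before use.
    url = (
        f"https://{host}/api/2.0/unity-catalog/metastores/"
        f"{metastore_id}/systemschemas/{schema_name}"
    )
    return urllib.request.Request(
        url,
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder values, used only for illustration.
req = build_enable_system_schema_request(
    "my-workspace.cloud.databricks.com", "<your-token>", "<metastore-id>", "access"
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually send it. That call requires a token with sufficient privileges on the metastore.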

5. Credentials

Your credentials should look like the following:

{
  "token": "<your-token>",
  "host": "<your-server-hostname>",
  "http_path": "sql/protocolv1/o/xxxxxx"
}
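The following sketch shows one way to assemble this JSON from the three values collected in the previous steps, with a few basic sanity checks before you paste it. The function name build_credentials and the checks are assumptions for illustration; Catalog may validate the credentials differently.

```python
import json

REQUIRED_KEYS = ("token", "host", "http_path")


def build_credentials(token: str, host: str, http_path: str) -> str:
    """Return the credentials as a JSON string with the three expected keys."""
    creds = {"token": token, "host": host, "http_path": http_path}
    # Reject empty or whitespace-only values.
    missing = [key for key in REQUIRED_KEYS if not creds[key].strip()]
    if missing:
        raise ValueError(f"Missing values for: {', '.join(missing)}")
    # Assumption: host is the bare instance name, without a URL scheme.
    if "://" in host:
        raise ValueError("host should be a hostname only, without https://")
    return json.dumps(creds, indent=2)


# Placeholder values, used only for illustration.
print(build_credentials("<your-token>", "<your-server-hostname>", "sql/protocolv1/o/xxxxxx"))
```

Writing the returned string to a file gives you the JSON file mentioned in step 1. Keep it out of version control, since it contains the token.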

Add User Connection Info Into Catalog

Paste the contents of your JSON file on your integration page:

  1. Add Databricks.
  2. Pick Catalog Managed and name your integration (useful if you have several).
  3. Paste your JSON in the Credentials window.

Your first sync can take up to 48 hours. We will let you know when it is complete.