Skip to main content

SQL Server

This page explains how to configure a SQL Server integration in Catalog. It focuses on network access, permissions, and validation steps required to allow Catalog to extract metadata in a read-only and secure way.

Prerequisites

To create a SQL Server login and database user, you must connect using an account with one of these permission sets:

  • sysadmin
  • securityadmin at the server level and db_owner on each target database

Permissions Required by the Catalog User

The Catalog integration requires read-only access to SQL Server metadata. The Catalog user must not have write or administrative permissions.

Server-Level Permissions

Grant the following permission to allow the Catalog user to discover databases through sys.databases:

GRANT VIEW ANY DATABASE TO <login_name>;

Database-Level Permissions

Repeat the following steps for each database Catalog should extract metadata from.

First, switch to the target database:

USE <database_name>;
GO

Then grant the required permissions:

  • CONNECT: Allows connection to the database.
  • VIEW DEFINITION: Enables reading schemas, tables, columns, views, and view definitions.
  • db_datareader: Provides read access to metadata from system and information schema views.
GRANT CONNECT TO <user_name>;
GRANT VIEW DEFINITION TO <user_name>;
ALTER ROLE db_datareader ADD MEMBER <user_name>;

Query Store Access for Query Lineage

If Query Store extraction is enabled, grant the following permission to allow access to sys.query_store_* system views:

GRANT VIEW DATABASE STATE TO <user_name>;

Schema Usage for Restricted Environments

If your environment restricts schema access, grant USAGE on required schemas:

GRANT USAGE ON SCHEMA::dbo TO <user_name>;
-- Repeat for other schemas as needed

Example: Create Catalog Login and User

-- Create login
CREATE LOGIN catalog_login WITH PASSWORD = 'StrongPasswordHere!';
GO

-- Create user in a target database
USE <database_name>;
GO
CREATE USER catalog_user FOR LOGIN catalog_login;
GO

-- Grant required permissions
GRANT CONNECT TO catalog_user;
GRANT VIEW DEFINITION TO catalog_user;
ALTER ROLE db_datareader ADD MEMBER catalog_user;
GRANT VIEW DATABASE STATE TO catalog_user; -- optional

How Catalog Discovers Databases

By default, Catalog runs in multi-database discovery mode. It connects to SQL Server with the provided credentials, lists the databases visible to the user, optionally filters them using db_allowed or db_blocked settings, and extracts metadata from each selected database.

Database Discovery Query
SELECT
db.database_id,
database_name = db.name
FROM
sys.databases AS db
WHERE
db.name NOT IN ('master', 'model', 'msdb', 'tempdb', 'DBAdmin')

1. Whitelist Catalog's IP on SQL Server

To allow Catalog to connect to SQL Server, you need to whitelist Catalog's IP address.

Here are the fixed IPs:

SQL Server Connector

Catalog's SQL Server connector executes SQL queries directly against your SQL Server databases without an HTTP proxy. Clients frequently provide Catalog access to a replica or read-only instance.

2. Create Catalog User on SQL Server

First, create the server-level login:

Server-Level Login
CREATE LOGIN <login_name> WITH PASSWORD = '*****';
GO

Repeat for Each Database

Then repeat this sequence for each database you want Catalog to extract:

Database-Level User
USE <database>;
GO

CREATE USER <username> FOR LOGIN <login_name>;
GO

GRANT CONNECT TO <username>;
GRANT VIEW DEFINITION TO <username>;
GO

3. Enable Query Store Extraction - Optional

This step is optional. If you want Catalog to extract SQL query history, you must enable Query Store on all databases that Catalog has access to. Query history is necessary to compute lineage and show table source queries.

Follow the Microsoft documentation on Query Store.

Once Query Store is enabled, you must also give the VIEW DATABASE STATE permission to the user on each database:

Database-Level Permissions
USE <database>;
GO

GRANT VIEW DATABASE STATE TO <username>;

Catalog reads from Query Store system views (sys.query_store_*) to extract SQL queries. In most environments, VIEW DATABASE STATE is sufficient to allow this access.

4. Check Credentials

Before providing credentials to Catalog, validate the setup manually. Connect using the Catalog credentials and run the following queries.

List Databases Visible to the Catalog User
SELECT
db.database_id,
database_name = db.name
FROM
sys.databases AS db
WHERE
db.name NOT IN ('master', 'model', 'msdb', 'tempdb', 'DBAdmin')

Repeat Credentials Check for Each Database

For each database you want Catalog to extract, connect to that database and run the checks below:

USE <database>;
GO

List Schemas Visible to the Catalog User

WITH ids AS (
SELECT DISTINCT
table_catalog,
table_schema
FROM information_schema.tables
WHERE table_catalog = DB_NAME()
)
SELECT
d.database_id,
database_name = i.table_catalog,
schema_name = s.name,
s.schema_id,
schema_owner = u.name,
schema_owner_id = u.uid
FROM sys.schemas AS s
INNER JOIN ids AS i
ON s.name = i.table_schema
LEFT JOIN sys.sysusers AS u
ON s.principal_id = u.uid
LEFT JOIN sys.databases AS d
ON i.table_catalog = d.name;

Check Query Store Access

This step is optional.

SELECT
q.query_id,
qt.query_sql_text AS query_text,
rs.count_executions,
rs.last_duration AS duration,
rs.last_execution_time AS start_time,
DATEADD(
SECOND,
last_duration / 1000000,
DATEADD(MICROSECOND, last_duration % 1000000, rs.last_execution_time)
) AS end_time
FROM
sys.query_store_runtime_stats AS rs
INNER JOIN
sys.query_store_plan p
ON rs.plan_id = p.plan_id
INNER JOIN
sys.query_store_query q
ON p.query_id = q.query_id
INNER JOIN
sys.query_store_query_text qt
ON q.query_text_id = qt.query_text_id;

5. Add New Credentials in Catalog App

Enter the credentials in the Catalog App. You must be a Catalog admin.

Go to Settings > Integrations > SQL Server > Add.

Credentials format:

{
"port": 1234,
"host": "host",
"user": "user",
"password": "*****",
"default_db": "database"
}

The default_db field is optional and only needed in specific cases. See the section below.

Default Database

In some environments, notably Azure SQL, the login may not have access to master. If SQL Server attempts to connect to master by default, you may see:

The server principal "<user>" is not able to access the database "master"
under the current security context

In that case, set default_db to a database that the user can access. This is used to bootstrap the connection and doesn't restrict extraction by itself.

Appendix

This section provides reference information for the SQL Server integration.

Queries Run by Catalog

This section covers the queries Catalog runs on SQL Server. Catalog only extracts metadata such as columns, schemas, tables, users, and view definitions. Optionally, it extracts Query Store query history.

Databases

Databases
SELECT
db.database_id,
database_name = db.name
FROM sys.databases AS db
WHERE db.name NOT IN ('master', 'model', 'msdb', 'tempdb', 'DBAdmin')

Schemas

Schemas
-- Fetch database information
WITH ids AS (
SELECT DISTINCT
table_catalog,
table_schema
FROM {database}.information_schema.tables
)

SELECT
d.database_id,
database_name = i.table_catalog,
schema_name = s.name,
schema_id = CAST(d.database_id AS VARCHAR(10)) + '_' + CAST(s.schema_id AS VARCHAR(10)),
schema_owner = u.name,
schema_owner_id = u.uid
FROM {database}.sys.schemas AS s
INNER JOIN ids AS i
ON s.name = i.table_schema
LEFT JOIN {database}.sys.sysusers AS u
ON s.principal_id = u.uid
LEFT JOIN {database}.sys.databases AS d
ON i.table_catalog = d.name

Tables

Tables
/*
Select all types of tables:
- Views
- Base Tables
- External Tables
*/
WITH extended_tables AS (
SELECT
table_id = object_id,
table_name = name,
table_owner_id = principal_id,
schema_id
FROM
sys.tables

UNION

SELECT
table_id = object_id,
table_name = name,
table_owner_id = principal_id,
schema_id
FROM
sys.views

UNION

SELECT
table_id = object_id,
table_name = name,
table_owner_id = principal_id,
schema_id
FROM
sys.external_tables
),
-- Get the row count per table
partitions AS (
SELECT
object_id,
row_count = SUM(rows)
FROM sys.partitions
GROUP BY object_id
),
-- Append row count to table properties
extended_tables_with_row_count AS (
SELECT
et.*,
row_count
FROM
extended_tables AS et
LEFT JOIN partitions AS p
ON et.table_id = p.object_id

),
-- Generate table identifiers and fetch table description
table_ids AS (
SELECT
table_id,
table_name,
schema_name = ss.name,
schema_id = ss.schema_id,
table_owner_id,
table_owner = u.name,
row_count,
comment = CONVERT(varchar(1024), ep.value)
FROM
extended_tables_with_row_count AS et
LEFT JOIN sys.schemas AS ss
ON et.schema_id = ss.schema_id
LEFT JOIN sys.sysusers AS u
ON et.table_owner_id = u.uid
LEFT JOIN sys.extended_properties AS ep
ON (
et.table_id = ep.major_id
AND ep.minor_id = 0
AND ep.name = 'MS_Description'
)
),

meta AS (
SELECT
database_name = t.table_catalog,
database_id = db.database_id,
schema_name = t.table_schema,
t.table_name,
t.table_type
FROM
information_schema.tables AS t
LEFT JOIN sys.databases AS db
ON t.table_catalog = db.name
WHERE t.table_catalog = db_name()
)

SELECT
m.database_name,
m.database_id,
m.schema_name,
i.schema_id,
m.table_name,
i.table_id,
m.table_type,
i.table_owner,
i.table_owner_id,
i.comment,
tuples = i.row_count
FROM
meta AS m
LEFT JOIN table_ids AS i
ON (m.table_name = i.table_name AND m.schema_name = i.schema_name)

Columns

Columns
/*
Select all types of tables:
- Views
- Base Tables
- External Tables
*/
WITH extended_tables AS (
SELECT
table_id = object_id,
table_name = name,
table_owner_id = principal_id,
schema_id
FROM
sys.tables

UNION

SELECT
table_id = object_id,
table_name = name,
table_owner_id = principal_id,
schema_id
FROM
sys.views

UNION

SELECT
table_id = object_id,
table_name = name,
table_owner_id = principal_id,
schema_id
FROM
sys.external_tables
),

-- Create the column identifiers
column_ids AS (
SELECT
sd.database_id,
database_name = sd.name,
column_id = sc.column_id,
column_name = sc.name,
table_id,
table_name,
schema_name = ss.name,
schema_id = ss.schema_id,
comment = CONVERT(varchar(1024), ep.value)
FROM sys.columns AS sc
LEFT JOIN extended_tables AS et ON sc.object_id = et.table_id
LEFT JOIN sys.schemas AS ss ON et.schema_id = ss.schema_id
LEFT JOIN sys.databases AS sd ON sd.name = DB_NAME()
LEFT JOIN sys.extended_properties AS ep
ON
sc.object_id = ep.major_id
AND sc.column_id = ep.minor_id
AND ep.name = 'MS_Description'
),

columns AS (
SELECT
i.database_name,
i.database_id,
schema_name = c.table_schema,
i.schema_id,
table_name = c.table_name,
i.table_id,
c.column_name,
c.data_type,
c.ordinal_position,
c.column_default,
c.is_nullable,
c.character_maximum_length,
c.character_octet_length,
c.numeric_precision,
c.numeric_precision_radix,
c.numeric_scale,
c.datetime_precision,
i.comment,
column_id = CONCAT(i.table_id, '.', c.column_name)
FROM
information_schema.columns AS c
LEFT JOIN column_ids AS i
ON
(
c.table_name = i.table_name
AND c.table_schema = i.schema_name
AND c.column_name = i.column_name
)
)

SELECT * FROM columns

Users

Users
SELECT
user_name = u.name,
user_id = u.principal_id
FROM sys.database_principals AS u

View DDLs

View DDLs
SELECT
v.name AS view_name,
m.definition AS view_definition,
s.name AS schema_name,
DB_NAME() AS database_name
FROM
sys.views v
INNER JOIN
sys.schemas s
ON v.schema_id = s.schema_id
INNER JOIN
sys.sql_modules m
ON v.object_id = m.object_id

Queries

Queries
SELECT
q.query_id,
qt.query_sql_text as query_text,
rs.count_executions,
rs.last_duration as duration,
rs.last_execution_time as start_time,
'unknown-user' as user_name,
'unknown-user' as user_id,
DATEADD(SECOND, last_duration / 1000000,
DATEADD(MICROSECOND, last_duration % 1000000, rs.last_execution_time)
) AS end_time
FROM
sys.query_store_runtime_stats AS rs
INNER JOIN
sys.query_store_plan p
ON rs.plan_id = p.plan_id
INNER JOIN
sys.query_store_query q
ON p.query_id = q.query_id
INNER JOIN
sys.query_store_query_text qt
ON q.query_text_id = qt.query_text_id
WHERE
CAST(rs.last_execution_time AS DATE) = :day
AND DATEPART(HOUR, rs.last_execution_time) BETWEEN :hour_min AND :hour_max