Skip to main content

SQL Server

Extract metadata from your SQL Server databases using the Coalesce extraction script.

Prerequisites

Before you begin, complete the following:

Install Castor Extract

Follow the installation instructions on PyPi.

Create a User

Create a dedicated user to extract your metadata.

To create a SQL Server login and database user, you must connect using an account with one of these permission sets:

  • sysadmin
  • securityadmin at the server level and db_owner on each target database

Permissions Required by the Catalog User

The Catalog integration requires read-only access to SQL Server metadata. The Catalog user must not have write or administrative permissions.

Server-Level Permissions

Grant the following permission to allow the Catalog user to discover databases through sys.databases:

GRANT VIEW ANY DATABASE TO <login_name>;

Database-Level Permissions

Repeat the following steps for each database Catalog should extract metadata from.

First, switch to the target database:

USE <database_name>;
GO

Then grant the required permissions:

  • CONNECT: Allows connection to the database.
  • VIEW DEFINITION: Enables reading schemas, tables, columns, views, and view definitions.
  • db_datareader: Provides read access to metadata from system and information schema views.
GRANT CONNECT TO <user_name>;
GRANT VIEW DEFINITION TO <user_name>;
ALTER ROLE db_datareader ADD MEMBER <user_name>;

Query Store Access for Query Lineage

If Query Store extraction is enabled, grant the following permission to allow access to sys.query_store_* system views:

GRANT VIEW DATABASE STATE TO <user_name>;

Schema Usage for Restricted Environments

If your environment restricts schema access, grant USAGE on required schemas:

GRANT USAGE ON SCHEMA::dbo TO <user_name>;
-- Repeat for other schemas as needed

Example: Create Catalog Login and User

-- Create login
CREATE LOGIN catalog_login WITH PASSWORD = 'StrongPasswordHere!';
GO

-- Create user in a target database
USE <database_name>;
GO
CREATE USER catalog_user FOR LOGIN catalog_login;
GO

-- Grant required permissions
GRANT CONNECT TO catalog_user;
GRANT VIEW DEFINITION TO catalog_user;
ALTER ROLE db_datareader ADD MEMBER catalog_user;
GRANT VIEW DATABASE STATE TO catalog_user; -- optional

Run the Extraction Script

After installing the package, run the following command in your terminal:

castor-extract-sqlserver [arguments]

The script displays logs as it runs:

INFO - Extracting `DATABASE` ...
INFO - Extracting database: query 1/1
INFO - Results stored to /tmp/sqlserver/1706004977-database.csv
...
INFO - Extracting `USER` ...
INFO - Extracting user: query 1/1
INFO - Results stored to /tmp/sqlserver/1706004977-user.csv
INFO - Wrote output file: /tmp/sqlserver/1706004977-summary.json

Arguments

Credentials

Use these arguments to authenticate with your SQL Server instance:

  • -H, --host: MSSQL host address.
  • -P, --port: MSSQL port number.
  • -u, --user: MSSQL username.
  • -p, --password: MSSQL password.

Output

  • -o, --output: Target folder to store the extracted files.

Optional Arguments

  • --skip-existing: Skip files already extracted instead of replacing them.
  • --db-allowed: List of databases that should be extracted.
  • --db-blocked: List of databases that should not be extracted.

The values for --db-allowed and --db-blocked are case-sensitive. Make sure to use the exact values present in column sys.databases.name for the respective databases.

Default Database

  • --default-db: Default database to use when connecting to SQL Server.

This argument is optional and usually not required. The connector automatically discovers and extracts all databases the user has access to.

It becomes necessary in some environments, notably Azure SQL or restricted SQL Server setups, where the user doesn't have access to the master database. In those cases, SQL Server may fail during connection with the following error:

The server principal "user" is not able to access the database "master" under the current security context. (DB-Lib error message 20018)

When this happens, SQL Server attempts to connect to master by default. Setting --default-db forces the connector to connect to a database the user can access, preventing the error.

Use Environment Variables

If you don't want to specify arguments every time, set the following environment variables in your .bashrc:

export CASTOR_SQLSERVER_HOST=127.0.0.0
export CASTOR_SQLSERVER_PORT=3306
export CASTOR_SQLSERVER_USER=extraction_user
export CASTOR_SQLSERVER_PASSWORD=******
export CASTOR_SQLSERVER_DEFAULT_DB=database # optional
export CASTOR_OUTPUT_DIRECTORY="/tmp/catalog"

Then run the script without any arguments:

castor-extract-sqlserver

You can also run the script with partial arguments. The script looks in your environment variables as a fallback:

castor-extract-sqlserver --output /tmp/catalog