SQL Server
Extract metadata from your SQL Server databases using the Coalesce extraction script.
Prerequisites
Before you begin, complete the following:
Install Castor Extract
Follow the installation instructions on PyPi.
Create a User
Create a dedicated user to extract your metadata.
To create a SQL Server login and database user, you must connect using an account with one of these permission sets:
sysadminsecurityadminat the server level anddb_owneron each target database
Permissions Required by the Catalog User
The Catalog integration requires read-only access to SQL Server metadata. The Catalog user must not have write or administrative permissions.
Server-Level Permissions
Grant the following permission to allow the Catalog user to discover databases through sys.databases:
GRANT VIEW ANY DATABASE TO <login_name>;
Database-Level Permissions
Repeat the following steps for each database Catalog should extract metadata from.
First, switch to the target database:
USE <database_name>;
GO
Then grant the required permissions:
- CONNECT: Allows connection to the database.
- VIEW DEFINITION: Enables reading schemas, tables, columns, views, and view definitions.
db_datareader: Provides read access to metadata from system and information schema views.
GRANT CONNECT TO <user_name>;
GRANT VIEW DEFINITION TO <user_name>;
ALTER ROLE db_datareader ADD MEMBER <user_name>;
Query Store Access for Query Lineage
If Query Store extraction is enabled, grant the following permission to allow access to sys.query_store_* system views:
GRANT VIEW DATABASE STATE TO <user_name>;
Schema Usage for Restricted Environments
If your environment restricts schema access, grant USAGE on required schemas:
GRANT USAGE ON SCHEMA::dbo TO <user_name>;
-- Repeat for other schemas as needed
Example: Create Catalog Login and User
-- Create login
CREATE LOGIN catalog_login WITH PASSWORD = 'StrongPasswordHere!';
GO
-- Create user in a target database
USE <database_name>;
GO
CREATE USER catalog_user FOR LOGIN catalog_login;
GO
-- Grant required permissions
GRANT CONNECT TO catalog_user;
GRANT VIEW DEFINITION TO catalog_user;
ALTER ROLE db_datareader ADD MEMBER catalog_user;
GRANT VIEW DATABASE STATE TO catalog_user; -- optional
Run the Extraction Script
After installing the package, run the following command in your terminal:
castor-extract-sqlserver [arguments]
The script displays logs as it runs:
INFO - Extracting `DATABASE` ...
INFO - Extracting database: query 1/1
INFO - Results stored to /tmp/sqlserver/1706004977-database.csv
...
INFO - Extracting `USER` ...
INFO - Extracting user: query 1/1
INFO - Results stored to /tmp/sqlserver/1706004977-user.csv
INFO - Wrote output file: /tmp/sqlserver/1706004977-summary.json
Arguments
Credentials
Use these arguments to authenticate with your SQL Server instance:
- -H, --host: MSSQL host address.
- -P, --port: MSSQL port number.
- -u, --user: MSSQL username.
- -p, --password: MSSQL password.
Output
- -o, --output: Target folder to store the extracted files.
Optional Arguments
- --skip-existing: Skip files already extracted instead of replacing them.
- --db-allowed: List of databases that should be extracted.
- --db-blocked: List of databases that should not be extracted.
The values for --db-allowed and --db-blocked are case-sensitive. Make sure to use the exact values present in column sys.databases.name for the respective databases.
Default Database
- --default-db: Default database to use when connecting to SQL Server.
This argument is optional and usually not required. The connector automatically discovers and extracts all databases the user has access to.
It becomes necessary in some environments, notably Azure SQL or restricted SQL Server setups, where the user doesn't have access to the master database. In those cases, SQL Server may fail during connection with the following error:
The server principal "user" is not able to access the database "master" under the current security context. (DB-Lib error message 20018)
When this happens, SQL Server attempts to connect to master by default. Setting --default-db forces the connector to connect to a database the user can access, preventing the error.
Use Environment Variables
If you don't want to specify arguments every time, set the following environment variables in your .bashrc:
export CASTOR_SQLSERVER_HOST=127.0.0.0
export CASTOR_SQLSERVER_PORT=3306
export CASTOR_SQLSERVER_USER=extraction_user
export CASTOR_SQLSERVER_PASSWORD=******
export CASTOR_SQLSERVER_DEFAULT_DB=database # optional
export CASTOR_OUTPUT_DIRECTORY="/tmp/catalog"
Then run the script without any arguments:
castor-extract-sqlserver
You can also run the script with partial arguments. The script looks in your environment variables as a fallback:
castor-extract-sqlserver --output /tmp/catalog