Redshift

Prerequisites

warning

Follow installation instructions here

We strongly advise creating a dedicated user for metadata extraction.

You can follow these instructions to create the catalog user.

danger

This client connects to Redshift using sslmode=verify-ca, which means your certificates must be up-to-date. More information here

Run extraction script

Once the package has been installed, you should be able to run the following command in your terminal:

castor-extract-redshift [arguments]

The script will run and display logs as follows:

INFO - Extracting `DATABASE` ...
INFO - Results stored to /tmp/catalog/1649083626-database.csv


...

INFO - Extracting `USER` ...
INFO - Results stored to /tmp/catalog/1649083626-user.csv
INFO - Wrote output file: /tmp/catalog/1649083626-summary.json
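
Judging from the log lines above, every file from one run shares the same numeric prefix. The sketch below lists the files of the most recent run. It assumes the prefix is a number that grows with each run (it looks like a Unix timestamp, but that is an assumption), and `latest_run_files` is a hypothetical helper name, not part of the package.

```shell
#!/bin/sh
# Hypothetical helper: list the files written by the most recent extraction run.
# Assumption: files are named "<numeric prefix>-<name>.<ext>" and each run
# ends by writing "<numeric prefix>-summary.json", as in the sample logs.
latest_run_files() {
  dir="$1"
  latest=""
  for file in "$dir"/*-summary.json; do
    [ -e "$file" ] || continue          # no summary file found: skip the literal glob
    prefix=${file##*/}                  # strip the directory part
    prefix=${prefix%-summary.json}      # keep only the numeric prefix
    if [ -z "$latest" ] || [ "$prefix" -gt "$latest" ]; then
      latest="$prefix"
    fi
  done
  [ -n "$latest" ] || return 1          # nothing extracted yet
  ls "$dir/$latest"-*
}
```

For example, `latest_run_files /tmp/catalog` prints the CSV files and the summary of the latest run, or returns a non-zero status if the folder holds no summary file.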

Credentials

  • -H, --host: hostname
  • -P, --port: port number
  • -d, --database: database name
  • -u, --user: user
  • -p, --password: password

Other arguments

  • -o, --output: target folder to store the extracted files

Optional arguments

  • --skip-existing: Skip files already extracted instead of replacing them
  • --serverless: Enables extraction for Redshift Serverless

info

You can also display the full list of arguments with --help
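
As an illustration, a full invocation combining the arguments listed above could look like the sketch below. The host name, database, user and output folder are placeholder values, and `REDSHIFT_EXTRACT_PASSWORD` and `run_extraction` are hypothetical names used here only to keep the password out of the command line history.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder to replace with your own.
# REDSHIFT_EXTRACT_PASSWORD is a hypothetical variable holding the password.
run_extraction() {
  castor-extract-redshift \
    --host "redshift-cluster.example.com" \
    --port 5439 \
    --database "db_name" \
    --user "extraction_user" \
    --password "${REDSHIFT_EXTRACT_PASSWORD:?set REDSHIFT_EXTRACT_PASSWORD first}" \
    --output "/tmp/catalog" \
    --skip-existing
}
```

Calling `run_extraction` then runs the extraction. For Redshift Serverless, add `--serverless` to the argument list.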

Use ENV variables

If you don't want to specify arguments every time, you can set the following environment variables in your .bashrc:

export CASTOR_REDSHIFT_HOST=127.0.0.0
export CASTOR_REDSHIFT_PORT=5439
export CASTOR_REDSHIFT_DATABASE=db_name
export CASTOR_REDSHIFT_USER=extraction_user
export CASTOR_REDSHIFT_PASSWORD=******

# Optional: enable extraction for Redshift Serverless
export CASTOR_REDSHIFT_SERVERLESS=TRUE

export CASTOR_OUTPUT_DIRECTORY="/tmp/catalog"

Then the script can be executed without any arguments:

castor-extract-redshift
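
Before running without arguments, it can help to confirm that the variables are actually visible to the shell. The following is a minimal sketch of such a check. `missing_castor_vars` is a hypothetical helper name, and treating all six variables as required is an assumption.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of each variable that is
# unset or empty. The names are the ones listed in this section.
missing_castor_vars() {
  for name in CASTOR_REDSHIFT_HOST CASTOR_REDSHIFT_PORT CASTOR_REDSHIFT_DATABASE \
              CASTOR_REDSHIFT_USER CASTOR_REDSHIFT_PASSWORD CASTOR_OUTPUT_DIRECTORY; do
    # Look up the variable whose name is stored in $name (names are fixed above,
    # so eval never sees untrusted input).
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

If `missing_castor_vars` prints nothing, all six variables are set in the current shell. Any name it prints still needs to be set or passed as an argument.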

It can also be executed with partial arguments (the script looks in your ENV as a fallback):

castor-extract-redshift --output /tmp/catalog
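
Partial arguments are convenient when only one setting changes between runs. The sketch below extracts several databases in turn, passing only the database name and a per-database output folder. It assumes host, port, user and password come from the environment variables above. `extract_databases`, the base folder and the choice to create each folder beforehand are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: run one extraction per database name given as an argument.
# Assumption: the remaining credentials are supplied through CASTOR_* variables.
extract_databases() {
  for db in "$@"; do
    out_dir="/tmp/catalog/$db"
    mkdir -p "$out_dir"   # create the folder up front in case the tool expects it to exist
    castor-extract-redshift --database "$db" --output "$out_dir" || return 1
  done
}
```

For example, `extract_databases sales marketing` would run two extractions and stop at the first one that fails.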