
Amazon Athena and Glue

Connect Coalesce Catalog to metadata in the AWS Glue Data Catalog and, when you want richer analytics in Catalog, to optional Amazon Athena and AWS CloudTrail signals.

Before You Begin

You need the following:

  • Permission to create IAM policies and an IAM user in AWS, or another programmatic credential source your organization approves
  • The AWS account ID and Region where Glue and, if used, Athena are configured
  • Access to your Catalog instance to enter connection details

How Catalog Uses AWS

Catalog reads warehouse metadata through the Glue API so it can list databases, tables, and columns. That path is the foundation for documenting tables and columns in Catalog.

When you grant additional permissions, Catalog can also use Athena APIs that list and describe query executions and workgroups. That supports features such as query popularity, lineage, and other SQL-oriented signals that depend on Athena query metadata. If those permissions are missing, Catalog can still ingest Glue metadata, but query-related features may be reduced or absent.

When you allow CloudTrail LookupEvents, Catalog can enrich query activity with information about who ran queries. If CloudTrail lookup isn't allowed, metadata ingestion and many query-listing flows can still work, but user-centric enrichment may be limited or omitted.

For query observation, Catalog uses only read-oriented Athena APIs. Because it reads execution metadata rather than running queries, you don't need to grant athena:StartQueryExecution or athena:GetQueryResults.
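To make the read-only pattern concrete, here is a minimal Python sketch of listing and describing query executions without starting a query or fetching results. It is an illustration, not Catalog's actual implementation. The function name iter_query_executions is hypothetical, and the athena argument is assumed to be a boto3-style Athena client such as boto3.client("athena"), passed in by the caller.

```python
from typing import Any, Dict, Iterator


def iter_query_executions(athena: Any, workgroup: str) -> Iterator[Dict[str, Any]]:
    """Yield execution metadata for one workgroup using list/describe calls only.

    `athena` is assumed to expose list_query_executions and
    batch_get_query_execution, as a boto3 Athena client does.
    """
    token = None
    while True:
        # ListQueryExecutions returns at most 50 IDs per page.
        kwargs: Dict[str, Any] = {"WorkGroup": workgroup, "MaxResults": 50}
        if token:
            kwargs["NextToken"] = token
        page = athena.list_query_executions(**kwargs)

        ids = page.get("QueryExecutionIds", [])
        if ids:
            # BatchGetQueryExecution accepts up to 50 IDs, matching the page size.
            details = athena.batch_get_query_execution(QueryExecutionIds=ids)
            yield from details.get("QueryExecutions", [])

        token = page.get("NextToken")
        if not token:
            break
```

Each yielded item is the execution description returned by Athena, which includes fields such as QueryExecutionId, Query, and Status. Neither call runs SQL or reads result rows.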

Glue action coverage

Glue resource types such as partitions, views, and user-defined functions can require specific Glue read actions. The policies below keep the same Glue read actions that appeared in earlier Coalesce guides. If onboarding fails for uncommon objects, compare your resources against the AWS Glue IAM documentation and adjust the policy with your security team.

Create a Catalog User

Create an IAM identity for programmatic access before you attach the JSON policy you build in Choose an IAM Scope.

  1. Create an IAM user for Catalog using AWS IAM user creation instructions.
  2. After you build an IAM policy in Choose an IAM Scope, attach that policy to this user.
  3. Create an access key for programmatic use and store the secret securely.

If Needed, Allowlist Catalog IP

If your AWS environment restricts API access by source IP, allow connections from the fixed IP that matches your Catalog host:

Choose an IAM Scope

Use this section to decide how much access to grant:

Optional scopes change how deep analytics go. They don't necessarily change whether basic tables and columns appear when Glue read access succeeds.
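As a rough way to check which scope a drafted policy covers, the following Python sketch inspects the Allow statements of a policy document and reports which of the three scopes it appears to include. The function names and the representative actions chosen for each scope are assumptions made for illustration; the sketch does not expand wildcards or evaluate Deny statements, so treat it as a quick sanity check rather than an IAM evaluation.

```python
from typing import Any, Dict, Set


def allowed_actions(policy: Dict[str, Any]) -> Set[str]:
    """Collect every action named in an Allow statement (no wildcard expansion)."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]
    actions: Set[str] = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        listed = statement.get("Action", [])
        if isinstance(listed, str):
            listed = [listed]
        actions.update(listed)
    return actions


def describe_scope(policy: Dict[str, Any]) -> Dict[str, bool]:
    """Report which scopes a policy appears to grant.

    The actions checked here are representative picks, not a complete list.
    """
    actions = allowed_actions(policy)
    return {
        "glue_metadata": {"glue:GetDatabases", "glue:GetTables"} <= actions,
        "athena_query_activity": {
            "athena:ListQueryExecutions",
            "athena:GetQueryExecution",
        } <= actions,
        "cloudtrail_enrichment": "cloudtrail:LookupEvents" in actions,
    }
```

A policy that contains only the Glue statement would report glue_metadata as True and the other two scopes as False, which matches the case where tables and columns appear but query-related depth is absent.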

Glue Metadata Only

This JSON grants read access to Glue Data Catalog objects. Replace <region> and <account_id> with your values. Set "Version" to 2012-10-17, which is the standard value for IAM policy documents.

Replace placeholders

Replace every <region> and <account_id> placeholder before you save the policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition",
        "glue:SearchTables",
        "glue:GetTableVersions",
        "glue:GetTableVersion",
        "glue:GetUserDefinedFunctions",
        "glue:GetUserDefinedFunction"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:tableVersion/*/*/*",
        "arn:aws:glue:<region>:<account_id>:table/*/*",
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/*"
      ]
    }
  ]
}
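If you prefer to substitute the placeholders programmatically instead of by hand, a small Python sketch like the one below can do it and confirm that the result is still valid JSON. The function name render_policy is hypothetical. The 12-digit check reflects the standard AWS account ID format; the leftover-placeholder check is an extra safeguard added for this example.

```python
import json
import re
from typing import Any, Dict


def render_policy(template: str, region: str, account_id: str) -> Dict[str, Any]:
    """Fill <region> and <account_id> in a policy template and parse the result."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are 12 digits")
    if not region:
        raise ValueError("region must not be empty")

    rendered = template.replace("<region>", region).replace("<account_id>", account_id)

    # Catch any placeholder that was not substituted, such as a mistyped name.
    leftover = re.findall(r"<[a-z_]+>", rendered)
    if leftover:
        raise ValueError(f"unreplaced placeholders: {sorted(set(leftover))}")

    # json.loads raises ValueError if the template is not valid JSON.
    return json.loads(rendered)
```

Calling render_policy(template_text, "us-east-1", "123456789012") with the policy text above would return a dictionary whose resource ARNs contain the real Region and account ID, ready to serialize with json.dumps before saving.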

Add Athena Query Activity

Append these statements to the "Statement" array of the policy you use for Glue metadata, or merge them into a single document with your Glue statement. They don't start queries or fetch result rows from S3; they only allow listing and describing executions and workgroups.

{
  "Effect": "Allow",
  "Action": [
    "athena:ListWorkGroups"
  ],
  "Resource": "*"
},
{
  "Effect": "Allow",
  "Action": [
    "athena:GetWorkGroup",
    "athena:ListQueryExecutions",
    "athena:GetQueryExecution",
    "athena:BatchGetQueryExecution",
    "athena:ListDatabases",
    "athena:GetDatabase",
    "athena:GetDataCatalog",
    "athena:ListTableMetadata",
    "athena:GetTableMetadata"
  ],
  "Resource": [
    "arn:aws:athena:<region>:<account_id>:datacatalog/*",
    "arn:aws:athena:<region>:<account_id>:workgroup/*"
  ]
}

Remember to substitute <region> and <account_id> wherever those placeholders appear in the policy.
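The Athena statements above are a fragment: two statement objects that belong inside the "Statement" array of an existing policy, not a complete policy on their own. The following Python sketch shows one way to append such statements to a base policy without modifying the original. The function name append_statements is hypothetical.

```python
import copy
from typing import Any, Dict, List


def append_statements(policy: Dict[str, Any], extra: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of `policy` with `extra` statements appended to "Statement"."""
    merged = copy.deepcopy(policy)
    statements = merged.get("Statement", [])
    if isinstance(statements, dict):  # normalize a single statement object to a list
        statements = [statements]
    statements.extend(copy.deepcopy(extra))
    merged["Statement"] = statements
    return merged
```

For example, passing the parsed Glue policy and a list holding the two Athena statements would produce a single document with three statements, equivalent to the combined policy shown under Full Policy Examples.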

Add CloudTrail Lookup

Append this statement to allow Catalog to call LookupEvents for principal enrichment:

{
  "Effect": "Allow",
  "Action": [
    "cloudtrail:LookupEvents"
  ],
  "Resource": [
    "*"
  ]
}
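To illustrate what principal enrichment through LookupEvents could look like, here is a hedged Python sketch that maps Athena query execution IDs to the user names CloudTrail recorded for them. This is not Catalog's implementation. The function name map_queries_to_users is hypothetical, the cloudtrail argument is assumed to be a boto3-style CloudTrail client passed in by the caller, and the sketch assumes that StartQueryExecution events carry the execution ID under responseElements.queryExecutionId in the event JSON.

```python
import json
from typing import Any, Dict


def map_queries_to_users(cloudtrail: Any) -> Dict[str, str]:
    """Map Athena query execution IDs to the Username CloudTrail recorded.

    `cloudtrail` is assumed to expose lookup_events as a boto3 client does.
    """
    users: Dict[str, str] = {}
    token = None
    while True:
        # LookupEvents accepts a single lookup attribute per request.
        kwargs: Dict[str, Any] = {
            "LookupAttributes": [
                {"AttributeKey": "EventSource", "AttributeValue": "athena.amazonaws.com"}
            ]
        }
        if token:
            kwargs["NextToken"] = token
        page = cloudtrail.lookup_events(**kwargs)

        for event in page.get("Events", []):
            if event.get("EventName") != "StartQueryExecution":
                continue
            # CloudTrailEvent is a JSON string holding the full event record.
            detail = json.loads(event.get("CloudTrailEvent") or "{}")
            # Assumption: the execution ID sits under responseElements.
            query_id = (detail.get("responseElements") or {}).get("queryExecutionId")
            if query_id:
                users[query_id] = event.get("Username", "")

        token = page.get("NextToken")
        if not token:
            break
    return users
```

The resulting dictionary can be joined with Athena execution metadata by query execution ID. Without cloudtrail:LookupEvents, this lookup fails and the user information is simply unavailable.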

Full Policy Examples

The following examples combine the pieces above so you can copy a single policy when you know your target scope.

Glue With Athena Query Activity

This example keeps Glue catalog reads and Athena query observation in one policy you can paste as a single document.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition",
        "glue:SearchTables",
        "glue:GetTableVersions",
        "glue:GetTableVersion",
        "glue:GetUserDefinedFunctions",
        "glue:GetUserDefinedFunction"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:tableVersion/*/*/*",
        "arn:aws:glue:<region>:<account_id>:table/*/*",
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:ListWorkGroups"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:GetWorkGroup",
        "athena:ListQueryExecutions",
        "athena:GetQueryExecution",
        "athena:BatchGetQueryExecution",
        "athena:ListDatabases",
        "athena:GetDatabase",
        "athena:GetDataCatalog",
        "athena:ListTableMetadata",
        "athena:GetTableMetadata"
      ],
      "Resource": [
        "arn:aws:athena:<region>:<account_id>:datacatalog/*",
        "arn:aws:athena:<region>:<account_id>:workgroup/*"
      ]
    }
  ]
}

Glue With Athena Query Activity and CloudTrail

This example adds CloudTrail lookup for principal enrichment on top of Glue metadata and Athena query observation.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition",
        "glue:SearchTables",
        "glue:GetTableVersions",
        "glue:GetTableVersion",
        "glue:GetUserDefinedFunctions",
        "glue:GetUserDefinedFunction"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:tableVersion/*/*/*",
        "arn:aws:glue:<region>:<account_id>:table/*/*",
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:ListWorkGroups"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:GetWorkGroup",
        "athena:ListQueryExecutions",
        "athena:GetQueryExecution",
        "athena:BatchGetQueryExecution",
        "athena:ListDatabases",
        "athena:GetDatabase",
        "athena:GetDataCatalog",
        "athena:ListTableMetadata",
        "athena:GetTableMetadata"
      ],
      "Resource": [
        "arn:aws:athena:<region>:<account_id>:datacatalog/*",
        "arn:aws:athena:<region>:<account_id>:workgroup/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "cloudtrail:LookupEvents"
      ],
      "Resource": "*"
    }
  ]
}

Add Your Connection Details on Catalog

On the Catalog integration screen for Amazon Athena and AWS Glue, enter credentials in this format:

{
  "aws_region": "<your_region>",
  "aws_account_id": "<your_account>",
  "access_key_id": "<your_key_id>",
  "access_key_secret": "<your_secret>"
}

Replace each placeholder with your real Region, account ID, and key material.
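If you generate the connection details from a script, a sketch like the following can assemble the JSON and catch common mistakes before you paste it into Catalog. The function name build_connection_details and its validation rules are assumptions for illustration; only the four key names come from the format above. Avoid printing or logging the output, since it contains the secret key.

```python
import json
import re


def build_connection_details(region: str, account_id: str, key_id: str, secret: str) -> str:
    """Return the connection JSON, rejecting empty or placeholder-like values."""
    values = {
        "aws_region": region,
        "aws_account_id": account_id,
        "access_key_id": key_id,
        "access_key_secret": secret,
    }
    for name, value in values.items():
        if not value or value.startswith("<"):
            raise ValueError(f"{name} is empty or still a placeholder")
    # AWS account IDs are 12-digit numbers; keep them as strings to preserve leading zeros.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("aws_account_id must be 12 digits")
    return json.dumps(values, indent=2)
```

Keeping the account ID as a string matters because an ID with a leading zero would lose that digit if it were treated as a number.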

What's Next?