Skip to main content

How to use GROUP BY

GROUP BY is completed in Coalesce by adding your GROUP BY clause after the FROM statement, as well as after any other precursor statements like JOINs and/or a WHERE clause, in the Join tab of your Node Editor of the Node where you will complete the GROUP BY.

This image presents the join configuration for the same table, FCT_TOTAL_SALES_BY_STORE. The SQL snippet shows a query that selects data from the MONTHLY_SALES table, with a commented-out join to another table DIM_STORE. The query groups the results by the STORE column and filters out rows where the STORE value is null.

You will have to ensure your columns are configured correctly in your Mapping tab in terms of aggregation Transforms on column you wish to aggregate and Sources for columns you wish to GROUP BY in order for the GROUP BY to execute successfully.

This image displays a mapping configuration for a table named FCT_TOTAL_SALES_BY_STORE. It shows columns like STORE, AMOUNT, SYSTEM_CREATE_DATE, and SYSTEM_UPDATE_DATE with their respective transformations and data types. The STORE and AMOUNT columns are sourced from the MONTHLY_SALES table, while the date columns use current timestamps.

GROUP BY Detailed Example

To provide a more detailed example, let's say you have a raw table of monthly sales amounts by store and month that you'd like to transform and load into a Fact Table of sales amounts by store in total. But rather than loading the actual store value into your Fact Table, you want to pull in the corresponding dimension key for each store from your Store Dimension, so you also need to join to your Store Dimension while completing this transform and load.

This image demonstrates the aggregation process of sales data by store and month. On the left, there is a table listing STORE, AMOUNT, and MONTH, showing individual sales records. Below this, another table shows the DIM_STORE_KEY and STORE mapping. On the right, the result of the aggregation is displayed in a table with DIM_STORE_KEY and the total AMOUNT, showing summed amounts for each store key.
  1. Add a Fact Node from your main source (MONTHLY_SALES).

    The image shows the FCT_MONTHLY_SALES table mapping in a data integration or ETL tool. The table consists of columns: STORE, AMOUNT, MONTH, SYSTEM_CREATE_DATE, and SYSTEM_UPDATE_DATE, with respective data types and transformation rules. The data source is indicated as MONTHLY_SALES, and all columns are nullable.
  2. Remove the columns you don't want to include in your new Fact (STORE in its raw form and MONTH) and configure your aggregations on your columns.

    The image shows the FCT_MONTHLY_SALES table mapping with three columns: AMOUNT, SYSTEM_CREATE_DATE, and SYSTEM_UPDATE_DATE. The AMOUNT column is transformed using the SUM function, while both date columns are cast to TIMESTAMP. The data source is indicated as MONTHLY_SALES for the AMOUNT column, and all columns are nullable.
  3. Map in the key from your Store Dimension (DIM_STORE).

    The image shows the FCT_MONTHLY_SALES table mapping with four columns: DIM_STORE_KEY, AMOUNT, SYSTEM_CREATE_DATE, and SYSTEM_UPDATE_DATE. The DIM_STORE_KEY column is sourced from DIM_STORE, the AMOUNT column is transformed using the SUM function, and both date columns are cast to TIMESTAMP. All columns are nullable, and the data sources for AMOUNT and DIM_STORE_KEY are MONTHLY_SALES and DIM_STORE, respectively.
  4. Configure your join between your main source (MONTHLY_SALES) and your Store Dimension (DIM_STORE).

    The image shows the join condition for the FCT_MONTHLY_SALES table in a data integration or ETL tool. The join is performed between the MONTHLY_SALES source table and the DIM_STORE target table. The join condition matches the STORE column from the MONTHLY_SALES table with the STORE column from the DIM_STORE table.
  5. Add a GROUP BY clause after your join on your Store Dimension key (DIM_STORE.DIM_STORE_KEY).

    The image shows the SQL query for the FCT_MONTHLY_SALES table join and group by operations in a data integration or ETL tool. The query joins the MONTHLY_SALES source table with the DIM_STORE target table on the STORE column. It groups the results by the DIM_STORE_KEY column from the DIM_STORE table.
  6. Within the Node Editor:

    1. Validate Select.
    2. Validate Create.
    3. Create.
    4. Validate Run.
    5. Fetch data, and view grouped results.
The image displays the results of a query on the FCT_MONTHLY_SALES table, showing grouped data. The table includes columns DIM_STORE_KEY, AMOUNT, SYSTEM_CREATE_DATE, and SYSTEM_UPDATE_DATE. The results show two records with DIM_STORE_KEY values of 1 and 2, corresponding amounts of 47400 and 147500, and identical system create and update dates.