Available in versions: Dev (3.19) | Latest (3.18) | 3.17 | 3.16 | 3.15 | 3.14 | 3.13 | 3.12 | 3.11 | 3.10 | 3.9
Grouping
Applies to ✅ Open Source Edition ✅ Express Edition ✅ Professional Edition ✅ Enterprise Edition
Aggregate functions aggregate data from groups of data into individual values. There are three main ways of forming such groups:
- A GROUP BY clause is used to define the groups for which data is aggregated
- No GROUP BY clause is defined, which means that all data from a SELECT statement (or subquery) is aggregated into a single row
- All aggregate functions can be used as window functions, in case of which they will aggregate the data of the specified window
Aggregation with GROUP BY
In the presence of GROUP BY, a SELECT statement transforms the output of its FROM clause into a new "virtual" set of tuples containing:
- The column expressions of the
GROUP BY
clause. In the overall data set, the values of these column expressions is unique. - A set of data corresponding to each row produced by the
GROUP BY
clause. This data set can be aggregated per group using aggregate functions.
Using GROUP BY
means that a new set of rules need to be observed in the rest of the query:
- Clauses that logically precede GROUP BY are not affected. These include, for example, FROM and WHERE
- All other clauses (e.g. HAVING, WINDOW, SELECT, or ORDER BY) may now only reference expressions built from the expressions in the
GROUP BY
clause, or aggregations on any other expression
An example:
SELECT AUTHOR_ID, count(*) FROM BOOK GROUP BY AUTHOR_ID;
create.select(BOOK.AUTHOR_ID, count()) .from(BOOK) .groupBy(BOOK.AUTHOR_ID).fetch();
Producing:
+-----------+-------+ | AUTHOR_ID | count | +-----------+-------+ | 1 | 2 | | 2 | 2 | +-----------+-------+
Per the rules imposed by GROUP BY
, it would not be possible, for example, to project the BOOK.TITLE
column, because it is not defined per author. An author has written many books, so we don't know what a BOOK.TITLE
is supposed to mean. Only an aggregation, such as LISTAGG or ARRAY_AGG can reference BOOK.TITLE
as an argument.
Aggregation without GROUP BY
In the absence of GROUP BY, a SELECT statement that contains at least one aggregate function in any of its clauses (e.g. HAVING, WINDOW, SELECT, or ORDER BY) will proceed to aggregating the entire data into a single row. There is an implied "empty grouping", i.e. a grouping that has no GROUP BY
columns. These two are the same things:
SELECT count(*) FROM BOOK; SELECT count(*) FROM BOOK GROUP BY ();
See also GROUPING SETS for more details about this empty GROUP BY
syntax.
For example, using our sample database, which has 4 books with IDs 1-4, we can write:
SELECT count(*), sum(ID) FROM BOOK
create.select(count(), sum(BOOK.ID)) .from(BOOK).fetch();
Producing:
+----------+---------+ | count(*) | sum(ID) | +----------+---------+ | 4 | 10 | +----------+---------+
No other columns from the tables in the FROM clause may be projected by the SELECT clause, because they would not be defined for this single group. For example, no specific BOOK.TITLE
is defined for the aggregated value of all books. Only an aggregation, such as LISTAGG or ARRAY_AGG can reference BOOK.TITLE
as an argument.
However, any expression whose components do not depend on content of the group is allowed. For example, it is possible to combine aggregate functions and constant expressions like this:
SELECT count(*) + sum(ID) + 1 FROM BOOK
create.select(count().plus(sum(BOOK.ID)).plus(1)) .from(BOOK).fetch();
Producing:
+------+ | plus | +------+ | 15 | +------+
Feedback
Do you have any feedback about this page? We'd love to hear it!