All versions: 3.11 | 3.10 | 3.9 | 3.8 | 3.7 | 3.6 | 3.5 | 3.4 | 3.3 | Development versions: 3.12 | Unsupported versions: 3.2 | 3.1 | 3.0 | 2.6

ROLLUP() explained in SQL

The SQL standard defines special functions that can be used in the GROUP BY clause: the grouping functions. These functions can be used to generate several groupings in a single clause. This can best be explained in SQL. Let's take ROLLUP() for instance:

-- ROLLUP() with one argument
SELECT AUTHOR_ID, COUNT(*)
FROM BOOK
GROUP BY ROLLUP(AUTHOR_ID)


-- ROLLUP() with two arguments
SELECT AUTHOR_ID, PUBLISHED_IN, COUNT(*)
FROM BOOK
GROUP BY ROLLUP(AUTHOR_ID, PUBLISHED_IN)







-- The same query using UNION ALL:
  SELECT AUTHOR_ID, COUNT(*) FROM BOOK GROUP BY (AUTHOR_ID)
UNION ALL
  SELECT NULL, COUNT(*) FROM BOOK GROUP BY ()
ORDER BY 1 NULLS LAST

-- The same query using UNION ALL:
  SELECT AUTHOR_ID, PUBLISHED_IN, COUNT(*)
  FROM BOOK GROUP BY (AUTHOR_ID, PUBLISHED_IN)
UNION ALL
  SELECT AUTHOR_ID, NULL, COUNT(*)
  FROM BOOK GROUP BY (AUTHOR_ID)
UNION ALL
  SELECT NULL, NULL, COUNT(*)
  FROM BOOK GROUP BY ()
ORDER BY 1 NULLS LAST, 2 NULLS LAST

In English, the ROLLUP() grouping function provides N+1 groupings, when N is the number of arguments to the ROLLUP() function. Each grouping has an additional group field from the ROLLUP() argument field list. The results of the second query might look something like this:

+-----------+--------------+----------+
| AUTHOR_ID | PUBLISHED_IN | COUNT(*) |
+-----------+--------------+----------+
|         1 |         1945 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         1 |         1948 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         1 |         NULL |        2 | <- GROUP BY (AUTHOR_ID)
|         2 |         1988 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         2 |         1990 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         2 |         NULL |        2 | <- GROUP BY (AUTHOR_ID)
|      NULL |         NULL |        4 | <- GROUP BY ()
+-----------+--------------+----------+

CUBE() explained in SQL

CUBE() is different from ROLLUP() in the way that it doesn't just create N+1 groupings, it creates all 2^N possible combinations between all group fields in the CUBE() function argument list. Let's re-consider our second query from before:

-- CUBE() with two arguments
SELECT AUTHOR_ID, PUBLISHED_IN, COUNT(*)
FROM BOOK
GROUP BY CUBE(AUTHOR_ID, PUBLISHED_IN)










-- The same query using UNION ALL:
  SELECT AUTHOR_ID, PUBLISHED_IN, COUNT(*)
  FROM BOOK GROUP BY (AUTHOR_ID, PUBLISHED_IN)
UNION ALL
  SELECT AUTHOR_ID, NULL, COUNT(*)
  FROM BOOK GROUP BY (AUTHOR_ID)
UNION ALL
  SELECT NULL, PUBLISHED_IN, COUNT(*)
  FROM BOOK GROUP BY (PUBLISHED_IN)
UNION ALL
  SELECT NULL, NULL, COUNT(*)
  FROM BOOK GROUP BY ()
ORDER BY 1 NULLS FIRST, 2 NULLS FIRST

The results would then hold:

+-----------+--------------+----------+
| AUTHOR_ID | PUBLISHED_IN | COUNT(*) |
+-----------+--------------+----------+
|      NULL |         NULL |        2 | <- GROUP BY ()
|      NULL |         1945 |        1 | <- GROUP BY (PUBLISHED_IN)
|      NULL |         1948 |        1 | <- GROUP BY (PUBLISHED_IN)
|      NULL |         1988 |        1 | <- GROUP BY (PUBLISHED_IN)
|      NULL |         1990 |        1 | <- GROUP BY (PUBLISHED_IN)
|         1 |         NULL |        2 | <- GROUP BY (AUTHOR_ID)
|         1 |         1945 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         1 |         1948 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         2 |         NULL |        2 | <- GROUP BY (AUTHOR_ID)
|         2 |         1988 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
|         2 |         1990 |        1 | <- GROUP BY (AUTHOR_ID, PUBLISHED_IN)
+-----------+--------------+----------+

GROUPING SETS()

GROUPING SETS() are the generalised way to create multiple groupings. From our previous examples

  • ROLLUP(AUTHOR_ID, PUBLISHED_IN) corresponds to GROUPING SETS((AUTHOR_ID, PUBLISHED_IN), (AUTHOR_ID), ())
  • CUBE(AUTHOR_ID, PUBLISHED_IN) corresponds to GROUPING SETS((AUTHOR_ID, PUBLISHED_IN), (AUTHOR_ID), (PUBLISHED_IN), ())

This is nicely explained in the SQL Server manual pages about GROUPING SETS() and other grouping functions:
http://msdn.microsoft.com/en-us/library/bb510427(v=sql.105)

jOOQ's support for ROLLUP(), CUBE(), GROUPING SETS()

jOOQ fully supports all of these functions, as well as the utility functions GROUPING() and GROUPING_ID(), used for identifying the grouping set ID of a record. The DSL API thus includes:

// The various grouping function constructors
GroupField rollup(Field<?>... fields);
GroupField cube(Field<?>... fields);
GroupField groupingSets(Field<?>... fields);
GroupField groupingSets(Field<?>[]... fields);
GroupField groupingSets(Collection<? extends Field<?>>... fields);

// The utility functions generating IDs per GROUPING SET
Field<Integer> grouping(Field<?>);
Field<Integer> groupingId(Field<?>...);

MySQL's and CUBRID's WITH ROLLUP syntax

MySQL and CUBRID don't know any grouping functions, but they support a WITH ROLLUP clause, that is equivalent to simple ROLLUP() grouping functions. jOOQ emulates ROLLUP() in MySQL and CUBRID, by rendering this WITH ROLLUP clause. The following two statements mean the same:

-- Statement 1: SQL standard
GROUP BY ROLLUP(A, B, C)

-- Statement 2: SQL standard
GROUP BY A, ROLLUP(B, C)
-- Statement 1: MySQL
GROUP BY A, B, C WITH ROLLUP

-- Statement 2: MySQL
-- This is not supported in MySQL
The jOOQ Logo