How to implement FILTER semantics with Oracle JSON aggregate functions
A cool standard SQL:2003 feature is the aggregate FILTER
clause, which is supported natively by at least these RDBMS:
- ClickHouse
- CockroachDB
- DuckDB
- Firebird
- H2
- HSQLDB
- PostgreSQL
- SQLite
- Trino
- YugabyteDB
The following aggregate function computes the number of rows per group which satifsy the FILTER
clause:
SELECT
COUNT(*) FILTER (WHERE BOOK.TITLE LIKE 'A%'),
COUNT(*) FILTER (WHERE BOOK.TITLE LIKE 'B%'),
...
FROM BOOK
This is useful for pivot style queries, where multiple aggregate values are computed in one go. For most basic types of aggregate function, it can be emulated simply by using CASE
expressions, because standard aggregate functions ignore NULL
values when aggregating. The following is equivalent to the above, in all RDBMS:
SELECT
COUNT(CASE WHEN BOOK.TITLE LIKE 'A%' THEN 1 END),
COUNT(CASE WHEN BOOK.TITLE LIKE 'B%' THEN 1 END),
...
FROM BOOK
What if we’re aggregating JSON?
Things are a bit different when aggregating JSON. Look at the following example, where we don’t want to count the books, but list them in a JSON array, or object:
SELECT
JSON_ARRAYAGG(BOOK.TITLE)
FILTER (WHERE BOOK.LANGUAGE_ID = 1),
JSON_OBJECTAGG('id-' || BOOK.ID, BOOK.TITLE)
FILTER (WHERE BOOK.LANGUAGE_ID = 2),
...
FROM BOOK
Things are different with these collection aggregate functions, because NULL
values are actually interesting there, so we want to list them in the resulting JSON document. Assuming there are books with a NULL
title, we might get:
|JSON_ARRAYAGG |JSON_OBJECTAGG |
|-----------------------------|------------------------------------|
|["1984", "Animal Farm", null]| "id-4" : "Brida", "id-17" : null |
This makes emulating the FILTER
clause (e.g. on Oracle) much harder, because we cannot just use ABSENT ON NULL
like this:
SELECT
JSON_ARRAYAGG(
CASE WHEN T_BOOK.LANGUAGE_ID = 1 THEN T_BOOK.TITLE END
ABSENT ON NULL
),
JSON_OBJECTAGG(
'id-' || T_BOOK.ID,
CASE WHEN T_BOOK.LANGUAGE_ID = 2 THEN T_BOOK.TITLE END
ABSENT ON NULL
)
FROM T_BOOK;
Because now, the legitimate null
titled books are missing and we’re getting this instead:
|JSON_ARRAYAGG |JSON_OBJECTAGG |
|----------------------|----------------|
|["1984","Animal Farm"]|"id-4":"Brida"|
We cannot use NULL ON NULL
either, because that would just turn the FILTER
semantics into a mapping semantics, and produce too many values:
|JSON_ARRAYAGG |JSON_OBJECTAGG |
|-------------------------------------|-----------------------------------------------------------------|
|["1984","Animal Farm",null,null,null]|"id-1":null,"id-4":"Brida","id-3":null,"id-2":null,"id-17":null|
E.g. while id-3
and id-2
values are NULL
because the FILTER
emulating CASE
expression maps them to NULL
, the id-17
value really has a NULL
title.
Workaround: Wrap data in an array
As a workaround, we can:
- Wrap legitimate data into an array
- Apply
ABSENT ON NULL
to remove rows due to theFILTER
emulation - Unwrap data again from the array
For the unwrapping, we’re going to be using JSON_TRANSFORM
:
SELECT
JSON_TRANSFORM(
JSON_ARRAYAGG(
CASE
WHEN T_BOOK.LANGUAGE_ID = 1
-- Wrap legitimate data into an array, including nulls
THEN JSON_ARRAY(T_BOOK.TITLE NULL ON NULL)
END
-- Remove NULLs due to FILTER emulation
ABSENT ON NULL
),
-- Unwrap data gain from the array
NESTED PATH '$[*]' (REPLACE '@' = PATH '@[0]')
),
JSON_TRANSFORM(
JSON_OBJECTAGG(
'id-' || T_BOOK.ID,
CASE
WHEN T_BOOK.LANGUAGE_ID = 2
-- Wrap legitimate data into an array, including nulls
THEN JSON_ARRAY(T_BOOK.TITLE NULL ON NULL)
END
-- Remove NULLs due to FILTER emulation
ABSENT ON NULL
),
-- Unwrap data gain from the array
NESTED PATH '$.*' (REPLACE '@' = PATH '@[0]')
)
FROM T_BOOK;
jOOQ support
jOOQ 3.20 will implement the above emulations for:
This way, you can continue to transparently use FILTER
on any aggregate function, also in Oracle.