Oracle’s way to get multiple values in a top 1 per group query
I’ve previously blogged about generic ways of writing top 1 or top n per category queries.
An Oracle-specific version in that post used the arcane KEEP syntax:
SELECT
  max(actor_id)   KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id),
  max(first_name) KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id),
  max(last_name)  KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id),
  max(c)          KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (
  SELECT actor_id, first_name, last_name, count(film_id) c
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  GROUP BY actor_id, first_name, last_name
) t;
This is a bit difficult to read when you see it for the first time. Think of it as a complicated way to say you want to get the first value per group. This hypothetical syntax would be much nicer:
SELECT
  FIRST(actor_id   ORDER BY c DESC, actor_id),
  FIRST(first_name ORDER BY c DESC, actor_id),
  FIRST(last_name  ORDER BY c DESC, actor_id),
  FIRST(c          ORDER BY c DESC, actor_id)
FROM (...) t;
So, we’re getting the FIRST value of an expression per group when we order the group contents by the ORDER BY clause.
Oracle’s syntax takes into account that ordering may be non-deterministic, leading to ties if you don’t include a unique value in the ORDER BY clause. In that case, you can aggregate all the ties, e.g. to get an AVG() if that makes sense in your business case. If you don’t care about ties, or ensure there are no ties, MAX() is an OK workaround, or since 21c, ANY_VALUE().
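To make the tie behaviour more tangible, here is a minimal sketch. The inline table t and its rows are made up for illustration and are not part of the Sakila schema used above. Because the ORDER BY clause contains only c, rows A and B tie for first place, and the aggregate function decides what to do with them:

```sql
-- Hypothetical sample data: A and B tie on c = 3
WITH t (name, c) AS (
  SELECT 'A', 3 FROM dual UNION ALL
  SELECT 'B', 3 FROM dual UNION ALL
  SELECT 'C', 1 FROM dual
)
SELECT
  -- How many rows share the first rank
  count(*)  KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS tied_rows,
  -- The aggregate picks among the tied rows only
  min(name) KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS min_tied_name,
  max(name) KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS max_tied_name
FROM t;

-- TIED_ROWS  MIN_TIED_NAME  MAX_TIED_NAME
-- 2          A              B
```

Adding a unique column such as name to the ORDER BY clause removes the tie, at which point every aggregate function sees exactly one row.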
Now, there’s quite a bit of repetition when you’re projecting multiple columns per group like that. Window functions have a WINDOW clause, where common window specifications can be named for repeated use. But GROUP BY doesn’t have such a feature, probably because only a few cases arise where this would be useful.
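For comparison, this is roughly what naming a window specification looks like. It is only a sketch: the inline rows are invented sample data, and it assumes an Oracle version that supports the WINDOW clause (21c, as far as I know). Note that window functions don’t collapse the group, so the top values are repeated on every row:

```sql
-- Hypothetical sample data standing in for the aggregated actor counts
WITH t (actor_id, first_name, c) AS (
  SELECT 1, 'PENELOPE', 19 FROM dual UNION ALL
  SELECT 2, 'NICK',     25 FROM dual UNION ALL
  SELECT 3, 'ED',       22 FROM dual
)
SELECT
  actor_id,
  first_name,
  c,
  -- The ordering is written once and referenced by name
  first_value(actor_id)   OVER w AS top_actor_id,
  first_value(first_name) OVER w AS top_first_name,
  first_value(c)          OVER w AS top_c
FROM t
WINDOW w AS (ORDER BY c DESC, actor_id);
```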
But luckily, Oracle has:
- OBJECT types, which are just nominally typed row value expressions
- ANY_VALUE, an aggregate function that generates any value per group, which has been added in Oracle 21c
With these two utilities, we can do this:
CREATE TYPE o AS OBJECT (
  actor_id   NUMBER(18),
  first_name VARCHAR2(50),
  last_name  VARCHAR2(50),
  c          NUMBER(18)
);
And now:
SELECT
  ANY_VALUE(o(actor_id, first_name, last_name, c))
    KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (...) t;
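If you need the individual columns back, the object can be unnested again in an outer query. The following is a sketch under a few assumptions: the type o from above already exists, the inline rows are invented sample data, and the aliases u and v are arbitrary names chosen here. Oracle requires the table alias when dereferencing object attributes, hence u.v.actor_id rather than v.actor_id:

```sql
-- Hypothetical sample data in place of the elided derived table
WITH t (actor_id, first_name, last_name, c) AS (
  SELECT 1, 'PENELOPE', 'GUINESS',  19 FROM dual UNION ALL
  SELECT 2, 'NICK',     'WAHLBERG', 25 FROM dual
)
SELECT
  -- Attribute access must be qualified with the table alias u
  u.v.actor_id   AS actor_id,
  u.v.first_name AS first_name,
  u.v.last_name  AS last_name,
  u.v.c          AS c
FROM (
  SELECT
    ANY_VALUE(o(actor_id, first_name, last_name, c))
      KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id) AS v
  FROM t
) u;
```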
Note, it would be possible to use MAX() in older Oracle versions, if you work around this error message as well:
ORA-22950: cannot order objects without MAP or ORDER method
This is just a workaround, of course. It’s tedious to manage named OBJECT types like that for every case of aggregation. If you don’t need the type safety, you can also just use JSON instead:
SELECT
  ANY_VALUE(JSON_OBJECT(actor_id, first_name, last_name, c))
    KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (...) t;
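Getting the values back out of the JSON document could look something like this. Again, this is a sketch rather than the definitive way to do it: the inline rows are invented sample data, the aliases u and j are arbitrary, and I’m spelling out the JSON keys explicitly (instead of using the shorthand above) so that the path expressions don’t depend on how Oracle derives key names from column names:

```sql
-- Hypothetical sample data in place of the elided derived table
WITH t (actor_id, first_name, last_name, c) AS (
  SELECT 1, 'PENELOPE', 'GUINESS',  19 FROM dual UNION ALL
  SELECT 2, 'NICK',     'WAHLBERG', 25 FROM dual
)
SELECT
  -- RETURNING NUMBER converts the JSON number back to a SQL number
  JSON_VALUE(u.j, '$.actor_id' RETURNING NUMBER) AS actor_id,
  JSON_VALUE(u.j, '$.first_name')                AS first_name,
  JSON_VALUE(u.j, '$.last_name')                 AS last_name,
  JSON_VALUE(u.j, '$.c' RETURNING NUMBER)        AS c
FROM (
  SELECT
    ANY_VALUE(JSON_OBJECT(
      'actor_id'   VALUE actor_id,
      'first_name' VALUE first_name,
      'last_name'  VALUE last_name,
      'c'          VALUE c
    )) KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id) AS j
  FROM t
) u;
```

The price for not having to create a type is that the column types are no longer checked by the compiler, so each JSON_VALUE call has to restate the type it expects.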