JAVA

LATERAL is Your Friend to Create Local Column Variables in SQL – Java, SQL and jOOQ.

The standard SQL WITH clause has been tremendously helpful in structuring SQL queries. Instead of nesting everything in unreadable derived tables like this:

SELECT actor_id, name, COUNT(*)
FROM (
  SELECT actor_id, first_name || ' ' || last_name AS name
  FROM actor
) AS a
JOIN film_actor AS fa USING (actor_id)
GROUP BY actor_id, name
ORDER BY COUNT(*) DESC
LIMIT 5

People have started moving the logic up front, just like in any other programming language, where we declare things first, lexically, then use them:

WITH a AS (
  SELECT actor_id, first_name || ' ' || last_name AS name
  FROM actor
)
SELECT actor_id, name, COUNT(*)
FROM a
JOIN film_actor AS fa USING (actor_id)
GROUP BY actor_id, name
ORDER BY COUNT(*) DESC
LIMIT 5;

Both queries will produce the 5 actors with the most films from the Sakila database:

|actor_id|name          |count|
|--------|--------------|-----|
|107     |GINA DEGENERES|42   |
|102     |WALTER TORN   |41   |
|198     |MARY KEITEL   |40   |
|181     |MATTHEW CARREY|39   |
|23      |SANDRA KILMER |37   |

But something about this is off. We have to create a new table a, and use that table instead of the original actor table. It feels like we’re using the wrong abstraction here to get the result of simply aliasing a column expression.

The examples are still very simple, but we’ve all seen the 500 line SQL monsters where derived tables cascade to several levels deep.

LATERAL to the rescue

The SQL:1999 standard specifies the <lateral derived table>, which is SQL’s way of allowing for a derived table (a subquery in the FROM clause) to access all the lexically preceding objects in the FROM clause. It’s a bit weird in terms of syntax, I personally think that Microsoft SQL Server has a much nicer solution for this concept via APPLY. Oracle supports both syntaxes (standard and T-SQL’s). Db2, Firebird, MySQL, PostgreSQL only have LATERAL.

The main use-case of using LATERAL is to do fancy queries like top-n-per-category queries. But using LATERAL, we can now also move the column alias right where it belongs, conceptually, next to (“laterally”) the JOIN tables in the FROM clause. There are mainly two ways to do this:

As a table list element

Table lists have gone a bit out of fashion except among hard-core Oracle oldschool fans who like to continue using (+) for outer joins, partying like it’s 1989. But with LATERAL, perhaps you might appreciate table lists again? Check this out:

SELECT actor_id, name, COUNT(*)
FROM 
  actor JOIN film_actor AS fa USING (actor_id),
  LATERAL (SELECT first_name || ' ' || last_name AS name) AS t
GROUP BY actor_id, name
ORDER BY COUNT(*) DESC
LIMIT 5;

Isn’t that wonderful? The last element in the FROM clause is a set of local variables derived from the previous FROM clause elements. We can do this cascadingly, too!

SELECT actor_id, name, name_length, COUNT(*)
FROM 
  actor JOIN film_actor AS fa USING (actor_id),
  LATERAL (SELECT first_name || ' ' || last_name AS name) AS t1,
  LATERAL (SELECT length(name) AS name_length) AS t2
GROUP BY actor_id, name, name_length
ORDER BY COUNT(*) DESC
LIMIT 5;

This produces:

|actor_id|name          |name_length|count|
|--------|--------------|-----------|-----|
|107     |GINA DEGENERES|14         |42   |
|102     |WALTER TORN   |11         |41   |
|198     |MARY KEITEL   |11         |40   |
|181     |MATTHEW CARREY|14         |39   |
|23      |SANDRA KILMER |13         |37   |

A join tree element

If you’re not about to resurrect the old table list syntax for this use-case, you can always just CROSS JOIN any <lateral derived table> right where it belongs. For example:

SELECT actor_id, name, COUNT(*)
FROM actor
CROSS JOIN LATERAL (SELECT first_name || ' ' || last_name AS name) AS t
JOIN film_actor AS fa USING (actor_id)
GROUP BY actor_id, name
ORDER BY COUNT(*) DESC
LIMIT 5;

Again, if you require several cascading steps of local variables, just CROSS JOIN more such <lateral derived table> objects:

SELECT actor_id, name, name_length, COUNT(*)
FROM actor
CROSS JOIN LATERAL (SELECT first_name || ' ' || last_name AS name) AS t1
CROSS JOIN LATERAL (SELECT length(name) AS name_length) AS t2
JOIN film_actor AS fa USING (actor_id)
GROUP BY actor_id, name, name_length
ORDER BY COUNT(*) DESC
LIMIT 5;

The tradeoff is simple:

  • WITH helps declare everything up front, before using things. But like ordinary derived tables, they require you to think more about how to nest things.
  • LATERAL helps declare variables right next to the original tables that contain the variable contents, without having to derive those original tables. The rest of the query can still work with the unmodified, underived original table, which makes refactoring things and reasoning about things much easier, at least in my opinion.

I cannot stress enough how useful this is, as you can see in the example. Because the FROM clause is the first clause of your query, in the logical order of operations. That means, anything you declare in FROM can be used everywhere else in the query! The examples are using the column expressions in GROUP BY, for example.

Using T-SQL APPLY

Oracle and SQL Server have a syntax that I personally find more intuitive: APPLY. Because what we’re doing here is we’re applying a function (or subquery, which is a kind of function) to a table, creating a cross product between the table and the function (or subquery) result.

Look at this Oracle example:

SELECT actor_id, name, name_length, COUNT(*)
FROM actor
CROSS APPLY (SELECT first_name || ' ' || last_name AS name FROM dual)
CROSS APPLY (SELECT length(name) AS name_length FROM dual)
JOIN film_actor USING (actor_id)
GROUP BY actor_id, name, name_length
ORDER BY COUNT(*) DESC
FETCH FIRST 5 ROWS ONLY;

It does the exact same thing as the previous CROSS JOIN LATERAL example.

Dialect support

At least the following dialects support either LATERAL or APPLY or both:

  • Db2: LATERAL
  • Firebird: LATERAL
  • MySQL: LATERAL
  • Oracle: LATERAL and APPLY
  • PostgreSQL: LATERAL
  • Snowflake: LATERAL
  • SQL Server: APPLY

Of course, jOOQ supports both syntaxes and can emulate one via the other.

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button