The Many Ways to Return Data From SQL DML – Java, SQL and jOOQ.
Probably the hardest thing to standardise in SQL is RETURNING
data from DML statements. In this article, we’ll look at various ways of doing that with jOOQ, in many of jOOQ’s supported dialects, and with JDBC directly.
How to do it with jOOQ
Assuming the usual table from the sakila database:
CREATE TABLE actor (
id INT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
first_name TEXT,
last_name TEXT,
last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
jOOQ took syntactic inspiration from Firebird, MariaDB, PostgreSQL, Oracle PL/SQL, which have quite the intuitive syntax. On any DML statement (i.e. INSERT
, UPDATE
, DELETE
), you can just append RETURNING
, like this:
INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
RETURNING id, last_update
In summary, the RETURNING
clause acts like any projection, i.e. what you usually do with SELECT
. That means, you can also just return everything:
INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
RETURNING *
So, in jOOQ, this just translates to
ActorRecord actor
ctx.insertInto(ACTOR, ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
.values("John", "Doe")
.returning()
.fetchOne();
Alternatively, use returningResult()
to return arbitrary projections. We’ve already seen how this is translated to:
- Firebird
- MariaDB
- PostgreSQL (and related dialects, such as CockroachDB, YugabyteDB)
- Oracle PL/SQL
Let’s look into some details of how others support the syntax:
How this is supported in PL/SQL
While Oracle PL/SQL supports the syntax per se, it isn’t exactly equivalent to PostgreSQL’s. First off, when jOOQ knows that it’s only inserting a single row, then it delegates the returning of data to the JDBC driver via Statement.RETURN_GENERATED_KEYS
. So the generated SQL doesn’t have any trace of the RETURNING
clause (although ojdbc will add it later on). See below for more details.
But when jOOQ doesn’t know the number of rows, or when it knows you’re inserting more than one row, then it falls back to a much more sophisticated emulation (actual contents may differ):
DECLARE
-- Variables for input data
i0 DBMS_SQL.VARCHAR2_TABLE;
i1 DBMS_SQL.VARCHAR2_TABLE;
-- Variables for output data
o0 DBMS_SQL.VARCHAR2_TABLE;
o1 DBMS_SQL.TIMESTAMP_TABLE;
c0 sys_refcursor;
c1 sys_refcursor;
BEGIN
-- Input data
i0(1) := ?;
i0(2) := ?;
i1(1) := ?;
i1(2) := ?;
-- Use Oracle's FORALL statement for bulk insertion
FORALL i IN 1 .. i0.count
INSERT INTO actor (first_name, last_name)
VALUES (i0(i), i1(i))
RETURNING id, last_update
BULK COLLECT INTO o0, o1;
-- Fetch the update count
? := sql%rowcount;
-- Fetch the returned data
OPEN c0 FOR SELECT * FROM table(o0);
OPEN c1 FOR SELECT * FROM table(o1);
? := c0;
? := c1;
END;
It’s quite the laborious task to generate this SQL, as well as to implement the correct sequence of JDBC API calls to bind all inputs and return all outputs. Certainly not something you’d want to write manually.
I’m still hoping Oracle will improve their support for combining various syntactic features to make the above workarounds obsolete. None of this is necessary in PostgreSQL, which supports almost the same syntax inside of the database.
How this is supported in Db2, H2, standard SQL
The SQL standard has a native syntax for this, and it’s slightly more powerful than PostgreSQL’s, although I’d say quite less readable. It’s called the <data change delta table>
, and it looks something like this:
SELECT id, last_update
FROM FINAL TABLE (
INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
) a
So, instead of tweaking the INSERT
statement syntax, there’s a NEW TABLE
operator, which takes a DML statement as argument with the following modifiers:
OLD
: returns the data as it was prior to any default / trigger generated values or before theUPDATE
,DELETE
of data, in case that type of statement was executed.NEW
: returns the data as it was after any default or theUPDATE
in case that type of statement was executed, but before trigger generated valuesFINAL
: returns the data as it was actually inserted, i.e. after all trigger generated values, or after theUPDATE
in case that type of statement was executed
You can then further process results from such inserts directly in SQL, with a few limitations (e.g. some joins, set operations, aggregations are forbidden, but you can filter and project).
The most powerful use-case for these keywords is when used with an
UPDATE
statement, as it gives access to data before or after the actualUPDATE
was made.Regrettably, you have to make up your mind about a single keyword. You cannot access all 3 versions of the data in a single statement (see SQL Server’s
OUTPUT
further down, which can do this), e.g. when implementing anUPDATE
audit log.
Note that unlike RETURNING
, this syntax also works with MERGE
Let’s get back to PostgreSQL once more
If you want to do something as powerful as the above, in PostgreSQL, there’s an arcane syntactic feature where you can place RETURNING
statements in a CTE (WITH
clause) as follows:
WITH
a (id, last_update) AS (
INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
RETURNING id, last_update
)
SELECT *
FROM a;
Curiously, you can’t do the same thing in a derived table. E.g. this won’t work, even if otherwise, CTE and derived tables are largely logically equivalent:
-- Doesn't work:
SELECT *
FROM (
INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
RETURNING id, last_update
) a (id, last_update);
The other dialects that mimicked PostgreSQL’s syntax don’t support the above, i.e. Firebird, MariaDB, and Oracle PL/SQL.
SQL Server’s OUTPUT clause
SQL Server has an OUTPUT
clause that might be a bit weird, syntactically, but it’s also a bit more powerful even than Db2’s FINAL TABLE
as it allows to access data both from before and after the modifications at the same time.
For this, SQL Server introduces INSERTED
and DELETED
pseudo tables, which contain the data after or before the UPDATE
.
The drawback is that there’s no native way to access trigger generated values in SQL Server, so jOOQ’s emulation is a bit more elaborate,
-- Declare an in-memory table for the results
DECLARE @result TABLE (
id INT,
last_update DATETIME2
);
-- Insert the data and return the results into the in-memory table
INSERT INTO actor (first_name, last_name)
OUTPUT inserted.id, inserted.last_update
INTO @result
VALUES ('John', 'Doe');
-- Merge trigger generated values into the previous results
MERGE INTO @result r
USING (
SELECT actor.id, actor.last_update AS x
FROM actor
) s
ON r.id = s.id
WHEN MATCHED THEN UPDATE SET last_update = s.x;
-- Return the results to the client
SELECT id, last_update
FROM @result;
Again, the actual SQL may differ a bit, but you get the point. This all needs to be done to be able to fetch trigger generated values. Currently, jOOQ’s runtime isn’t aware of tables having triggers, although that might change in the future with #13912.
Note that unlike RETURNING
, this syntax also works with MERGE
Using JDBC to fetch generated keys (Oracle, HSQLDB)
As mentioned above for Oracle, we can also use JDBC to fetch generated keys. In principle, this is how it works:
try (PreparedStatement s = c.prepareStatement(
"INSERT INTO actor (first_name, last_name) VALUES (?, ?)",
new String[] "ID", "LAST_UPDATE"
))
s.setString(1, firstName);
s.setString(2, lastName);
s.executeUpdate();
try (ResultSet rs = s.getGeneratedKeys())
while (rs.next())
System.out.println("ID = " + rs.getInt(1));
System.out.println("LAST_UPDATE = " + rs.getTimestamp(2));
Unfortunately, apart from HSQLDB and Oracle, almost no JDBC driver implements this!
Using JDBC to fetch generated keys (others)
In most other SQL dialects, we have to do something like this:
try (PreparedStatement s = c.prepareStatement(
"INSERT INTO actor (first_name, last_name) VALUES (?, ?)",
Statement.RETURN_GENERATED_KEYS
))
s.setString(1, firstName);
s.setString(2, lastName);
s.executeUpdate();
try (ResultSet rs = s.getGeneratedKeys())
System.out.println("ID = " + rs.getInt(1));
// But there's no way to access LAST_UPDATE here. We'll
// have to run another query
Many JDBC drivers support this in some way, but not all of them support this approach:
- For multiple inserted rows
- For statements other than
INSERT
- For tables without an identity, or where the identity isn’t the primary key
Conclusion
As ever so often, the differences between various SQL vendors are vast, both in terms of:
jOOQ has been hacking JDBC for you, so you don’t have to. With jOOQ, all of the above typically works like this, on all dialects, at least when you’re inserting a single row:
ActorRecord actor
ctx.insertInto(ACTOR, ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
.values("John", "Doe")
.returning()
.fetchOne();