Automatically transform Oracle style implicit joins to ANSI JOIN using jOOQ – Java, SQL, and jOOQ.


While jOOQ is mainly used as an internal SQL DSL for Dynamic SQL embedded in Java, where it offers the best solution on the market, jOOQ is also increasingly used for one of its secondary functions: Its analyzer.

Having been introduced in jOOQ 3.9 primarily for the purpose of being able to parse DDL statements, for example to reverse engineer your schema for code generation purposes, we have added an increasing number of SQL transformation features and capabilities that allow the parser to be used as a stand-alone product, via the command line interface, our website, or via the regular jOOQ API.

One feature that has been added to jOOQ 3.14, which is especially useful for those who use jOOQ as a parser, is the ability to turn old Oracle style implicit joins into ANSI JOIN.

Why avoid “implicit joins”?

The older Oracle style implicit join syntax is supported and well optimized by most RDBMS vendors. In the past, before SQL-92, this is how we used inner join tables, for example when querying the Sakila Database:


SELECT *
FROM actor a, film_actor fa, film f
WHERE a.actor_id = fa.actor_id
AND fa.film_id = f.film_id

Admittedly, the syntax is rather intuitive. Simply declare all the tables that you want to extract data from, and then make sure that the correct data is kept only by filtering out the corresponding primary key / foreign key values.

Of course, it can go very wrong. For the many obvious reasons, such as when you forget a join predicate after adding a table. If the query is complex, it can be difficult to debug. The solution is ANSI JOIN. Starting with SQL-92 (almost 30 years now!), Here’s how we join most RDBMS:


SELECT *
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
JOIN film f ON fa.film_id = f.film_id

If it is still possible to define bad join predicates, at least it is no longer possible to forget a predicate, because it is syntactically incorrect (except for MySQL, where, unfortunately, the ON clause is optional):


SELECT *
FROM actor a
JOIN film_actor fa -- Syntax error
JOIN film f -- Syntax error

Implicit JOIN of jOOQ

Note that it is common to refer to the above syntax as “implicit join”, while JPQL and jOOQ have recycled the term for another type of “Implicit join”, which is based on a foreign key path and even less error prone than the ANSI SQL syntax. With jOOQ, the above query can be written as follows:


ctx.select(
      FILM_ACTOR.actor().asterisk(), 
      FILM_ACTOR.asterisk(),
      FILM_ACTOR.film().asterisk())
   .from(FILM_ACTOR)
   .fetch();

The mere presence of these one-to-one relationship paths in the query will implicitly add the LEFT JOIN or INNER JOIN at FROM clause. This is simply a convenience in addition to regular ANSI joins, not a replacement.

Oracle implicit join transformation

When you have an old codebase that you want to upgrade and transform all your queries using ANSI JOIN, use jOOQ for that. You can use the programming capabilities of jOOQ (as mentioned earlier) or the free website https://www.jooq.org/translate.

On the website, just choose the option “Oracle style to ANSI JOIN”, place the following SQL code on the left:

Contribution


SELECT
  a.first_name, 
  a.last_name, 
  count(c.category_id)
FROM
  actor a, 
  film_actor fa, 
  film f,
  film_category fc, 
  category c
WHERE a.actor_id = fa.actor_id
AND fa.film_id = f.film_id
AND fc.category_id = c.category_id
GROUP BY 
  a.actor_id,
  a.first_name,
  a.last_name

Production


SELECT
  a.first_name,
  a.last_name,
  count(c.category_id)
FROM actor a
  JOIN film_actor fa
    ON a.actor_id = fa.actor_id
  JOIN film f
    ON fa.film_id = f.film_id
  CROSS JOIN (
    film_category fc
      JOIN category c
        ON fc.category_id = c.category_id
  )
GROUP BY
  a.actor_id,
  a.first_name,
  a.last_name

And… oops. The output correctly displays the resulting unwanted CROSS JOIN, because one of the join predicates was missing:

Yes, the tool has already helped! Let’s fix the input request:

Fixed entry


SELECT
  a.first_name, 
  a.last_name, 
  count(c.category_id)
FROM
  actor a, 
  film_actor fa, 
  film f,
  film_category fc, 
  category c
WHERE a.actor_id = fa.actor_id
AND fa.film_id = f.film_id
AND f.film_id = fc.film_id -- This was missing
AND fc.category_id = c.category_id
GROUP BY 
  a.actor_id,
  a.first_name,
  a.last_name

Fixed output


SELECT
  a.first_name,
  a.last_name,
  count(c.category_id)
FROM actor a
  JOIN film_actor fa
    ON a.actor_id = fa.actor_id
  JOIN film f
    ON fa.film_id = f.film_id
  JOIN film_category fc
    ON f.film_id = fc.film_id
  JOIN category c
    ON fc.category_id = c.category_id
GROUP BY
  a.actor_id,
  a.first_name,
  a.last_name

This also works if you were using Oracle’s obscure outer join syntax using (+) (or SQL Server *=, which has not been supported for some time now). You might have this entry:

Contribution


SELECT 
  a.first_name, 
  a.last_name, 
  count(c.category_id)
FROM 
  actor a, 
  film_actor fa, 
  film f,
  film_category fc, 
  category c
WHERE a.actor_id = fa.actor_id(+)
AND fa.film_id = f.film_id(+)
AND f.film_id = fc.film_id(+)
AND fc.category_id(+) = c.category_id
GROUP BY 
  a.actor_id,
  a.first_name,
  a.last_name

Produce this output


SELECT
  a.first_name,
  a.last_name,
  count(c.category_id)
FROM actor a
  LEFT OUTER JOIN film_actor fa
    ON a.actor_id = fa.actor_id
  LEFT OUTER JOIN film f
    ON fa.film_id = f.film_id
  LEFT OUTER JOIN (
    film_category fc
      RIGHT OUTER JOIN category c
        ON fc.category_id = c.category_id
  )
    ON f.film_id = fc.film_id
GROUP BY
  a.actor_id,
  a.first_name,
  a.last_name

Uh, wat. Oops again! One of the (+) symbols were on the wrong side, that’s why we got that RIGHT OUTER JOIN. Again, the tool showed that the old syntax was quite error prone. Let’s fix it.

Fixed entry


SELECT 
  a.first_name, 
  a.last_name, 
  count(c.category_id)
FROM 
  actor a, 
  film_actor fa, 
  film f,
  film_category fc, 
  category c
WHERE a.actor_id = fa.actor_id(+)
AND fa.film_id = f.film_id(+)
AND f.film_id = fc.film_id(+)
AND fc.category_id = c.category_id(+)
GROUP BY 
  a.actor_id,
  a.first_name,
  a.last_name

Fixed output


SELECT
  a.first_name,
  a.last_name,
  count(c.category_id)
FROM actor a
  LEFT OUTER JOIN film_actor fa
    ON a.actor_id = fa.actor_id
  LEFT OUTER JOIN film f
    ON fa.film_id = f.film_id
  LEFT OUTER JOIN film_category fc
    ON f.film_id = fc.film_id
  LEFT OUTER JOIN category c
    ON fc.category_id = c.category_id
GROUP BY
  a.actor_id,
  a.first_name,
  a.last_name

Conclusion

Play with it and let us know what you think! https://www.jooq.org/translate



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *