Available in versions: Dev (3.18) | Latest (3.17) | 3.16 | 3.15 | 3.14 | 3.13 | 3.12 | 3.11 | 3.10 | 3.9 | 3.8

SQL: N+1

Applies to ✅ Open Source Edition   ✅ Express Edition   ✅ Professional Edition   ✅ Enterprise Edition

"Hooray, we're using jOOQ (i.e. SQL), so we got rid of our N+1 problems."

Well, there's a class of N+1 problems that jOOQ won't ever run into. It's those "accidental" N+1 queries that happen because of lazy loading. With jOOQ everything is always loaded "eagerly", exactly as you specify it in your queries. Unlike in ORMs, eager loading isn't automatic either. jOOQ doesn't just materialise large parts of your object graph on its own. Everything is done explicitly. But that means you can still run into explicit N+1 problems. For example:

// 1 query
for (Integer id : create
    .select(AUTHOR.ID)
    .from(AUTHOR)
    .fetch(AUTHOR.ID)
) {

    // N queries
    List<Integer> books =
    create.select(BOOK.ID)
          .from(BOOK)
          .where(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
          .fetch(BOOK.ID);
}

The N+1 problem (technically, it should have been named 1+N problem) can be seen easily above:

  • 1 query is executed to fetch all AUTHOR records.
  • N queries are executed to fetch all BOOK records per AUTHOR.

This particular query would much better be implemented with a simple SQL JOIN:

// 1 query
for (Record2<Integer, Integer> record : create
    .select(AUTHOR.ID)
    .from(AUTHOR)
    .join(BOOK)
    .on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
) {

    // No additional queries needed...
}

You can use one of jOOQ's many mapping capabilities to nest your collections directly in your Java logic, or use MULTISET or MULTISET_AGG to nest the collection directly in SQL, as long as you just don't run several roundtrips to the database.

N+1 isn't a problem that is strictly related to SQL. It just got popularised by ORMs that emphasise automatic population of object graphs via lazy loading, rather than emphasising querying. The underlying problem, however, is latency between a client (your Java code) and a server (your RDBMS), which is caused by too many round trips. See this blog post for an example that compares calling a stored procedure 1 time to fetch N items, vs. calling another stored procedure N times to fetch 1 item, each time: https://blog.jooq.org/the-cost-of-jdbc-server-roundtrips/. The result is the same.

But with SQL, things get even worse. If you're pushing the entire declarative query into the database, the database has freedom to choose between various eligible algorithms to produce the result (e.g. hash join vs nested loop join, etc.). If you loop over your N parent rows yourself, you're enforcing a nested loop join, which can be worse in addition to the extra latency, in case a hash join or merge join would have been better.

Feedback

Do you have any feedback about this page? We'd love to hear it!

The jOOQ Logo