Stop Mapping Stuff in Your Middleware. Use SQL’s XML or JSON Operators Instead

It’s been a while since I’ve ranted on this blog, but I was recently challenged by a Reddit thread to write about this topic, so here goes…

So, you’re writing a service that produces some JSON from your database model. What do you need? Let’s see:

  • Read a book on DDD
  • Read another book on DDD
  • Write some entities, DTOs, factories, and factory builders
  • Discuss whether your entities, DTOs, factories, and factory builders should be immutable, and use Lombok, Autovalue, or Immutables to ease the pain of construction of said objects
  • Discuss whether you want to use standard JPA, or Hibernate specific features for your mapping
  • Plug in Jackson, the XML and JSON mapper library, because you’ve read a nice blog post about it
  • Debug 1-2 problems arising from combining Jackson, JAXB, Lombok, and JPA annotations. Minor thing
  • Debug 1-2 N+1 cases

STOP IT

No, seriously. Just stop it right there!

What you needed was this kind of JSON structure, exported from your favourite Sakila database:

 [{
   "first_name": "PENELOPE",
   "last_name": "GUINESS",
   "categories": [{
     "name": "Animation",
     "films": [{ "title": "ANACONDA CONFESSIONS" }]
   }, {
     "name": "Family",
     "films": [{ "title": "KING EVOLUTION" }, { "title": "SPLASH GUMP" }]
   }]
 }, {
   ...

In English: We need a list of actors, and the film categories they played in, and grouped in each category, the individual films they played in.

Let me show you how easy this is with SQL Server SQL (all other database dialects can do it these days, I just happen to have a SQL Server example ready):

 -- 1) Produce actors
 SELECT
   a.first_name,
   a.last_name, (

     -- 2) Nest categories in each actor
     SELECT
       c.name, (

         -- 3) Nest films in each category
         SELECT title
         FROM film AS f
         JOIN film_category AS fc ON f.film_id = fc.film_id
         JOIN film_actor AS fa ON fc.film_id = fa.film_id
         WHERE fc.category_id = c.category_id
         AND a.actor_id = fa.actor_id
         FOR JSON PATH -- 4) Turn into JSON
       ) AS films
     FROM category AS c
     JOIN film_category AS fc ON c.category_id = fc.category_id
     JOIN film_actor AS fa ON fc.film_id = fa.film_id
     WHERE fa.actor_id = a.actor_id
     GROUP BY c.category_id, c.name
     FOR JSON PATH -- 4) Turn into JSON
   ) AS categories
 FROM actor AS a
 FOR JSON PATH, ROOT ('actors') -- 4) Turn into JSON

That’s it. That’s all there is to it. Only basic SQL-92, enhanced with some vendor-specific JSON export syntax. (There are also SQL standard JSON functions, such as JSON_OBJECT and JSON_ARRAYAGG, as implemented in other RDBMS.) Let’s discuss it quickly:

  1. The outermost query produces a set of actors, as you would have expected
  2. For each actor, a correlated subquery produces a nested JSON array of categories
  3. For each category, another correlated subquery finds all the films per actor and category
  4. Finally, turn all the result structures into JSON
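
For comparison, here is how the same four steps might look in PostgreSQL. This is only a sketch: it assumes the standard Sakila table and column names used above, and it swaps SQL Server's FOR JSON PATH for PostgreSQL's json_agg and json_build_object functions. Because json_agg is an aggregate function, the GROUP BY that deduplicated the categories is replaced by an EXISTS predicate here.

```sql
-- PostgreSQL sketch (assumes the Sakila schema; not the query from above)
-- 1) Produce actors, 4) aggregated into a single JSON array
SELECT json_agg(json_build_object(
  'first_name', a.first_name,
  'last_name', a.last_name,
  'categories', (

    -- 2) Nest categories in each actor
    SELECT json_agg(json_build_object(
      'name', c.name,
      'films', (

        -- 3) Nest films in each category
        SELECT json_agg(json_build_object('title', f.title))
        FROM film AS f
        JOIN film_category AS fc ON f.film_id = fc.film_id
        JOIN film_actor AS fa ON fc.film_id = fa.film_id
        WHERE fc.category_id = c.category_id
        AND fa.actor_id = a.actor_id
      )
    ))
    FROM category AS c

    -- Keep only the categories this actor has played in
    WHERE EXISTS (
      SELECT 1
      FROM film_category AS fc
      JOIN film_actor AS fa ON fc.film_id = fa.film_id
      WHERE fc.category_id = c.category_id
      AND fa.actor_id = a.actor_id
    )
  )
)) AS actors
FROM actor AS a;
```

This sketch returns a single row containing a top-level JSON array, without a ROOT wrapper. Since json_agg returns NULL for an empty set, an actor without any films gets "categories": null. Wrap the subquery in coalesce(..., '[]'::json) if you prefer an empty array.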

That’s it.

Want to change the result structure? Super easy. Just modify the query accordingly. No need to modify:

  • Whatever you thought your DDD “root aggregate” was
  • Your gazillion entities, DTOs, factories, and factory builders
  • Your gazillion Lombok, Autovalue, or Immutables annotations
  • Your hacks and workarounds to get this stuff through your standard JPA, or Hibernate specific features for your mapping
  • Your gazillion Jackson, the XML and JSON mapper library annotations
  • Debugging another 1-2 problems arising from combining Jackson, JAXB, Lombok, and JPA annotations
  • Debugging another 1-2 N+1 cases

No! No need! It’s so simple. Just stream the JSON directly from the database to the client using whatever SQL API you prefer: JDBC, jOOQ, JdbcTemplate, MyBatis, or even a JPA native query. Just don’t go mapping that stuff in the middleware if you’re not consuming it in the middleware. Let me repeat that for emphasis:

Don’t go mapping that stuff in the middleware if you’re not consuming it in the middleware.

Oh, want to switch to XML? Easy. In SQL Server, this amounts to almost nothing but replacing JSON by XML:

 SELECT
   a.first_name,
   a.last_name, (
     SELECT
       c.name, (
         SELECT title
         FROM film AS f
         JOIN film_category AS fc ON f.film_id = fc.film_id
         JOIN film_actor AS fa ON fc.film_id = fa.film_id
         WHERE fc.category_id = c.category_id
         AND a.actor_id = fa.actor_id
         FOR XML PATH ('film'), TYPE
       ) AS films
     FROM category AS c
     JOIN film_category AS fc ON c.category_id = fc.category_id
     JOIN film_actor AS fa ON fc.film_id = fa.film_id
     WHERE fa.actor_id = a.actor_id
     GROUP BY c.category_id, c.name
     FOR XML PATH ('category'), TYPE
   ) AS categories
 FROM actor AS a
 FOR XML PATH ('actor'), ROOT ('actors')

And now, you’re getting:

 <actors>
   <actor>
     <first_name>PENELOPE</first_name>
     <last_name>GUINESS</last_name>
     <categories>
       <category>
         <name>Animation</name>
         <films>
           <film>
             <title>ANACONDA CONFESSIONS</title>
           </film>
         </films>
       </category>
       <category>
         <name>Family</name>
         <films>
           <film>
             <title>KING EVOLUTION</title>
           </film>
           <film>
             <title>SPLASH GUMP</title>
           </film>
         </films>
       </category>
       ...

It’s so easy with SQL!

Want to support both without rewriting too much logic? Produce XML and use XSLT to automatically generate the JSON. Whatever.

FAQ, Q&A

But my favourite Java SQL API can’t handle it

So what. Write a view and query that instead.
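
A minimal sketch of that idea in SQL Server, trimmed down to just the actor names. The view name actors_json and the column name json_document are made up for this example. The FOR JSON query is wrapped in a scalar subquery so that the view exposes one ordinary, named string column, on the assumption that this is what a picky client API handles best:

```sql
-- Hypothetical view; actors_json and json_document are illustrative names
CREATE VIEW actors_json AS
SELECT (

  -- The vendor-specific JSON syntax stays hidden inside the view
  SELECT a.first_name, a.last_name
  FROM actor AS a
  FOR JSON PATH, ROOT ('actors')
) AS json_document;
```

Your Java SQL API then only ever sees SELECT json_document FROM actors_json, which is about as boring as SQL gets.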

But this doesn’t fit our architecture

Then fix the architecture.

But SQL is bad

No, it’s great. It’s based on relational algebra and augments it in many, many useful ways. It’s a declarative 4GL, the optimiser produces way better execution plans than you could ever imagine (see my talk), and it’s way more fun than your gazillion 3GL mapping libraries.

But what about testing

Just spin up a test database with https://www.testcontainers.org, install your schema in it with a migration framework like Flyway or Liquibase, fill in some sample data, and write your simple integration tests.

But mocking is better

It is not. The more you mock away the database, the more you’re writing your own database.

But I’m paid by the lines of code

Well, good riddance, then.

But SQL is evil because of Oracle

Then use PostgreSQL. It can do JSON.

But what if we have to change the RDBMS

So what? Your management paid tens of millions for the new licensing. They can pay you tens of hundreds to spend 20 minutes rewriting your 5-10 SQL queries. You already wrote the integration tests above.

What was that talk of yours again?

Here, highly recommended:

[embedded content]

But we’ve already spent so many person years implementing our middleware

It has a name

But I’ve read this blog post

And now you’ve read mine.

But that’s like 90s style 2 tier architecture

So what? You’ve spent 5% of the time to implement it. That’s 95% more time adding value to your customers, rather than bikeshedding mapping technology. I call that a feature.
