“When all you have is a hammer, every problem looks like a nail.”
When all you know is SQL Server, it’s tempting to try and solve every problem with a relational database, but often it’s not the best tool for the job. Today I want to introduce you to an excellent example of this, the ‘Database As Queue’ anti-pattern. It’s a common specimen that you’ll find in the wild in many enterprise development shops.
It works like this: Usually there’s a piece of data that needs to be processed through some workflow, often by a number of different applications or processes.
There’s usually a table with some data and a status column. Each process is designed to process data at a particular status. Process 1 in this example inserts records with the status ‘New’, Process 2 picks up ‘New’ records and updates the status to ‘P2 Complete’, Process 3 picks up ‘P2 Complete’ records… you get the picture.
When I say ‘picks up’, what I mean is that each process sits there polling the database at some interval running a select statement something like this:
SELECT * FROM MyWorkflow WHERE Status = 'New'
Why is it an anti-pattern?
Firstly, there’s the polling. Either you have a short interval and hammer your database into submission, or you have a long interval and even the fastest route through your system is the sum of all the (long) intervals.
Secondly there’s the fact that SQL Server is very efficient at insertions, or updates, or queries, but rarely all three on the same table. To make the polling queries perform, you need to put an index on the status, but indexes make for slow inserts. Locks can become a major problem.
Thirdly there’s the issue of how you clear records once the workflow is complete. If you don’t clear the table down at intervals, it will keep growing until it starts to hit the performances of your queries and eventually become a serious operations problem. The problem is, deletes are inefficient on SQL Server.
Fourthly, sharing a database between applications (or services) is a bad thing. It’s just too tempting to put amorphous shared state in there and before you know it you’ll have a hugely coupled monster.
So what should I do?
Simple, use the right tool for the job: this scenario is crying out for a messaging system. It solves all the problems described above; no more polling, efficient message delivery, no need to clear completed messages from queues, and no shared state.