Author here. I use queues all the time. I program in Erlang and there is one unbounded queue in every single process' mailbox in every node.
The problem is not queues themselves; it's when queues are used to 'optimize' an application (usually the front end) by removing the synchronous aspect of things. In some cases that's reasonable: long-running tasks like video processing are better handled asynchronously than by blocking up the front end, and give friendlier results.
I've seen queues applied plenty of times where their use case is to handle temporary overload (by raising latency and pushing some work to be done later, when the system slows down). The problem is that as time goes on, that temporary overload becomes more and more permanent (or has such a magnitude that you can't catch up on it during your slow hours), and then it suddenly starts exploding very violently.
How much time the queue buys you is unknown, because the people running into these issues have likely not been able to characterize the load on their system, or the peak load they'd see (because the system dies before then). It's a challenging problem. Ideally you fix it by putting boundaries on your system (even on your queue! that's good!) and expecting to eventually lose data or block producers.
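To make the "bound it, then block or drop" idea concrete, here's a minimal sketch in Python. The queue size of 100 and the function names are arbitrary illustrations, not anything from the original discussion; the point is that a bounded queue forces you to pick an explicit overload policy.

```python
import queue

# A bounded queue makes the overload policy explicit: when it fills up,
# you either block the producer (backpressure) or drop the work (load
# shedding). The maxsize of 100 here is an arbitrary example.
jobs = queue.Queue(maxsize=100)

def submit_blocking(job):
    # Backpressure: the caller waits until there is room, so overload
    # slows producers down instead of growing memory without bound.
    jobs.put(job)

def submit_or_shed(job):
    # Load shedding: reject the job immediately when the queue is full,
    # so the caller can fail fast, retry later, or surface an error.
    try:
        jobs.put_nowait(job)
        return True
    except queue.Full:
        return False
```

Either policy is a deliberate design decision; the unbounded default just defers the decision until the machine makes it for you by running out of memory.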
The really, really dangerous thing is using a queue while never expecting to lose data. That's when things go very wrong, because that assumption leads to bad system design (the end-to-end principle and idempotence are often ignored).
And then people go on to blame queues as the problem, when the real issue was that they used a queue to solve overload without applying proper resource management to it.
Most of what I rant about in that blog post is stuff I've seen happen and even participated in. I don't want that anymore.
I understand where you are coming from, and I'd even recommend your article to people considering queuing. I apologize if I made it sound like I didn't like or understand the article. It just felt a bit heavy on dissuading people from queuing; however, reading through it again, I can see that the intent is to describe why queues should not be used in one very particular context.
It would be nice to see a similar write-up on appropriate use of queuing, with more plumbing visuals.
> How much time the queue buys you is unknown, because the people running into these issues have likely not been able to characterize the load on their system, or the peak load they'd see (because the system dies before then). It's a challenging problem.
Is this a case for using queueing theory to model your system's behaviour?
Even if you don't go down that path (and there are tools for it[0]), using queues to connect components helps find bottlenecks in one very simple way: you can look for where the queues are growing. As a rule of thumb, the longest queue is going to be in front of the bottleneck.
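That rule of thumb is easy to see in a toy simulation. Below is a sketch, not anything from the thread: a three-stage pipeline with made-up per-tick service rates, where each tick the source emits 5 items and each stage drains at most its rate from the queue in front of it. The stage names and rates are invented for illustration.

```python
from collections import deque

# Toy pipeline: parse -> enrich -> store, with made-up service rates.
# 'enrich' is deliberately the slowest stage (2 items/tick vs 5 arriving).
rates = {"parse": 5, "enrich": 2, "store": 4}
order = ["parse", "enrich", "store"]
queues = {name: deque() for name in order}

def tick():
    for item in range(5):
        queues["parse"].append(item)   # source: 5 items per tick
    for i, name in enumerate(order):
        # Each stage drains at most its rate and hands results downstream.
        for _ in range(min(rates[name], len(queues[name]))):
            item = queues[name].popleft()
            if i + 1 < len(order):
                queues[order[i + 1]].append(item)

for _ in range(10):
    tick()

# The longest queue sits in front of the slowest stage, 'enrich'.
print({name: len(queues[name]) for name in order})
# → {'parse': 0, 'enrich': 30, 'store': 0}
```

The queue in front of the bottleneck grows by (arrival rate − service rate) per tick, while every other queue stays near empty; watching queue lengths in production gives you the same signal without any modelling.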
The longest queue is going to be in front of the tightest bottleneck, but that is not necessarily the bottleneck most critical to your application.
If you find a bottleneck halfway through the pipe and you make it go away, the true bottleneck down in the wall still remains, and all the optimizations you did above it just end up opening the floodgates wider.
Once you've protected that kind of at-risk core, it's easier and safer to measure everything that sits above it in the stack, and it puts an upper bound on how much you need to optimize (you stop when you hit the rate limits of the core component).
It seems like you're really talking more about the danger of unbounded queues (and the virtue of bounded queues). A synchronous process is effectively just a bounded queue with a maximum capacity of 1.
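That equivalence can be demonstrated directly. The sketch below (my own illustration, with made-up timings) uses a Python `queue.Queue` with `maxsize=1`: the producer can never get more than one item ahead, so it is throttled to the consumer's pace, exactly like a synchronous call site.

```python
import queue
import threading
import time

# A bounded queue of capacity 1 behaves like a synchronous handoff:
# the producer cannot run ahead of the consumer.
handoff = queue.Queue(maxsize=1)
results = []

def consumer():
    for _ in range(3):
        time.sleep(0.05)           # a deliberately slow consumer
        results.append(handoff.get())
        handoff.task_done()

t = threading.Thread(target=consumer)
t.start()

start = time.monotonic()
for i in range(3):
    handoff.put(i)                 # blocks whenever the queue is full
handoff.join()                     # wait until every item is processed
elapsed = time.monotonic() - start
t.join()

# The producer finished only at the consumer's pace (~0.15 s here,
# not near-instantly as it would with an unbounded queue).
```

From this angle, "synchronous vs. queued" isn't a binary: it's a single knob, the queue bound, ranging from 1 (fully synchronous) to infinity (fully decoupled, and fully exposed to overload).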