\* Disclosure: I sometimes contribute to NSQ. We use NSQ at Dell for the commerc...

* Disclosure: I sometimes contribute to NSQ.

We use NSQ at Dell for the commercial side of dell.com. We've been in Production with it for about 2 years.

> what are the typical use cases for NSQ ?

In the abstract, anything which can tolerate near real-time, at-least-once delivery, and does not need order guarantees. It also features retries and manual requeuing. It's typical to think order and exactly-once semantics are important because that's how we tend to think when we write code and work with (most) databases, and having order allows you to make more assumptions and simplify your approach. It typically comes at the cost of coordination or a bounded window of guarantees. Depending on your workload or how you frame the problem you may find order and exactly-once semantics are not that important, or it can be made unimportant (for example, making messages idempotent). In other cases order is important and it's worth the tradeoff; our Data Science team uses Kafka for these cases, but I'm not familiar with the details.

Here are some concrete examples of things we built using NSQ, roughly in the order they were deployed to PROD:

- Batch jobs which query services and databases to transform and store denormalized data. We process tens of millions of messages in a relatively short amount of time overnight. The queue is never the bottleneck; it's either our own code, services, or reading/writing to the database. Retries are surprisingly useful in this scenario.

- Eventing from other applications to notify a pinpoint refresh is needed for some data into the denormalized store (for example, a user updated a setting in their store, which causes a JSON model to update).

- Purchase order message queue, both for the purpose of retry and simulating what would happen if a customer on a legacy version of the backend was migrated to the new backend; also verifying a set of known 'good' orders continue to be good as business logic evolves (regression testing).

- Async invoice/email generation. This is a case where you have to be careful of at-least-once delivery and need to use a correlation ID and persistence layer to define a 'point of no return' (can't process the message again beyond this point even if it fails). We don't want to email (or bill) customers twice.

- Build system for distributing requests to our build farm.

- Pre-fetching data and hydrating a cache when a user logs in or browses certain pages, anticipating the likely next page to avoid having the user wait on these pages for an expensive service call. The client in this case is another decoupled web application; the application emitting the event is completely separate and likely on a different deployment schedule from the emitting application. The event emitted tells us what the user did, and it's the consumer's responsibility to determine what to do. This is an interesting case where we use #ephemeral channels, which disappear when the last client disconnects. We append the application's version to the channel name so multiple running versions in the same environment will each get their own copy of the message, and process it according to that binary's logic. This is useful for blue/green/canary testing and also when we're mid-deployment and have different versions running in PROD, one customer facing and one internal still being tested. I think I refer to this image more than any other when explaining NSQ's topics and channels: https://f.cloud.github.com/assets/187441/1700696/f1434dc8-60... (from http://nsq.io/overview/design.html).

Operationally, NSQ has been not just a pleasure to work with but inspirational to how we develop our own systems. Being operator friendly cannot be overrated.

Last thing, if you do monitoring with Prometheus I recommend https://github.com/lovoo/nsq_exporter.