Idempotence doesn't solve this problem. The jobs are all idempotent. The problem...

_sojh · on April 14, 2023

Or install an opensource gem[1] that recreates the functionality using the same redis rpoplpush[2] command

[1] https://gitlab.com/gitlab-org/ruby/gems/sidekiq-reliable-fet...

[2] https://redis.io/commands/rpoplpush/#pattern-reliable-queue

aqme28 · on April 14, 2023

Fair enough about idempotence.

I'm still confused about what you're saying though. You're saying that the language of "enhanced reliability" doesn't reflect losing 2 jobs over about 50*7 million (from your other comment)?

And that if you didn't pay for the service, you'd have to add some checks to make up for this?

That all seems incredibly reasonable to me.

danenania · on April 14, 2023

Crashes are under your control though. They’re not caused by sidekiq. And you could always add your own crash recovery logic, as you say. To me that makes it a reasonable candidate for a pro feature.

It’s hard to get this right though. No matter where the line gets drawn, free users will complain that they don’t get everything for free.

Mavvie · on April 14, 2023

How are crashes under your control? Again they aren't talking about uncaught exceptions, but crashes. So maybe the server gets unplugged, the network disconnects, etc.

danenania · on April 14, 2023

To me 'crash' means any unexpected termination, whether it's caused by an uncaught exception, OOM, or hardware/network issues.

I guess you can say that hardware issues on your host aren't under your control, but it's under your control to find a host that doesn't have these issues. And not even a full-on ACID database is going to be 100% reliable if you yank the power cord at the wrong moment.

Mavvie · on April 14, 2023

I hope my tone doesn't come across as rude or too argumentative, but I think your understanding is a bit inaccurate.

> it's under your control to find a host that doesn't have these issues

All hosts will have these issues, the only question is how often. If you need 100% consistency, then you can't use the free Sidekiq. Personally, I've never needed Sidekiq pro (as these kinds of crashes are extremely rare). But this will depend on your scale and use case.

> And not even a full-on ACID database is going to be 100% reliable if you yank the power cord at the wrong moment

This is only true if there's bugs in the DB, or some underlying disk corruption happens. The whole point of an ACID database is that they're atomic, durable, and consistent, even in the worst case scenario. If a power failure corrupted my SQL database I would feel very betrayed by the database.

danenania · on April 14, 2023

It wouldn’t be corrupted, but in-flight transactions could fail to commit, just like queued jobs can be lost with sidekiq. The failure modes are similar.

I take your point that at a certain scale, hardware failure is inevitable, but if you’re running that many servers, you can afford sidekiq’s enterprise plan. It’s not something that will realistically happen if you’re just running like 20 instances on AWS. It’s perfectly reasonable to charge extra for something only large organizations with huge infrastructure budgets need.

Mavvie · on April 14, 2023

For sure, I agree with you.

I would say that queued jobs being lost is different from an in-flight transaction being auto-rolled-back, but it's not a super important distinction. Like others have said, I think Sidekiq really nailed the free vs premium features and its success is evidence of that.