Hacker News new | past | comments | ask | show | jobs | submit login

Random 64 bit ids are used in the process. So the probability of accidentally completing a tuple is very, very small (1 / 2^64 for every ack).

Counters don't work because of the asynchronous nature of Storm. For example, consider a topology that looks like this:

A -> B -> C

      \-> D
Let's say A emits 2 tuples (+2 differential), B processed those and emits 2 to C and 3 to D (+3 differential), C processes 2 tuples (-2 differential), and D processes 3 tuples (-3 differential).

Everything's asynchronous, so the acker could receive the acks in this order: A, C, B, D

The counter would then look like this: 2, 0, 3, 0

So it would think that the tuple was complete before it actually was, which means the counter algorithm doesn't work.




"So the probability of accidentally completing a tuple is very, very small (1 / 2^64 for every ack)."

Which means, due to the birthday paradox, that you should expect your first collision around 2^32 tuples. Process a few million tuples a day (that's only about 25 per second) and you should expect your first error within your first year.

EDIT: I'm wrong, see the explanations below.


The birthday paradox doesn't apply. All that matters it the value at the time the ack is applied, so it's always 1/2^64 (because the xor of any number of random numbers is still random).


No, the birthday paradox very much applies. The chance of success for your first ack is (2^64-1)/2^64. The chance of success for your first and second ack is the chance of success for your first ack times the chance of success for your second ack, ((2^64-1)/2^64)^2. And so on for your third ack, and your ack.

By the time you reach your 2^32 ack, your chance of having all successes is ((2^64-1)/(2^64))^2^32, which is < 0.5.

EDIT: My test program seems to indicate that I'm wrong, but I can't see the flaw either in it or in my reasoning. Can you explain why the birthday paradox doesn't apply here?

EDIT': In the birthday paradox, it would be ((2^64-1)/2^64)) * ((2^64-2)/2^64) * ((2^64-3)/2^64), etc.


The birthday paradox concerns the chance of 2 random elements in an entire set being the same. There's no set here, so there's no birthday paradox. http://en.wikipedia.org/wiki/Birthday_problem

Hilariously, you got your equation right (which isn't the birthday paradox equation btw) but got your result wrong. http://www.wolframalpha.com/input/?i=%28%282%5E64-1%29%2F%28...

So the chance of success after 2^32 acks is >0.999999999, not <0.5.

It takes about 2^60 acks before there's a significant chance of a mistake. That's a lot of acks, so it will take an insanely long time for a mistake to be made.


The birthday paradox kicks in when you have a set of objects, and a collision between any pair is interesting. Here, only a collision with 0 is interesting.

Suppose that the sequence of values you have after each ack is A, B, C, D, E (five acks total). So long as A, B, C, and D are all nonzero, we're OK. With the birthday paradox, we'd be looking at A=B, A=C, A=D, A=E, B=C, B=D, etc. -- many more combinations.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: