
it's an order-dependent logic error caused by lack of synchronization

what would you call it if not a race?




If the order is deterministic, there is nothing "order-dependent", other than in the sense that every calculation is order-dependent, in that any calculation might give you different results if you changed the current deterministic order of operations to a different order of operations.

For the same reason, there is also no lack of synchronization. If the result is wrong, it's simply a wrong calculation/a logic error, and if that is because the order of operations is wrong, then that is because the order of operations is wrong, not because of some race that doesn't happen.


If you have a Hyperthreading CPU, the order of operations on a given code block can change from sequential to parallel depending on the availability of specific ALUs. Often the code appears correct until something external causes either more or less hyperthreading to occur, which exposes dormant bugs.


... in which case the ordering of operations is not deterministic, so it's not relevant to this thread of the discussion?


in my example I'm launching two compare-and-swap operations (op1 and op2) at the same time.

The correct order (in the presence of synchronization) would be [op1_read, op1_write, op2_read, op2_write].

The incorrect order (without synchronization) is [op1_read, op2_read, op1_write, op2_write].

In the second trace, op2_write will overwrite op1_write with the data from its stale read. The correctness issue here isn't dependent on nondeterminism, just on lack of synchronization.
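To make the two traces concrete, here is a minimal sketch that replays each one against a single shared value. It assumes each operation is a plain read followed by a write of "value read + 1"; the comment does not say what is actually written, and `replay` is a hypothetical helper, not anything from a real library.

```python
def replay(trace):
    """Replay a fixed sequence of (op, step) pairs against one shared value."""
    shared = 0
    seen = {}  # the value each op observed in its read step
    for op, step in trace:
        if step == "read":
            seen[op] = shared
        else:
            # "write": store the value this op read earlier, plus one
            shared = seen[op] + 1
    return shared


# Trace 1: op1 finishes before op2 starts.
serialized = [("op1", "read"), ("op1", "write"), ("op2", "read"), ("op2", "write")]
# Trace 2: both reads happen before either write.
interleaved = [("op1", "read"), ("op2", "read"), ("op1", "write"), ("op2", "write")]

print(replay(serialized))   # 2
print(replay(interleaved))  # 1 (op2 writes from its stale read)
```

Both traces are fixed lists, so each replay gives the same answer on every run; the second one is simply wrong every time.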


> In the second trace, op2_write will overwrite op1_write with the data from its stale read. The correctness issue here isn't dependent on nondeterminism, just on lack of synchronization.

No, it's simply a wrong order. When the order of operations is deterministic, that means, by definition, that they are synchronized. The fact that you can change the order by inserting or replacing instructions does not mean that the code lacked synchronization, and it is irrelevant that you could use those same instructions in a different context to synchronize operations. In this context, operations are already synchronized, so nothing you could do can possibly "synchronize them more"; you can only reorder them (or desynchronize them, potentially, by introducing non-determinism in the order).

Synchronization is what you use to restrict the ordering of operations. When there is only one possible ordering of operations (i.e., the order is deterministic), there is nothing to restrict. When the order created by some synchronization is the wrong order for correctness of the software, that doesn't mean that the code is lacking synchronization; it simply means that it orders operations wrongly.


simple case of this is an increment operation

the programmer has written an incorrect function that does a SQL read, parses a value and then does an update. It doesn't use a transaction (i.e. no synchronization) so this is unsafe to use in parallel.

and someone else uses their function without reading carefully, spinning off two invocations and awaiting them both

I suspect you'll agree this is incorrect in that the value will be incremented once instead of twice.

If you don't call this a race condition, what do you call it?
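A rough sketch of that increment scenario, assuming Python's `asyncio` and an in-memory dict standing in for the SQL table. The names `db`, `read_value`, `write_value` and `increment` are all hypothetical; no real database client is involved, and `asyncio.sleep(0)` is only a placeholder for the round trip to the database.

```python
import asyncio

# Hypothetical in-memory stand-in for the SQL table (not a real database).
db = {"counter": "0"}


async def read_value():
    await asyncio.sleep(0)  # yield to the event loop, like waiting on a query
    return db["counter"]


async def write_value(text):
    await asyncio.sleep(0)  # yield again before the update lands
    db["counter"] = text


async def increment():
    # Read, parse, update, with no transaction around the three steps.
    current = int(await read_value())
    await write_value(str(current + 1))


async def main():
    db["counter"] = "0"
    # Two invocations spun off together and awaited together.
    await asyncio.gather(increment(), increment())
    return db["counter"]


result = asyncio.run(main())
print(result)  # "1": both invocations read "0" before either one writes
```

On asyncio's single-threaded event loop the two tasks interleave the same way on each run, so the counter ends at 1 rather than 2 every time.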


Your original scenario was one with deterministic execution order, this one as far as I can tell is not ... so, how is this example relevant to the discussion?


if you make some assumptions about how long IO takes and how the systems order IO, this overlapping-increment example will have a deterministic order and still be wrong every time (increment once instead of twice).

The assumptions are that op1 and op2 are started in order and that your database replies in the order received.


> if you make some assumptions about how long IO takes and how the systems order IO, this overlapping-increment example will have a deterministic order and still be wrong every time (increment once instead of twice).

So? Is every piece of code that gives the wrong result a race condition?



