I feel like, while it may be useful to distinguish the terms, a data race is rea...

czrnb · on July 22, 2019

> I'd say data races are race conditions, but not all race conditions are data races.

In practice, I think you are almost right. There are some cases which exhibit data races, while being free of race conditions, but I think the linked post overstates the importance of differentiating between data races leading to and free of race conditions.

> That's really multiple operations > [...] > if you deconstruct the actual operations performed by a single line of code as I did here

It is not a given that the increment operation is to be divided into individual operations.

All of this is very dependent on the language (interpreted vs compiled (and how it's compiled)), how the concurrency is implemented (managed by the kernel vs software threads), as well as the ordering model of the hardware (whether the processor takes the liberty of reordering instructions).

The data races you describe lead to a race condition under the assumption that `a` or a derivative is used in some conditional branching later on. Arguably, this is always the case, otherwise, there would be no point in having this variable in the first place. But this is just one example.

Now for an example of a data race without race conditions which, IMO, is a bit more explicit than the one from the linked post. Imagine the following: we want each of our threads to write a timestamp of its execution into a shared variable, in order to know when the last thread was executed. This situation contains a data race by definition, as we do not know which thread will write its value into the shared variable. However, both values are similar enough for our monitoring needs (e.g. correct), and do not lead to race conditions.

didibus · on July 22, 2019

Hum... okay. Maybe I'm not understanding what is meant by a data race here.

So your example would be:

  function f() {
    lastTime = Time.now();
  }

And now if we had threads T1 and T2 calling f concurrently, you say this would be a data race?

So I'm confused here. There's no bug, so a data race is not a category of defect the way race conditions are? Both your example and the article's example was a data race that is not a bug. Is there an example of a data race that is not a race condition yet leads to a bug?

czrnb · on July 29, 2019

By the article's definition, yes, it would be a data race. With this example, I just wanted to make it more explicit to you what a data race without a race condition could look like.

Truly I'm still not sure why the author had to make a point and write a blog post about the conceptual difference between data races and race conditions.

javert · on July 22, 2019

> I'd say data races are race conditions, but not all race conditions are data races.

This is not the case according to the article. In the article, transfer4 is a data race but not a race condition.

Your example with T1 and T2 is a race condition and a data race.

Your example with a = a + 1 is neither a race condition nor a data race since there is no concurrency.

It's probably possible to argue that the terms should be defined differently to match the formulation you have instead of the author's. After all, definitions are created by people to help our thinking and we can change them or refine them. But I suspect that would be very a difficult argument to make in a compelling way.

didibus · on July 22, 2019

Maybe I didn't understand the article properly then. Their example of a data race is

account.activity = true;

Which two threads could perform concurrently.

Now from my understanding, the issue is that the hardware, OS, runtime and compiler might not guarantee that this assignment is atomic. Thus if two threads were to perform it concurrently, the result would be undefined, in that maybe the value ends up being true, or maybe it ends up corrupted, or maybe it is still false, or it will eventually be true, but for a while, if you read back the value it would still be false, etc.

Edit: Also I don't think you understood my example. When I lay out T1 and T2, I'm talking about two threads concurrently executing a = a + 1;. Which is a data race, by their definition. But I'm saying that the issue with data races are that they can lead to race conditions at the assembly level. Thus a data race is really just a race condition at the assembly or even hardware level.

jacinabox · on July 22, 2019

I'm starting to be convinced that data races don't matter if the variable being raced is in the machine word size.

bshanks · on July 25, 2019

It depends on context. It's true that many CPUs provide atomicity for aligned accesses of the machine word size. However, the C/C++11 standard defines data races to be undefined behavior, so a data race in a C program still matters.