In a multithreaded world, synchronous code that stores its state on the stack and asynchronous code derived from a CPS transform of direct style (which stores its state in continuations) seem equivalent for the purposes of "atomicity": the OS scheduler can reschedule the synchronous thread at any point (and will definitely context-switch if the thread blocks), which is no different from the CPS-transformed case, where a context switch may likewise occur at any time.
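The equivalence is easiest to see side by side. A minimal sketch (the function names and the `add_cps` primitive are invented for illustration, not from any library): the direct-style version keeps its intermediate value on the stack, while the CPS version keeps it in a closure passed as the continuation `k`:

```python
# Direct style: the pending "double it" step lives on the stack,
# in this frame's local state.
def add_then_double(x):
    y = x + 1
    return y * 2

# Hypothetical async primitive: delivers its result by invoking
# the continuation rather than returning.
def add_cps(a, b, k):
    k(a + b)

# The same logic after a manual CPS transform: the pending
# "double it" step now lives in the continuation `after_add`.
def add_then_double_cps(x, k):
    def after_add(y):
        k(y * 2)
    add_cps(x, 1, after_add)

results = []
add_then_double_cps(3, results.append)
# results[0] == 8, identical to add_then_double(3)
```

The transformation is mechanical: every point where the direct-style code would resume after a call becomes a continuation closure, which is why the two forms preserve the same semantics.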
Things like mutex locks do work differently in the CPS case, but only in that they become asynchronous too: blocking on a mutex means handing the continuation to the mutex, to be invoked once the lock can be acquired. Thread-local storage also works differently - the logical thread of control may migrate between physical threads in the CPS async case - but a runtime or compiler (which is aware of the CPS transformation) can abstract this away.
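A minimal sketch of such an asynchronous mutex (the `AsyncMutex` class and its methods are invented for illustration): `acquire` never blocks the caller; if the lock is held, it stores the caller's continuation, and `release` invokes the next stored continuation:

```python
from collections import deque

class AsyncMutex:
    """Asynchronous mutex: waiters hand their continuation to the
    mutex instead of blocking a thread."""
    def __init__(self):
        self.locked = False
        self.waiters = deque()

    def acquire(self, k):
        if self.locked:
            self.waiters.append(k)      # park the continuation
        else:
            self.locked = True
            k()                          # lock acquired immediately

    def release(self):
        if self.waiters:
            self.waiters.popleft()()     # wake next waiter; lock stays held
        else:
            self.locked = False

log = []
m = AsyncMutex()
m.acquire(lambda: log.append("first holder"))
m.acquire(lambda: log.append("second holder"))  # queued until release
m.release()
# log == ["first holder", "second holder"]
```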
I mean that no outside changes will be made to the environment during the execution of the atomic block of code, and that any intermediate state of the environment during the atomic block of code will not be seen by any code outside the block. In this case "the environment" is the process's memory space.
Because I can do so much without having to worry about other threads of execution seeing my internal state, I rarely have to bother with locks at all, even when dealing with globals. I'm not using threads at all; if I need to take advantage of multiple CPUs, I run multiple distinct processes.
What do you mean, "outside changes"? Another thread could come in and make changes inside the same process's memory space - unless you are specifically talking about a single-threaded design.
From the "inside", a thread of execution, and a CPS-transformed chain of continuation calls (originally written in direct style), "feel" exactly the same - precisely because there is a mechanical translation from one to the other with no loss in semantics.
Here's what I suspect you're trying to get at: you're coming from a single-threaded perspective, where Twisted's partially-automated CPS transform is in effect emulating green threads under a cooperative scheduling model, and the "yield" represents yielding control of the current "thread" to potentially other "threads". Your concern about fully automating the transform is that it may introduce further cooperative-threading "yield"s that you cannot control.
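That cooperative model can be sketched with plain generators (the worker and scheduler here are invented for illustration, not Twisted's API): control moves between "threads" only at the explicit yields, so everything between two yields runs atomically with respect to the other workers:

```python
# Cooperative green threads via generators: each `yield` is an
# explicit, programmer-visible scheduling point.
def worker(name, log):
    log.append(f"{name}: step 1")
    yield                       # the only places control can move
    log.append(f"{name}: step 2")
    yield
    log.append(f"{name}: step 3")

def run_round_robin(threads):
    # Resume each "thread" in turn until all have finished.
    while threads:
        t = threads.pop(0)
        try:
            next(t)
            threads.append(t)   # reschedule after its yield
        except StopIteration:
            pass                # this thread is done

log = []
run_round_robin([worker("A", log), worker("B", log)])
# Interleaving is fully determined by the yields:
# A1, B1, A2, B2, A3, B3
```

The point of contention follows directly: each `yield` you write is a contract about where interleaving can happen, and an automated transform that inserts extra suspension points silently weakens that contract.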
My point is that under a preemptively threaded environment, any thread could at any time hijack the CPU and mutate your state - and this is no different whether the code is synchronous or CPS-transformed. A "yield" may occur at any point in your code. Hence, requiring methods to be annotated to permit CPS transformation of preemptively multithreaded direct code is a chore: the extra information in the contract doesn't carry the semantic value it does under the cooperatively-scheduled Twisted approach.
> you're coming from a single-threaded perspective, where Twisted's partially-automated CPS transform is in effect emulating green threads under a cooperative scheduling model, where the "yield" represents yielding control of the current "thread" to potentially other "threads". Your concern over fully automating the transform is because it may introduce further cooperative threading "yield"s that you cannot control.
Yes, that's exactly the case. To scale to multiple CPUs, I use distinct processes, not threads. Making shared memory and preemptive execution play nicely together requires a lot of programming effort that I don't care to expend.
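A minimal sketch of that approach with the standard library's multiprocessing pool (`square` is just a stand-in for real CPU-bound work): each worker is a separate process with its own memory space, so there is no shared mutable state to protect:

```python
from multiprocessing import Pool

# Stand-in for CPU-bound work; real code would do something heavier.
def square(n):
    return n * n

if __name__ == "__main__":
    # Four worker processes, each with its own interpreter and memory
    # space - parallelism across CPUs with no shared state and no locks.
    with Pool(processes=4) as pool:
        print(pool.map(square, range(8)))
```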
The GIL in Python means that threads generally won't let you use more than a single CPU anyway.
> In a multithreaded world, synchronous code storing its state on the stack, and asynchronous code derived from a CPS transform of direct style (thus storing its state in continuations), seem equivalent for the purposes of "atomicity"; since the OS scheduler can come in and reschedule the synchronous thread at any point (and will definitely context switch if the thread blocks), it seems no different to the CPS transformed case (OS may choose to context switch at any time).
> Things like mutex locks do work differently in the CPS case, but it's rather that they are asynchronous too: blocking on a mutex means handing the continuation off to the mutex to continue when it's available to be acquired. And things like thread-local storage work differently - the logical thread of control may change physical threads in the CPS async case, but a runtime or compiler (which will be aware of the CPS transformation) can abstract this away.
So, what do you mean by "atomic"?