This is really everybody else's work. All I did was some mentoring of Emanuele, who is also the first presenter in the linked video.
We didn't use formal methods, but some small parts of the async runtime were validated with spin. The Promela sources are in the QEMU source repository.
In the end, the locking changes are relatively low tech. The verification part are where the magic happens, and TSA plus my call graph analysis tool vrc are enough for that.
Question about TSA: did you consider combining it with a formal specification of the locking model e.g. with Alloy? Or other formal method.