My interpretation of the parent post is not that an LLM's output should be checked by humans, or that LLMs should only be used in domains where physical verification is expensive; rather, they're suggesting a secondary, non-stochastic verification system that checks the LLM's results and acts as a source of truth.
An example that exists today would be the combination of ChatGPT and Wolfram [1], in which ChatGPT can provide the method and Wolfram can provide the execution. This approach can be used with other systems for other domains, and we've only just started scratching the surface.
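To make the pattern concrete, here's a minimal sketch (in Python, with hypothetical names like `ask_llm` and `verify` that aren't from the thread): the stochastic model proposes an answer, and a deterministic checker, standing in for something like Wolfram, decides whether to trust it.

```python
def ask_llm(question: str) -> str:
    """Hypothetical stand-in for a chat-model call; may be wrong or hallucinated."""
    return "42"  # imagine a generated candidate answer

def verify(question: str, candidate: str) -> bool:
    """Deterministic check, playing the role of the CAS / external tool."""
    # For this toy question the ground truth is exactly computable.
    return int(candidate) == 6 * 7

def answer_with_verification(question: str, retries: int = 3) -> str | None:
    for _ in range(retries):
        candidate = ask_llm(question)
        if verify(question, candidate):
            return candidate  # only verified answers leave the system
    return None  # refuse rather than return an unchecked guess

print(answer_with_verification("What is 6 * 7?"))  # -> 42
```

The key design point is that the verifier, not the LLM, has the final word: an unverifiable answer is rejected rather than passed along.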
Yes, your interpretation is correct. I think the killer app here is mathematical proof. You often need intuition and creativity to come up with a proof, and I expect AI to become really good at that. Checking the proof, on the other hand, is completely mechanical and can be done reliably by a machine.
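This already exists for formalized proofs. A tiny illustrative example in Lean 4 (not from the thread): whoever or whatever writes the proof script, the kernel re-checks every step, so a hallucinated proof is simply rejected rather than silently accepted.

```lean
-- A purely computational fact, checked by evaluation.
theorem two_plus_two : 2 + 2 = 4 := rfl

-- A general statement: the `by` block is the "creative" part,
-- the checking of it is mechanical.
theorem zero_add' (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```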
Once we have AIs running around with the creativity of artists and the precision of logicians... well, time to read some Iain M. Banks novels.
[1] https://www.wolfram.com/wolfram-plugin-chatgpt/