Aren't you just saying a single-threaded human voice interface is better than machine-to-machine or HCI for solving this one problem? You have at least 2FA or 3FA with the phone call: your caller ID looks real, you respond appropriately, and your voice sounds real too. Everyone wants one basic item from the pizza shop, so little interaction is needed. Sure, you could be a prank caller, but you both implicitly agree to conditionally trust each other.