He's precisely considering your 16-core case. Now think about what happens when your dataset grows and you need more cores than you can reasonably fit in a single machine.
It's not when, it's if, and often you know it won't happen. Being a good engineer requires understanding when k use which model. acting as if there aren't a lot of cases where singel process, shred memory, concurrency is the only good choice for almost all cases is wrong
He's precisely considering your 16-core case. Now think about what happens when your dataset grows and you need more cores than you can reasonably fit in a single machine.