Or does SMT make sense because inspecting the incoming instruction stream and using branch prediction to execute some of it speculatively can only go so far, and sometimes a hint from the application that "hey, this can be run independently of that" helps overall throughput?
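
To make the question concrete, here is a toy model. It is a deliberate simplification and not how any real core schedules work: every instruction takes one cycle, the core issues up to `width` ready instructions per cycle, and dependencies exist only within a thread. The function name `cycles_to_finish` and the data layout are made up for this sketch.

```python
def cycles_to_finish(threads, width):
    """Count cycles for a toy core that issues up to `width` ready instructions per cycle.

    threads: list of threads. Each thread is a list of instructions, and each
    instruction is a list of indices of earlier instructions in the same
    thread that it depends on. Every instruction has a one-cycle latency.
    (Hypothetical model for illustration only.)
    """
    done = [set() for _ in threads]
    total = sum(len(thread) for thread in threads)
    finished = 0
    cycle = 0
    while finished < total:
        cycle += 1
        issued = []
        for tid, thread in enumerate(threads):
            for idx, deps in enumerate(thread):
                if len(issued) == width:
                    break  # all issue slots are used this cycle
                ready = all(d in done[tid] for d in deps)
                if idx not in done[tid] and ready:
                    issued.append((tid, idx))
        # Results become visible only at the end of the cycle.
        for tid, idx in issued:
            done[tid].add(idx)
        finished += len(issued)
    return cycle


# One thread where every instruction depends on the previous one.
serial_chain = [[]] + [[i - 1] for i in range(1, 8)]
# One thread where no instruction depends on any other.
independent = [[] for _ in range(8)]

print(cycles_to_finish([serial_chain], width=2))
print(cycles_to_finish([independent], width=2))
print(cycles_to_finish([serial_chain, serial_chain], width=2))
```

In this model the single serial chain needs 8 cycles on the 2-wide core because there is nothing independent for the hardware to find, while the 8 independent instructions finish in 4. Two serial chains supplied as separate threads also finish in 8 cycles, so twice the work completes in the same time. That is the sense in which a second thread acts as the "this is independent of that" hint.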