You're right that it exists, but it's complete crap outside a quiet environment. Try to use it while walking around outside or in any semi-noisy area and it fails horribly (iPhone 13, so YMMV if you have a newer one).
You cannot use an iPhone as a dictation device without reviewing the transcribed text, which IMO defeats the purpose of dictation.
Meanwhile, I've gotten excellent results on the iPhone from a Whisper->LLM pipeline.
I've never found real-time dictation software that doesn't need to be reviewed.
I'm definitely waiting for Apple to upgrade their dictation software to the next generation -- I have my own annoyances with it -- but I haven't found anything else that works way better, in real time, on a phone, that runs in the background (like as part of the keyboard).
You talk about Whisper but that doesn't even work in real time, much less when you have to run it through an LLM.
What's the real-time requirement for? We may have different use cases, but it's not needed if I don't need to review the results. Speak -> Send, without reviewing the text, is the desired workflow. I.e. so you can compose messages without looking at your phone.
So yes, I'm not sure of alternate real-time solutions, but the non-real-time solution of Whisper is much better for my real-world use case.
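For anyone curious what the Speak -> Send flow looks like in practice, here's a rough sketch. It's not my actual setup: `speak_and_send` and its three hooks (`transcribe`, `clean_up`, `send`) are hypothetical names, and you'd plug in whatever Whisper build, LLM prompt, and messaging call you actually use. The point is only that the whole recording is processed in one batch after you stop talking, so nothing needs to be real-time.

```python
from typing import Callable


def speak_and_send(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # hypothetical hook, e.g. a Whisper model
    clean_up: Callable[[str], str],      # hypothetical hook, e.g. an LLM cleanup prompt
    send: Callable[[str], None],         # hypothetical hook, e.g. a messaging API call
) -> str:
    """Batch dictation: transcribe the full recording, clean it up, send it.

    There is deliberately no review step between clean_up and send.
    """
    raw_text = transcribe(audio)   # speech -> rough text, run once on the whole clip
    message = clean_up(raw_text)   # fix punctuation and obvious mishearings
    send(message)                  # deliver without showing the text to the user
    return message
```

The tradeoff is latency: you wait a bit after you finish speaking, but you never have to look at the screen.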