Text or voice. For voice you need another model. But I watched the demos and I can't wait for my invite. It's even better than GPT-3 because this time there is a direct application of the model.
I was surprised by how OpenAI sees it: a model that learns code as recipes for solving problems. Code is much more exact than natural language, and the mix of both is the main advantage.
I think voice would have too high an error rate, because the errors compound: the request only works if both the voice recognition and Codex get it right, so you are multiplying their success rates. However, Codex/GPT-3 could generate intents and that would be quite cool.
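
To put rough numbers on the compounding: strictly speaking it's the success rates that multiply, so the combined error is 1 - (1 - e_voice) * (1 - e_codex), assuming the two stages fail independently. A quick sketch in Python; the helper name is made up and the 10% and 30% rates are purely illustrative, not measured figures for any real system:

```python
def pipeline_error_rate(voice_error: float, codex_error: float) -> float:
    """Error rate of a two-stage pipeline (voice recognition, then Codex).

    Assumes the two stages fail independently, which is a simplification.
    """
    for rate in (voice_error, codex_error):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("error rates must be between 0 and 1")
    # The pipeline succeeds only if both stages succeed.
    success = (1.0 - voice_error) * (1.0 - codex_error)
    return 1.0 - success


# Illustrative numbers only: 10% voice recognition error, 30% Codex error.
print(round(pipeline_error_rate(0.10, 0.30), 2))  # prints 0.37
```

So even a fairly good recognizer pushes the overall failure rate noticeably above what Codex alone would give, which is why going through intents looks more attractive to me.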