This video[1] seems to give some insight into what the process actually is, which I believe is also indicated by the output token cost.
Whereas GPT-4o spits out the first answer that comes to mind, o1 appears to follow a process closer to coming up with an answer, checking whether it meets the requirements, and then revising it. The process of saying to an LLM "are you sure that's right? it looks wrong" and having it come back with "oh yes, of course, here's the right answer" is pretty familiar to most regular users, so seeing it baked into the model is great (and obviously more reflective of self-correcting human thought).
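For anyone who wants the shape of that loop spelled out, here's a minimal sketch of the generate / check / revise pattern I mean. ask_llm is a hypothetical placeholder, not a real API, and o1 presumably does something like this internally with reasoning tokens rather than separate calls; this is just an illustration.

    def ask_llm(prompt: str) -> str:
        """Hypothetical placeholder for a call to a language model."""
        raise NotImplementedError("wire this up to your model of choice")

    def answer_with_self_check(question: str, max_revisions: int = 3) -> str:
        # First attempt: the "first answer that comes to mind".
        answer = ask_llm(question)
        for _ in range(max_revisions):
            # Ask the model to check its own answer against the question.
            critique = ask_llm(
                f"Question: {question}\nAnswer: {answer}\n"
                "Are you sure that's right? If it looks wrong, explain why; "
                "otherwise reply OK."
            )
            if critique.strip().upper().startswith("OK"):
                break  # the answer passed its own check
            # Revise the answer using the critique.
            answer = ask_llm(
                f"Question: {question}\nPrevious answer: {answer}\n"
                f"Critique: {critique}\nGive a corrected answer."
            )
        return answer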
So it's like a coding agent built on GPT-4, but instead of actually running the script and fixing it when it hits an error, this one checks itself with something like "are you sure?". Thanks for the link.
[1] https://vimeo.com/1008704043