Apparently OpenAI has some excellent developer relations and marketing people too. Is this guy even a programmer at all? His bio says "WSJ best selling novelist, Edgar & Thriller Award finalist, star of Shark Week, A&E Don’t Trust Andrew Mayne, creative applications and Science Communicator at OpenAI." so maybe not? This blog seems to have useful OpenAI related information, it's odd that it's on this guy's personal blog instead of the OpenAI website.
This morning I feel oddly compelled to play the fool so here are some near/medium term thoughts on where this may be going (worth less than what you paid for them):
1. The most important ChatGPT plugin is going to end up being the one that invokes itself recursively. The autoregressive approach seems to severely limit what these models can do by limiting their ability to think without speaking. A few months ago I thought the obvious fix was to train the model to emit special "bracket" tokens that the driver would delete once the inner thought completed, leaving only a "result" section, but GPT-as-a-GPT-plugin effectively does the same thing.
2. Whilst the first big win from the plugin will be "sub-thoughts", the next will be training it to dispatch multiple sub-thoughts in parallel. GPT already knows how to break a complex problem down into steps, but is still constrained by context window size and inference speed. Once it is taught how to split a problem up so that multiple independent inference sessions can work on it in parallel, it'll become feasible to make requests like "Build me a video game from scratch using Unreal Engine, set in the world of Harry Potter, about the adventures of a character named X", and it'll end up dispatching a massive tree of GPT sub-instances that work on the independent parts: character generation, writing the Unreal C++, prompting Midjourney and so on.
Parallel recursive LLMs are going to be much more awesome than current LLMs, and I mean that in both senses of the word (cool, awe-inspiring). In particular, this will allow us to pose questions like "How can we cure cancer?".
3. OpenAI needs a desktop app, pronto. Whilst the cloud model can take you some way, the most valuable data is locked behind authentication screens. The cloud approach faces difficult institutional barriers, because data access inside organizations is oriented around granting permissions to individuals, even when they work in teams. Giving a superbrain superuser access doesn't fit well with that, because there's no robust way to stop the AI immediately blabbing business secrets or PII to whoever tickles it in the right way. That's one reason the current wave of AI startups is focused on open source technical docs and the like. If ChatGPT is given tool access via a desktop app running on the end user's computer, it can access data using the same authentication tokens issued to individuals. This also neatly solves the question of who is accountable for mistakes: it's the user who runs the app.
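The "bracket token" driver from point 1 can be sketched in a few lines. This is a toy, not any real OpenAI API: the `<<think: ...>>` bracket syntax, the `run` driver, and the `toy_model` function are all invented stand-ins for a model that delegates sub-thoughts to itself.

```python
# Minimal sketch of recursive self-invocation: the driver expands each
# "bracket" the model emits into an inner inference session, then splices
# only the inner result back into the transcript.
import re

SUB_THOUGHT = re.compile(r"<<think:\s*(.*?)>>")

def run(prompt, model, depth=0, max_depth=3):
    """Call `model` on `prompt`, recursively expanding any sub-thought
    brackets it emits, so the final output contains only the results."""
    output = model(prompt)
    if depth >= max_depth:
        return SUB_THOUGHT.sub("", output)  # recursion limit: drop brackets

    def expand(match):
        # Each bracket becomes its own inner inference session.
        return run(match.group(1), model, depth + 1, max_depth)

    return SUB_THOUGHT.sub(expand, output)

# Toy "model" that delegates arithmetic to a sub-thought.
def toy_model(prompt):
    if prompt == "What is 2+3, doubled?":
        return "The answer is <<think: add 2 and 3>> doubled."
    if prompt == "add 2 and 3":
        return "5"
    return prompt

print(run("What is 2+3, doubled?", toy_model))
# prints: The answer is 5 doubled.
```

The key property is that the intermediate "thinking" never reaches the user: only the spliced-in results survive, which is exactly what the deleted-bracket scheme was meant to achieve.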
For some reason I've never seen the idea of auto-recursive prompting in any of the papers or discussions. It makes so much sense. It can also help with model and compute size: instead of using the large model to, say, count the primes less than 1000, it can prompt GPT-3 to do it, then send the result back to GPT-4. Sounds quite feasible to implement too!
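The routing idea above might look something like this. The dispatch table is hypothetical (in practice the cheap worker would be a call to a smaller model or a tool, not a local function), but the prime-counting subtask itself is real and cheap:

```python
# Sketch: a "big" model hands mechanical subtasks to cheap workers and only
# aggregates results, instead of burning expensive inference on them.

def primes_below(n):
    """Sieve of Eratosthenes - the kind of mechanical subtask worth
    offloading from an expensive model to a cheap worker."""
    sieve = [True] * n
    sieve[:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

# Hypothetical dispatch table: subtask name -> cheap worker.
WORKERS = {"count_primes_below": lambda n: len(primes_below(n))}

def dispatch(task, arg):
    # The large model would emit (task, arg); the driver routes it here
    # and returns only the compact result for the big model's context.
    return WORKERS[task](arg)

print(dispatch("count_primes_below", 1000))  # prints 168
```

Note the big model's context window only ever sees "168", not the thousand-element intermediate list, which is the compute and context saving being described.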
Original author here. I'm a programmer. I started on the Applied team at OpenAI back in 2020 as a prompt engineer (I helped create many of the examples in the GPT-3 docs.) I became the Science Communicator for OpenAI in 2021.
My blog audience is very non-technical so I write very broadly. We've been super busy with the launch of GPT-4 and Plugins (I produced the video content, found examples, briefed media on technical details, etc.) so I was only able to grab a few hours to put these demos together.
As far as the ChatGPT prompts go, I included a few, but they're just simple instructions. Unlike GPT-3.5, where I'd spend an hour or more getting the right instruction to do zero-shot app creation, GPT-4 just gets it.
Wow, you learned programming specifically to work with AI? That is an inspiring level of flexibility in skills and self-identification. Perhaps many of us will need to learn how to do that sort of reinvention sooner, rather than later.
Not desktop at all. I'm focused on it operating its own computing resources using the recursive approach. I call it the multi-agent LLM approach. This way it can break down a complex task into components and attack each component in parallel or sequentially as it needs.
I'm not a researcher at all, but a practitioner with extensive quantitative development experience in an applied industry setting using ML tools.
I’ve been thinking that taking this up a level is more a systems architecture problem. The core LLM model is so incredibly flexible and powerful that what I’m working on is the meta application of that tool and giving it the ability to use itself to solve complex problems in layers.
Hopefully that makes sense. I already have a fairly extensive and detailed systems architecture design.
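The parallel-or-sequential breakdown described above could be sketched as a staged plan runner. Everything here is illustrative: the agents are plain functions standing in for LLM calls, and the stage/plan structure is one guess at such an architecture, not the poster's actual design.

```python
# Toy multi-agent plan runner: stages run sequentially, but the components
# inside a stage are independent and run in parallel.
from concurrent.futures import ThreadPoolExecutor

def run_plan(plan, agents):
    """plan: a list of stages; each stage lists component names that can run
    concurrently. Each stage sees the results gathered so far."""
    results = {}
    with ThreadPoolExecutor() as pool:
        for stage in plan:
            futures = {name: pool.submit(agents[name], dict(results))
                       for name in stage}
            for name, fut in futures.items():
                results[name] = fut.result()
    return results

# Hypothetical decomposition of a large task (names invented for illustration).
agents = {
    "design":     lambda ctx: "world outline",
    "characters": lambda ctx: f"cast based on {ctx['design']}",
    "code":       lambda ctx: f"engine code for {ctx['design']}",
    "assemble":   lambda ctx: f"game = {ctx['characters']} + {ctx['code']}",
}
plan = [["design"], ["characters", "code"], ["assemble"]]
print(run_plan(plan, agents)["assemble"])
```

The dependency structure is the interesting part: "characters" and "code" only depend on "design", so they run in parallel, while "assemble" waits for both.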
4. Sandbox engineering is the new black.