
> GPT-3 can't produce images from text prompts

Me: Give me the entire hexadecimal format of an example PNG. Only give me the hexadecimal format.

GPT-3:

89504E470D0A1A0A0000000D49484452000000640000006408020000000065238226000000014944415478DAECFD07780D44204C60F81EADAEF777F7E7E62F1BDE7DEBDED710EC15C7AC81CEEC17069C59B99A1698BEE7A484D68FDE782A7C41A8A0E7D2A2C9B00A99F32FBCED




That's not a valid PNG. It's just plausible-looking hex tokens. GPT-3 is confidently wrong yet again.


$ echo 89504E470D0A1A0A0000000D49484452000000640000006408020000000065238226000000014944415478DAECFD07780D44204C60F81EADAEF777F7E7E62F1BDE7DEBDED710EC15C7AC81CEEC17069C59B99A1698BEE7A484D68FDE782A7C41A8A0E7D2A2C9B00A99F32FBCED |xxd -r -p > output.png

$ file output.png

output.png: PNG image data, 100 x 100, 8-bit/color RGB, non-interlaced

$ eog output.png

Fatal error reading PNG image file: IHDR: CRC error

It seems you are correct.
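
For anyone who wants to see the mismatch without an image viewer, here is a rough Python sketch. It assumes only the standard PNG chunk layout (4-byte big-endian length, 4-byte type, data, then a CRC-32 computed over type plus data). The names `check_first_chunk` and `HEX_PREFIX` are mine, not from any library, and `HEX_PREFIX` is just the first 33 bytes of the hex string above (signature plus the IHDR chunk).

```python
import struct
import zlib

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")

# First 33 bytes of the hex string from the thread:
# 8-byte signature + 4-byte length + 4-byte type + 13 bytes of IHDR data + 4-byte CRC.
HEX_PREFIX = (
    "89504E470D0A1A0A"      # PNG signature
    "0000000D"              # chunk length: 13
    "49484452"              # chunk type: "IHDR"
    "0000006400000064"      # width 100, height 100
    "0802000000"            # bit depth 8, color type 2 (RGB), then three zero bytes
    "00652382"              # CRC as stored in the generated data
)


def check_first_chunk(png_bytes):
    """Return (type, length, stored_crc, computed_crc) for the first chunk."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    length, chunk_type = struct.unpack(">I4s", png_bytes[8:16])
    data_end = 16 + length
    if len(png_bytes) < data_end + 4:
        raise ValueError("truncated chunk")
    data = png_bytes[16:data_end]
    (stored_crc,) = struct.unpack(">I", png_bytes[data_end:data_end + 4])
    # The PNG CRC covers the chunk type and data, but not the length field.
    computed_crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return chunk_type.decode("ascii"), length, stored_crc, computed_crc


if __name__ == "__main__":
    ctype, length, stored, computed = check_first_chunk(bytes.fromhex(HEX_PREFIX))
    print(f"{ctype}: length={length} stored={stored:08X} computed={computed:08X}")
    print("CRC ok" if stored == computed else "CRC mismatch")
```

This is why `file` is happy and `eog` is not: `file` only looks at the signature and the IHDR fields, while a real decoder also checks the CRC.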


Yeah, I did something similar before replying. Now maybe GPT-3 could be modified to make PNGs, but someone would have to go do that.


ChatGPT is the language center of the brain, so to speak. If we are trying to model how we believe we do things ourselves, you'd expect a separate model optimized for images running alongside it, with the two interacting (much like Sydney is GPT interacting with search).

And maybe even multiple instances of each, so that if you give the chatbot a high-level task, it breaks it down into subtasks, as it already can, and then hands each subtask off to a separate instance.



