I totally agree with you on LLM usage. I have recently switched from JSON to YAML for requests and replies from LLMs (GPT-4 specifically) and I find it much better:
fewer tokens used, more readable when you're inspecting the HTTP requests and responses, and you can parse it on the fly in streaming responses. That last point lets you show incremental visual updates to the user, which matters a lot when the full response takes a minute or more.
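The on-the-fly parsing works because YAML tends to be valid at line boundaries, so you can just re-parse the accumulated buffer after each chunk. Here's a minimal sketch of the idea, assuming PyYAML and a hypothetical list of chunks standing in for a streamed LLM response:

```python
import yaml  # PyYAML

def parse_partial(buffer: str):
    """Try to parse the YAML received so far; return None if it's not yet valid."""
    try:
        return yaml.safe_load(buffer)
    except yaml.YAMLError:
        return None

# Simulated chunks as they might arrive from a streaming LLM response
chunks = ["title: Draft", "\nitems:\n  - first", "\n  - second\n"]

buffer = ""
doc = None
for chunk in chunks:
    buffer += chunk
    parsed = parse_partial(buffer)
    if parsed is not None:
        doc = parsed  # update the UI with whatever is parseable so far

print(doc)  # {'title': 'Draft', 'items': ['first', 'second']}
```

In practice you'd debounce the re-parse (it's O(n) per chunk) and diff against the previous partial document before touching the UI, but the try-parse-else-wait loop is the whole trick.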
I'd be very curious to hear what kind of streaming/preview YAML applications you're building with LLMs. Building a v0.dev kind of thing with streaming updates is on my TODO list.