> Bulletproof JSON generation: Jsonformer ensures that the generated JSON is always syntactically correct and conforms to the specified schema.
This is an important definition to take note of: "bulletproof" doesn't mean you'll get good or correct data. It only means the output will be valid JSON conforming to the schema you specify (because the LLM isn't building the JSON in the first place, the library is).
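To make that concrete, here's a minimal conceptual sketch of the idea, not Jsonformer's actual code: `generate_value` is a hypothetical stand-in for the constrained token sampling the library does. The point is that the library walks the schema and emits every brace, quote, and comma itself, only asking the model for leaf values, so the result can't help but be well-formed JSON in the right shape.

```python
import json

def generate_value(prompt: str, value_type: str) -> object:
    """Hypothetical stand-in for the model call; in the real library this
    is constrained token sampling, here it just returns a placeholder."""
    return {"string": "example", "number": 0}[value_type]

def fill_schema(schema: dict, prompt: str) -> dict:
    """Walk the schema, emitting the JSON structure ourselves and asking
    the model only for the individual values."""
    result = {}
    for key, spec in schema["properties"].items():
        if spec["type"] == "object":
            result[key] = fill_schema(spec, prompt)
        else:
            result[key] = generate_value(f"{prompt}\n{key}:", spec["type"])
    return result

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
}

print(json.dumps(fill_schema(schema, "Describe a person.")))
# Always valid JSON in the requested shape, whatever the model "says".
```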
It's an interesting idea. But it's not clear whether they've validated the heuristics they use, to see how accurate it is compared with, say, a BeautifulSoup-style approach that takes the JSON-ish text the LLM produces and repairs it into valid JSON, or with any other approach to the problem.
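The kind of post-hoc repair I mean would look something like this deliberately naive sketch (a real comparison would want a far more tolerant parser):

```python
import json
import re

def repair_jsonish(text: str) -> dict:
    """Naive post-hoc repair of JSON-ish model output: keep the outermost
    object, drop trailing commas, and normalise single quotes."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found")
    candidate = text[start:end + 1]
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)  # trailing commas
    candidate = candidate.replace("'", '"')               # single quotes
    return json.loads(candidate)

print(repair_jsonish("Sure! Here you go: {'name': 'Ada', 'age': 36,}"))
# {'name': 'Ada', 'age': 36}
```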
I wonder if LLMs are at the point where reprompting them with roughly the same error message a user-friendly JSON processing tool would show a human would usually be a good way to fix errors.
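Roughly a retry loop like this sketch, where `call_llm` is a placeholder for whichever model API you're actually using:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever chat/completion API you use."""
    raise NotImplementedError

def generate_json(task: str, max_retries: int = 3) -> dict:
    prompt = task
    for _ in range(max_retries):
        reply = call_llm(prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as err:
            # Feed back roughly the message a friendly JSON tool would
            # show a human, along with the offending output, and retry.
            prompt = (
                f"{task}\n\nYour previous reply was not valid JSON.\n"
                f"Error: {err.msg} at line {err.lineno}, column {err.colno}.\n"
                f"Previous reply:\n{reply}\n\n"
                "Please reply with only the corrected JSON."
            )
    raise ValueError("model never produced valid JSON")
```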
Sometimes, but it very much depends on the context (no pun intended). If it's a pure syntax issue, OpenAI models will almost certainly make the right correction. If it's something more abstract, like the LLM having hallucinated a property that is invalid within some larger schema, you can quickly descend into the LLM gaslighting you, insisting it has fixed things when it hasn't.
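What helps with that failure mode is checking against the schema yourself rather than taking the model's word for it. A rough sketch using the jsonschema library (the schema and data here are made up for illustration): with `additionalProperties` set to false, a hallucinated property surfaces as an explicit validation error you can reprompt with, or reject outright.

```python
from jsonschema import Draft7Validator

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,  # flag properties the model invented
}

data = {"name": "Ada", "age": 36, "favourite_color": "blue"}  # hallucinated key

for err in Draft7Validator(schema).iter_errors(data):
    print(err.message)
# e.g. "Additional properties are not allowed ('favourite_color' was unexpected)"
```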