The pain points of building a copilot (austinhenley.com)
60 points by azhenley 11 months ago | 9 comments



In my experience, using TypeScript for prompt management alongside something like zod or TypeBox is super useful for getting rid of the stringly-typed prompt problem. Write a function that wraps template strings for the English bits, and inside the templates for your examples do:

    JSON.stringify({ ... } satisfies MyOutputType)
which ensures your examples always exactly match what you expect the LLM to return, even as you iterate on the prompt and output types. Then just call the LLM and validate the output using zod/TypeBox. Once the wrapper function is written, instead of managing a bunch of string interpolation whenever you want the LLM to analyze something, you have a function that takes some input data and returns your output type, guaranteed.
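
A minimal sketch of that pattern, assuming zod and the official openai client (the schema, model name, and prompt are just illustrative):

    import OpenAI from "openai";
    import { z } from "zod";

    // The output type lives in one place: the zod schema.
    const Sentiment = z.object({
      label: z.enum(["positive", "negative", "neutral"]),
      confidence: z.number().min(0).max(1),
    });
    type Sentiment = z.infer<typeof Sentiment>;

    const openai = new OpenAI();

    // The few-shot example is type-checked against the schema's inferred
    // type, so it can't drift out of sync with what we validate later.
    const example = JSON.stringify({
      label: "positive",
      confidence: 0.9,
    } satisfies Sentiment);

    async function analyzeSentiment(text: string): Promise<Sentiment> {
      const prompt = [
        "Classify the sentiment of the text and reply with JSON only.",
        `Example output: ${example}`,
        `Text: ${text}`,
      ].join("\n");

      const res = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: prompt }],
      });

      // Validate (and narrow) the raw output; throws if it doesn't match.
      return Sentiment.parse(JSON.parse(res.choices[0].message.content ?? ""));
    }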

With OpenAI tool calling, you can even convert the zod/TypeBox type into JSON Schema, give that to OpenAI, and have a fairly good chance the LLM doesn't make any mistakes without even needing retries. (Although you should still retry if the zod/TypeBox/whatever validator fails.)
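
Roughly what the tool-calling variant might look like, reusing the Sentiment schema and client from the sketch above (zod-to-json-schema is one of a few converters; the tool name is made up):

    import { zodToJsonSchema } from "zod-to-json-schema";

    // Hand the schema to the model as a tool definition so it is steered
    // toward producing matching arguments.
    async function analyzeSentimentViaTool(text: string): Promise<Sentiment> {
      const res = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: `Classify the sentiment of: ${text}` }],
        tools: [{
          type: "function",
          function: {
            name: "report_sentiment",
            description: "Report the sentiment of the given text",
            parameters: zodToJsonSchema(Sentiment) as Record<string, unknown>,
          },
        }],
        tool_choice: { type: "function", function: { name: "report_sentiment" } },
      });

      const call = res.choices[0].message.tool_calls?.[0];
      // Still validate: tool calling reduces errors but doesn't eliminate
      // them, so retry here if parsing fails.
      return Sentiment.parse(JSON.parse(call?.function.arguments ?? "{}"));
    }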


If you're using llama.cpp, there is also a tool to convert TypeScript interfaces to the GBNF grammar used to restrict the output tokens: https://grammar.intrinsiclabs.ai/


The paper is full of insightful parts. For instance, one challenge with LLMs is parsing the model output. The participants suggest that instead of forcing the model to return JSON, it should be allowed to return results in its preferred format, which can then be parsed into a structured form. Quote: "if the model is kind of inherently predisposed to respond with a certain type of data, we don’t try to force it to give us something else because that seems to yield a higher error rate"
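
To make that concrete, here's a toy sketch (mine, not from the paper) of accepting a list-style answer and parsing it afterwards, rather than demanding JSON up front:

    // Accept the model's natural answer format (here, a markdown-style
    // list) and turn it into structured data after the fact.
    function parseListAnswer(raw: string): string[] {
      return raw
        .split("\n")
        .map((line) => line.trim())
        // Keep lines that look like "- item", "* item", or "1. item".
        .filter((line) => /^(?:[-*]|\d+\.)\s+/.test(line))
        .map((line) => line.replace(/^(?:[-*]|\d+\.)\s+/, ""));
    }

    // parseListAnswer("Sure! The issues are:\n1. Missing null check\n2. Unused import")
    // => ["Missing null check", "Unused import"]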


Building a copilot the first time will be a mess; the second time it's much clearer. This is a fun time to be working with local models. Thinking in layers and abstracting isn't that hard with the right prompt about the specific pain points... Every problem seems to have been solved before.


Great read; their full paper is even more interesting: https://arxiv.org/abs/2312.14231


Fantastic write-up. Thanks for sharing.

Shameless plug: I work on CopilotKit, open-source copilot building blocks for React apps, designed to alleviate exactly the pain points in the article. Devs define simple Copilot entrypoints: state (frontend + backend + 3rd party), actions, purpose-specific LLM chains, etc., and the CopilotKit engine takes care of the rest.

https://github.com/CopilotKit/CopilotKit


Yeah, your "Show HN: CopilotKit - Build in-app AI chatbots and AI-powered textareas" post has some good comments:

https://news.ycombinator.com/item?id=38545207


I have not had a chance to try it yet, but Microsoft's Copilot Studio seems to address the "pain points" listed in the article. Can anyone here attest to it?

https://www.microsoft.com/en-us/microsoft-copilot/microsoft-...


I bet the prompt for the (doubtlessly AI-generated) image above the article was "too many tinkerers spoil the robot". Also, wearing glasses seems to be part of the job requirements...



