Hacker News new | past | comments | ask | show | jobs | submit login

It's interesting, and a bit concerning, that it's so hard to control LLMs from doing things you don't want it to do. Sure, I don't like LLMs censoring stuff. But if I were to build a product using LLMs (aka not a chat service), I'd like to have full control of what it can potentially output. The fact that there is no "prepared statements" or distinction between prompts and injected data makes that hard.



It is concerning, but I am not sure whether it is more concerning than that it's so hard to write a web browser that doesn't execute arbitrary code. Security is like that, and security is especially hard when the system is featureful like web browsers and LLMs.


The issue is that with LLMs it's fundamentally impossible to have a "prepared statement" (the database query concept), whereas a web browser has no problem in principle being a safe sandbox. With LLMs, we have no idea how to make them safe even in principle. This has nothing to do with "security is hard" hand-waving.


I'm excited to share that this is already supported, and I highly recommend leveraging it for safer application deployments. https://platform.openai.com/docs/guides/function-calling


> hard to write a web browser that doesn't execute arbitrary code

It would be easy if only we could define what “code” and “execute” means. The problem is, we can’t. Data is code and code is data. Doing things depending on data is fundamentally the same as executing code.


I reckon this might push app developers to use LLMs locally in the client.

So that even a maliciously behaving LLM can’t cause much damage.


I mean in my mind, the partial point of llm is that you don't control the output. You control the input.

Wanting an generative AI and wanting to cover what it says is like having your cake and eating it too


You want to control certain aspects of the output, and only leave the rest up to the GAI. The issue is that AI models don’t have a reliable mechanism for doing so.


That's not a fundamental limitation of the models, even if it's present in the products running on those models — if you want to populate a database from an LLM, you can constrain the output at each step to be only from the subset of tokens which would be valid at that point.



You control the output during training so no.

And even for humans, we have mechanisms to control their output when they get confused.


> And even for humans, we have mechanisms to control their output when they get confused.

What mechanisms do you mean? I don’t think it’s feasible to use hunger and fear of dismissal to control an instance of an LLM.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: