
Pretty much all models, including today's models, already fall foul of the "Hazardous capability" clause. These models can be used to craft persuasive emails or blog posts, analyse code for security problems, and so forth. Whether such a thing is done as part of a process that leads to lots of damage depends on the context, not on the model.

So in practice, only the FLOPs criterion matters. Which means only giant companies with well-funded legal departments, or large states, can build these models, increasing centralization and control, and making full model access a scarce resource worth fighting over.




Really I feel the opposite way: none of today's models, or anything foreseeable, meets the hazardous capability criteria. Some may be able to provide automation, but I don't see any concrete examples where there's an actual step change in what's possible due to LLMs. The problem is it's all in the interpretation. I imagine some people will consider a 7B model "dangerous" because it can give a bullet-point list of how to make a bomb (step 1: research explosives) or write a phishing email that sounds like a person wrote it. In reality the bar should be a lot higher: uniquely making something possible that wouldn't otherwise be, with concrete examples of it working or being reasonably likely to work, not just the spectre of targeted emails.

I've actually been thinking there should be a bounty for identifying a real hazardous use of AI. The problem would be defining "hazardous" (which would hopefully itself spur conversation). On one end I imagine trivial "hazards" like what we test models with today (like asking how to build a bomb), and on the other it's easy to see a shifting-goalposts situation where we keep finding reasons why something that technically meets the hazard criteria isn't really hazardous.



