
A conclusive answer to how neural networks work should be both descriptive and prescriptive. It should not only tell you how they work but give you new insights into how to train them or how to fix their errors.

For example, does your theory tell you how to initialize weights? Can it explain how the weights in the NN were derived from specific training samples? If you removed a certain subset of training samples, how would the weights change? If the model makes a mistake, which neurons/layers are responsible? Which weights would have to change, and what training data would need to be added or removed, for the model to learn better weights?

If you can't answer these sorts of questions, you can't really say you know how they work. Much like steam engines before Carnot, or microbiology before Koch's postulates, a theory is often only as good as the degree to which it can be operationalized.
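
To make the first question concrete: one of the few genuinely prescriptive results we do have is variance-preserving initialization. He et al.'s rule for ReLU nets falls straight out of an analysis of how activation variance scales with fan-in. A rough NumPy sketch (the layer widths here are made-up placeholders, not anything canonical):

    import numpy as np

    rng = np.random.default_rng(0)

    def he_init(fan_in, fan_out):
        # He et al. (2015): for ReLU layers, draw weights with variance 2/fan_in
        # so activation variance is roughly preserved from layer to layer.
        return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

    # Hypothetical layer widths, just to show the rule applied per layer.
    sizes = [784, 256, 64, 10]
    weights = [he_init(n_in, n_out) for n_in, n_out in zip(sizes, sizes[1:])]
    for w in weights:
        print(w.shape, round(float(w.std()), 4))  # should track sqrt(2 / fan_in)

That's the kind of "the theory tells you what to do" answer that the other questions mostly still lack.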




There are reasonable attempts at answering those questions. We're not lost in the woods; there's a lot of precedent in research that has brought us closer to understanding these things.


Would you say that there’s a significant enough effort currently to gain an understanding? I work in fintech, and based on the kind and variety of security and compliance controls we have in place, I can’t imagine a world where we are permitted to include generative AI technology anywhere near the transaction hot path without >=1 human in the loop. Unless the value is so enormous that it dwarfs the risk. I can see at least one use case here, especially at the integration layer, which has a very large and boring spend as partners change/modernize and new partners enter the market.


We already understand enough, and have for many years, to know that the Achilles' heel of any system we consider "AI" today is that it is fundamentally a statistical method that cannot be formally verified to act correctly in all cases. Modern-day chatbots will have the hardest time here, since very little constrains their behavior and they are explicitly built to be general-purpose. You can make the case for special, constrained tools with limited variability within defined and appropriate limits, but you can't claim the no-free-lunch theorem has been defeated just because a statistical learning system happens to write text roughly the way an English-speaking human might.

It's my personal opinion that there should never be a decision system based on statistical approximations without a human in the loop, particularly if the consequences can affect lives and livelihoods.


Perhaps part of the description of the problem should also be that while we understand descriptions of how neural networks work, there is a lot about training them that we don’t understand. And by “understanding” I mean we often don’t have a better way of training things than trial and error; what we lack are prescriptive rules. Maybe it works, maybe it doesn’t, maybe it’s suboptimal (it almost certainly is). And a lot of these things are data-dependent.
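
To give a sense of what "trial and error" means in practice: a lot of tuning is just random search over hyperparameters, keeping whatever validates best, because no theory hands you the answer up front. A toy sketch (the search space and the train_and_eval stub are invented for illustration):

    import random

    def train_and_eval(lr, width):
        # Stand-in for a real training run that returns a validation score.
        # In reality this is hours of compute and the mapping is opaque.
        return -abs(lr - 3e-4) * 1000 - abs(width - 256) / 512

    random.seed(0)
    best = None
    for _ in range(20):                        # budget: 20 blind trials
        lr = 10 ** random.uniform(-5, -2)      # log-uniform learning rate
        width = random.choice([64, 128, 256, 512])
        score = train_and_eval(lr, width)
        if best is None or score > best[0]:
            best = (score, lr, width)

    print("best trial (score, lr, width):", best)

Nothing in the loop knows why one setting beats another; it just tries things, which is exactly the point.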



