Don't love it, it's not correct. > what the reward / punishment system really eq...

GaggiX · on April 25, 2023

>There's nothing analogous to a "reward" or "punishment" when neural networks are learning.

Well deep reinforcement learning.

cshimmin · on April 26, 2023

Yeah but even in that case, "reward" is just the thing a NN is trying to predict. The NN itself is not receiving the reward (or any punishment). Instead, it's following gradient signals to improve that estimate of reward, which is then used as a proxy for an optimal policy decision.