There really isn't anything to crack open. The models are curves fit to data, and the units of the weights are whatever the units of the data are. So, e.g., if fit to temperature data, then temperature.
If you draw a line through measurement data of one kind, you aren't getting a function of another kind: the function is just a map within that same space.
Why drawing a line around shadows should be a good prediction for future shadows isn't very mysterious, and it is no more or less mysterious whatever the complexity of the object. There isn't anything in the model here which explains why this process works: it works because the light casts shadows in the future the same way it did in the past. If the objects changed, or the light, the whole thing falls over.
Likewise, "generalization" as used in the ML literature is pretty meaningless. It has never hitherto been important that a model 'generalizes' to the same distribution. In science it would be regarded as ridiculous that it could even fail to.
The scientific sense of generalization is concerned with whether the model generalizes across scenarios where the relevant essential properties of the target system generate novel distributions in the measurement domain. I.e., the purpose of generalization is explanation -- not some weird BS about models remembering data. It's a given that we can always replay-with-variation some measurement data. The point is to learn the DGP (the data-generating process).
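To make the distinction concrete, here's a minimal sketch in plain Python. Everything in it is my own illustrative assumption (the `dgp` function, the noise level, the ranges); it isn't taken from any particular paper or library. It fits a straight line to noisy measurements of a quadratic process, then scores the line on fresh data from the same range and on data from a range it never saw.

```python
import random

random.seed(0)

def dgp(x):
    # Hypothetical data-generating process, unknown to the fitter:
    # y = x^2 plus a little measurement noise.
    return x * x + random.gauss(0.0, 0.02)

def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x (closed form).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

def mse(a, b, xs):
    # Mean squared error of the line against the noise-free process.
    return sum((a + b * x - x * x) ** 2 for x in xs) / len(xs)

train_x = [random.uniform(0.0, 1.0) for _ in range(200)]
a, b = fit_line(train_x, [dgp(x) for x in train_x])

same_x = [random.uniform(0.0, 1.0) for _ in range(200)]     # same conditions
shifted_x = [random.uniform(3.0, 4.0) for _ in range(200)]  # conditions changed

mse_same = mse(a, b, same_x)
mse_shifted = mse(a, b, shifted_x)
print(mse_same, mse_shifted)
```

On the same range the error stays small; on the shifted range it is larger by orders of magnitude, because the line only ever described the measurements and not the process generating them.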
No explanatory model can "remember" data, since if it could, it would be unfalsifiable. I.e., any model built purely by fitting to historical cases can never fail to model that data, and hence can never express a theory about its generation.
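A rough sketch of the "remembering" point, again in plain Python with made-up names (`LookupModel` is hypothetical, just a dictionary in disguise). The same memorizing model is fit once to data produced by a simple rule and once to pure noise, and its error on the historical data is measured in both cases.

```python
import random

random.seed(0)

class LookupModel:
    """Hypothetical 'model' that stores every (x, y) pair it is shown."""

    def fit(self, xs, ys):
        self.table = dict(zip(xs, ys))
        return self

    def predict(self, xs):
        return [self.table[x] for x in xs]

def max_abs_error(model, xs, ys):
    # Worst-case error on the data the model was built from.
    return max(abs(p - y) for p, y in zip(model.predict(xs), ys))

xs = list(range(50))
ys_rule = [2.0 * x + 1.0 for x in xs]            # produced by a simple rule
ys_noise = [random.gauss(0.0, 1.0) for _ in xs]  # produced by no rule at all

err_rule = max_abs_error(LookupModel().fit(xs, ys_rule), xs, ys_rule)
err_noise = max_abs_error(LookupModel().fit(xs, ys_noise), xs, ys_noise)
print(err_rule, err_noise)  # both 0.0
```

The error is zero either way, so a perfect score on the historical data can't distinguish "there is a rule here" from "there is nothing here". It can't fail, and so it says nothing about how the data was generated.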