On the arguments that model inference as simply some function f: the specific expression OP used overlooks that each subsequent application would have followed some backpropagation, which implies a new function f' at each application, rendering the claim invalid.
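A minimal sketch of that point, using a hypothetical one-parameter toy model (not any particular LLM): a single gradient step between applications means the second call is made with a different function f', so reasoning about repeated application of one fixed f breaks down.

```python
# Toy model f(x) = w * x. One SGD step on a squared-error loss
# between applications changes w, so the second application uses
# a different function f' even though the code calls the same f.

def f(x, w):
    return w * x

w = 2.0
x = 1.5
y1 = f(x, w)                   # first application, original weight

# one gradient step toward an (assumed) target t
t, lr = 4.0, 0.1
grad = 2 * (f(x, w) - t) * x   # d/dw of (w*x - t)^2
w = w - lr * grad              # the weight has moved: this is f', not f

y2 = f(x, w)                   # same input, different output
print(y1, y2)
```

Same input both times, yet the outputs differ, which is the f-versus-f' distinction in one line of arithmetic.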
At that point, chaos theory at the very least is at play across the population of natural language, if not some truth that has been expressed but not yet considered.

This also invalidates the subsequent claim about the functions that are convolved; I think all those GPUs might have something to say about whether the bits changing the layers are random or correlated.
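For what it's worth, the "random or correlated" question is checkable on a toy scale. A hedged sketch under assumed conditions (a two-parameter linear model, synthetic data from one underlying relation): gradients computed on two independent batches point in closely related directions, i.e. the weight changes are correlated updates, not random bit flips.

```python
import math
import random

random.seed(0)

def predict(w, x):
    return w[0] + w[1] * x

def grad(w, batch):
    # gradient of mean squared error w.r.t. (w0, w1)
    g0 = g1 = 0.0
    for x, y in batch:
        e = predict(w, x) - y
        g0 += 2 * e
        g1 += 2 * e * x
    n = len(batch)
    return (g0 / n, g1 / n)

def make_batch(n):
    # samples from the same assumed relation y = 3x + 1 + noise
    return [(x := random.uniform(-1, 1), 3 * x + 1 + random.gauss(0, 0.1))
            for _ in range(n)]

w = (0.0, 0.0)
g_a = grad(w, make_batch(64))   # update direction from batch A
g_b = grad(w, make_batch(64))   # update direction from batch B

dot = g_a[0] * g_b[0] + g_a[1] * g_b[1]
cos = dot / (math.hypot(*g_a) * math.hypot(*g_b))
print(round(cos, 3))   # near 1.0: the two updates are strongly correlated
```

Truly random perturbations would give a cosine similarity near zero on average; the structure of the data is what makes successive updates line up.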