I'm teased by the thought that this finding amounts to the following discovery: because the training data always injects a degree of randomness into the learning process, whatever method (model) you choose, you've in fact chosen an evolutionary model, always, at least in good part, whether you thought you did or not. Some models let that evolution proceed somewhat faster in absolute terms than others, but more time to evolve (that is, more data with which to evolve) is what really matters.
This may be a reminder to "shuffle" your data - or perhaps not to, if you want to jar the neural net. Maybe the former early on and the latter later on, at a guess?
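A minimal sketch of that guessed-at schedule, assuming PyTorch; the toy dataset, the tiny model, and the SWITCH_EPOCH cutoff are all hypothetical stand-ins. It reshuffles the data every epoch early in training, then switches to a fixed presentation order later on:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data and model, just to make the sketch self-contained.
dataset = TensorDataset(torch.randn(1024, 10), torch.randint(0, 2, (1024,)))
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

NUM_EPOCHS = 20
SWITCH_EPOCH = 15  # hypothetical cutoff: when to stop shuffling

for epoch in range(NUM_EPOCHS):
    # Early on, reshuffle each epoch; later, feed a fixed order to "jar" the net.
    loader = DataLoader(dataset, batch_size=32, shuffle=(epoch < SWITCH_EPOCH))
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
```

With shuffle=True, DataLoader draws a fresh random order each epoch, so the early phase here gets a new permutation every pass; flipping it to False later simply replays the dataset's stored order.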