Looking at the source code, you can see that the move you are about to make is included in the training dataset. This can be confirmed by playing “scissors scissors scissors rock” and then looking at the variables x and y in the console, which will include the surprise rock.
The code updates x and y, then trains the model, and then makes a prediction. “Fair” code would make a prediction, then update x and y, and then train the model (sketched below).
This explains the behavior mentioned in the comments where the computer gets an impressive early lead, due to the player's next move being one of only a few datapoints it learns from, then backs off to a more plausible advantage as the leaked data is diluted by past data.
In other words, the description is precisely inaccurate. Not only is it not untrained, it's trained against your exact plays through mysterious future-knowledge!
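To make the ordering concrete, here is a minimal, self-contained sketch. The frequency counter is a toy stand-in for the actual NN, and every name here is mine rather than the repo's; the only thing it illustrates is the order of the update/train/predict steps.

```javascript
// Toy stand-in for the NN: counts the player's moves so far, "predicts" the
// most frequent one, and throws its counter. The model doesn't matter here,
// only the order of the three steps does.
const BEATS = { rock: "paper", paper: "scissors", scissors: "rock" };

function makeModel() {
  const counts = { rock: 0, paper: 0, scissors: 0 };
  return {
    train(playerMove) { counts[playerMove] += 1; },
    predictPlayerMove() {
      return Object.keys(counts).reduce((a, b) => (counts[a] >= counts[b] ? a : b));
    },
  };
}

// Leaky ordering (as described above): the current move enters the training
// data before the "prediction" is made.
function leakyRound(model, playerMove) {
  model.train(playerMove);                 // this round's move is now in the data (x/y in the real code)
  return BEATS[model.predictPlayerMove()]; // counters a move it has already seen
}

// "Fair" ordering: predict from past rounds only, then update and train.
function fairRound(model, playerMove) {
  const counter = BEATS[model.predictPlayerMove()];
  model.train(playerMove);
  return counter;
}
```

In the leaky version the very first round is already countered perfectly, which matches the early-lead-then-fade pattern described above.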
After the player makes a move, the NN makes one and adds it to its training data. The move the player makes after that is added as a counter-move to the NN's move. This way I treat the data as a time series.
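If I'm reading that description right, the intended pairing is “NN's move in round t → player's move in round t+1”. A rough sketch of what that would look like (illustrative names, not the repo's actual identifiers):

```javascript
// One training pair per completed round, built as a time series:
// the NN's move from round t is the input, and the player's move from
// round t + 1 (the "counter move" described above) is the label.
function buildPairs(history /* [{ nnMove, playerMove }, ...] in play order */) {
  const pairs = [];
  for (let t = 0; t + 1 < history.length; t++) {
    pairs.push({
      x: history[t].nnMove,          // what the NN threw in round t
      y: history[t + 1].playerMove,  // how the player answered it in round t + 1
    });
  }
  return pairs;
}
```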
The problem is that the player's move is actually being fed to the training data before the computer makes its prediction. Take a look at the variable 'y' after making your move and you'll see that it includes the player's last move. Because 'y' is updated before the computer makes its prediction, the computer is using the player's current move as part of the training set it uses to decide its own move.
It's not taking a peek at the player's move. Read the code: even after the pull request that brought this up, the NN performed the same. Even if the NN were taking a peek at the player's move, which it is not, the data is shuffled before training.
Well, it’s much easier with your code. I immediately opened up a 3x lead on the comp and stopped at 17-6-17 (W-L-T). What I did was play the hand that would have beaten the computer’s previous guess, and if the computer looked like it was figuring that out, I played the hand that beats my own strategy once or twice, then resumed.
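For anyone who wants to try it, the base rule is mechanical enough to write down. This is just my encoding of the strategy described above, nothing from the repo:

```javascript
const BEATS = { rock: "paper", paper: "scissors", scissors: "rock" };

// Base rule: throw the hand that beats the computer's previous throw.
// Deviation (when the computer seems to catch on): throw the hand that
// beats my own usual response, i.e. BEATS applied twice, then resume.
function nextThrow(computerLastMove, deviate = false) {
  const usual = BEATS[computerLastMove];
  return deviate ? BEATS[usual] : usual;
}
```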
Two economists – one young and one old – are walking down the street together:
The young economist looks down, sees a $20 bill on the street, and says, “Hey, look, a twenty-dollar bill!”
Without even looking, his older and wiser colleague replies, “Nonsense. If there had been a twenty-dollar bill lying on the street, someone would have already picked it up by now.”
(99% joke and snark. Of course this isn't going to work on the stock market, but that kind of efficiency argument isn't perfect.)
I was not making the efficiency argument in the abstract. I'm making it in the concrete case of a neural network that can be trained in real time in a browser window.
There are a great many people who seem to mistake "the market isn't actually 100% efficient" for "oh, I guess we can just ignore the question of efficiency, because if it's not 100% it must be 0%". Not that they say the last part out loud, of course.
But that's not what you're coding.
If you trained it with every possible next move and response, then you would learn that kind of relationship (though you would also overfit on the existing moves), but this way you just give it a peek into the probability distribution of the player's moves, one that's so accurate it comes from the future...
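And on the shuffling point specifically: shuffling reorders the training pairs but removes none of them, so the pair that encodes the current round is still in the batch the model fits on before it “predicts”. A toy illustration with made-up data and my own names:

```javascript
// Toy training set: the last pair encodes the round about to be "predicted".
const pairs = [
  { x: "rock", y: "paper" },
  { x: "paper", y: "scissors" },
  { x: "scissors", y: "rock" },   // the leaked, current-round pair
];
const leaked = pairs[pairs.length - 1];

// Fisher-Yates shuffle: same pairs, different order.
for (let i = pairs.length - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1));
  [pairs[i], pairs[j]] = [pairs[j], pairs[i]];
}

console.log(pairs.includes(leaked)); // true: the leak survives the shuffle
```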