You're right. In fact, upon reflection, it is the PD which is not really a dilemma but rather a paradox. The logically correct move in PD (assuming a purely utilitarian quality metric) is always clear: if the game is non-iterated (or iterated with a known horizon) then defect. Otherwise play tit-for-tat. This is true no matter what your opponent does. In the SD the logically correct move depends on your prior on what your opponent is likely to do.
In fact the strategy is simpler if the other player is exactly as rational as you: simply cooperate. The other player, being exactly as rational as you, will follow the same reasoning as you do to come to the same conclusion, and will do exactly the same thing as you. I would always cooperate with a copy of myself, for instance.
There are variants of the prisoner's dilemma where you can check the 'rationality' of your opponent: the prisoners' decisions are chosen by computer programs, and each program is given the other program's source code as an argument. This is often referred to as "program equilibrium", and an optimal strategy is to cooperate iff running the other program with our own source code as an argument results in cooperation.
This runs into problems of computability though: if we run the opponent's code to see what it does, and it turns out that their program runs our code to see what we do, we can get stuck in an infinite loop.
Like most incomputable problems, we can take a conservative approach: try to prove that it cooperates, using some incomplete computable algorithm; defect if we can't prove it. One simple algorithm is to check if the source code is the same as ours; more sophisticated algorithms would perform static analysis of the code.
I never really got my head around the Parametric Bounded Löb paper at https://arxiv.org/abs/1602.04184 . Clever people have persuaded me that it was wrong (for a certain reason related to the existence of nonstandard models of PA) and that it was true.