If this kind of stuff interests you, you should check out the lecture videos from Harvard's Stat 110: http://stat110.net. Prof. Blitzstein is hilariously nerdy but still has a very clear presentation style (he had kind of a cult following back when I took the course [1]). If video isn't your thing, the article here is basically the first few chapters of the textbook [2].
For a moment I thought I would be reading something like Jaynes's book, only simpler. Instead there are many examples and definitions, but not many ties to real-world problems. It all comes across as quite arbitrary, in fact.
I think that a good primer needs to talk about probabilities as degrees of belief from the start. Then sketch the basic premises (probabilities are encoded as real numbers, they abide by common sense, and they are consistent). Then go on to the [0, 1] interval and state the product and sum rules. We can gloss over Cox's theorem, though we probably should stress that the sum and product rules are not axioms, but theorems derived from the basic premises above.
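For reference, here is how the two rules are usually written in this setting. The symbols are my own choice, not the article's: A, B and C stand for propositions, AB means "A and B", and the bar means negation.

```latex
\begin{align}
p(AB \mid C) &= p(A \mid BC)\, p(B \mid C) && \text{(product rule)} \\
p(A \mid C) + p(\bar{A} \mid C) &= 1 && \text{(sum rule)}
\end{align}
```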
And then one can talk about probability distributions, random variables, and "probability spaces" in terms of the above.
From there, one would understand why a toss of a die yields a uniform distribution: not because the die itself is "fair", but because we have no idea whatsoever which side will come up, and have no reason to favour any hypothesis over the others. This distribution is not a property of the die, but a measure of our own ignorance.
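To make the "measure of our own ignorance" point concrete, here is a minimal Python sketch. Everything in it is my own illustration rather than anything from the article: the names `prior`, `update` and `posterior` are made up, and the "someone tells us the roll was even" scenario is a hypothetical. The same die gets a different distribution as soon as our information changes.

```python
from fractions import Fraction

faces = range(1, 7)

# Indifference: we have no reason to favour any face, so each gets 1/6.
prior = {f: Fraction(1, len(faces)) for f in faces}


def update(belief, likelihood):
    """Apply the product rule, then normalise using the sum rule.

    P(face | E) = P(E | face) * P(face) / sum over faces of P(E | face) * P(face)
    """
    joint = {f: likelihood(f) * p for f, p in belief.items()}
    evidence = sum(joint.values())
    return {f: p / evidence for f, p in joint.items()}


# Hypothetical new information: someone tells us the roll came up even.
posterior = update(prior, lambda f: 1 if f % 2 == 0 else 0)

print(prior[2])      # 1/6
print(posterior[2])  # 1/3
print(posterior[1])  # 0
```

Nothing about the die changed between `prior` and `posterior`, only what we know about the roll.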
Does anyone know of a similar resource, but with real-world examples and perhaps R snippets? It would be great if I could practice two things at the same time.
[1] http://www.youtube.com/watch?v=iAwS7vzvLnY http://www.youtube.com/watch?v=TQvVLhWOiis
[2] http://www.amazon.com/First-Course-Probability-7th-Edition/d...