Obviously Microsoft only presented tests that they pass and at least one other major browser fails. Still, this is free QA for the other browsers, so there isn't much room to complain.
Actually, if you look at the original tests they submitted, it was 100% for IE across the board, and if I recall correctly (since they've just replaced the old page) no other browser passed any subset 100%. Clearly they cherry-picked the tests; it makes no sense that every test they wrote just happened to pass at that point in time. Either they didn't publish the tests that IE failed, or they intentionally stopped writing new tests at some point before they went public to give themselves time to pass everything fully.
Of course, after they published these results it was pointed out that a) the tests were wrong, so other browsers were marked as failing even when they actually passed, and b) IE was getting things wrong but being marked correct because the tests were wrong. (That's not even touching the fact that they don't indicate that Firefox and WebKit would pass a whole heap of the tests just by changing a single string in their code to check a different prefix.)
They've since scrambled to get their column back up to a 100% pass rate, and that's why they've updated this page with their new version. Pretty weak all round.
They didn't cherry-pick tests. These weren't written by someone else. They wrote tests for the features they decided to implement. Why would they write tests now for features they aren't implementing yet?
Come on. Given the choice between two equally complex features, it's preferable to implement the one your competitors got wrong, because it offers more bang for your programmer buck.
This is why the Google Sputnik test results not only show what percentage of tests pass, but also group browsers together based on which tests they pass in common (i.e. a crude measure of interoperability).
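For what it's worth, that kind of grouping boils down to set intersection over each browser's passed tests. A minimal sketch of the idea (hypothetical browser names and test IDs, not Sputnik's actual data or methodology):

    # Crude interoperability measure: count the tests each pair of
    # browsers passes in common. Illustrative only.
    from itertools import combinations

    # Hypothetical results: browser name -> set of test IDs it passes.
    results = {
        "BrowserA": {"t1", "t2", "t3"},
        "BrowserB": {"t1", "t3", "t4", "t5"},
        "BrowserC": {"t1", "t3", "t4", "t5", "t6"},
    }

    for (a, passes_a), (b, passes_b) in combinations(results.items(), 2):
        shared = passes_a & passes_b
        print(f"{a} and {b} pass {len(shared)} tests in common: {sorted(shared)}")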
What's the point in a summary that ignores literally hundreds of tests? Tests that, as you point out, already exist and so would require absolutely no effort to run.
Why have a row in the table that says IE passes 100% of the SVG tests when the truth is very different?
The Google plugin that transforms SVG to Flash passes twice as many tests as the IE9 preview, and Opera passes three (nearly four) times as many, yet Microsoft felt it was useful to list a random subset of tests that shows them doing well.