If you're worried about an incorrect comparison function, one option is to implement it two different ways and cross-check their results.
You could also directly verify (probabilistically) the properties of a total ordering [1]. If your comparison function is "comp" and returns -1, 0, or 1, the tests might go like this:
trials = 1000
while trials > 0:
    a = rand_element()
    b = rand_element()
    c = rand_element()
    aa = comp(a, a)
    ab = comp(a, b)
    ac = comp(a, c)
    ba = comp(b, a)
    bb = comp(b, b)
    bc = comp(b, c)
    ca = comp(c, a)
    cb = comp(c, b)
    cc = comp(c, c)
    # reflexive: every element compares equal to itself
    assert aa == 0
    assert bb == 0
    assert cc == 0
    # antisymmetric: swapping the arguments negates the result
    assert ab == -ba
    assert ac == -ca
    assert bc == -cb
    # transitive: if two comparisons agree, the third must agree with them
    assert (ab != bc) or (ab == ac)
    assert (ac != cb) or (ac == ab)
    assert (ba != ac) or (ba == bc)
    assert (bc != ca) or (bc == ba)
    assert (ca != ab) or (ca == cb)
    assert (cb != ba) or (cb == ca)
    # equal elements must compare the same way against any third element
    assert (ab != 0) or (ac == bc)
    assert (ac != 0) or (ab == cb)
    assert (ba != 0) or (bc == ac)
    assert (bc != 0) or (ba == ca)
    assert (ca != 0) or (cb == ab)
    assert (cb != 0) or (ca == ba)
    trials -= 1
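To actually run that loop, comp and rand_element have to be supplied. Here is a minimal sketch of a hypothetical pairing; the (priority, name) tuples and the tie-breaking rule are illustrative stand-ins for whatever your real elements and comparator look like, not part of the original:

import random

def rand_element():
    # Illustrative stand-in: random (priority, name) pairs as test elements.
    # The small domain is deliberate: it makes equal elements likely, so the
    # == 0 branches of the asserts actually get exercised.
    return (random.randint(0, 3), random.choice("abcdef"))

def comp(x, y):
    # Order by priority first, then by name; returns -1, 0, or 1.
    if x[0] != y[0]:
        return -1 if x[0] < y[0] else 1
    if x[1] != y[1]:
        return -1 if x[1] < y[1] else 1
    return 0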
Usually the data is a combination of externally received information and internal calculations/annotations. I just use copious asserts, and one line:
assert bubble_sort(data) == sort(data)
Since this always runs during development, it will automatically pick up internal and external changes.
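A minimal sketch of that cross-check in Python, assuming the comparator is a comp(a, b) function returning -1/0/1 as above; the bubble_sort body and the use of sorted() with cmp_to_key as the reference sort are my own filling-in, not the original code:

from functools import cmp_to_key

def bubble_sort(data, comp):
    # Plain bubble sort driven entirely by the comparator, so comp gets
    # exercised on every adjacent pair, repeatedly, across the whole list.
    items = list(data)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if comp(items[i], items[i + 1]) > 0:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

data = [rand_element() for _ in range(100)]
assert bubble_sort(data, comp) == sorted(data, key=cmp_to_key(comp))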
What you laid out explicitly tests everything against everything else and is 100% comprehensive. Bubble sort is less than 100% comprehensive but still gives good coverage, whereas optimised sorts perform so few comparisons that catching a buggy comparator is largely a matter of luck.
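As a hypothetical illustration of that gap, consider a comparator that never reports equality. The property test above catches it on the first assert (comp(a, a) should be 0), but the sort cross-check can still pass on plain integers because both algorithms happen to produce the same output; bad_comp below is an invented example, reusing bubble_sort and cmp_to_key from the sketch above:

def bad_comp(a, b):
    # Deliberately broken: never returns 0, so bad_comp(a, a) == 1.
    return -1 if a < b else 1

nums = [3, 1, 2, 3, 1]
# Both sorts still yield [1, 1, 2, 3, 3], so this prints True even though
# the comparator violates the total-order contract.
print(bubble_sort(nums, bad_comp) == sorted(nums, key=cmp_to_key(bad_comp)))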
[1] http://en.wikipedia.org/wiki/Total_order