This is a nice post and I hadn't heard of Sattolo's algorithm before. The proof ...

This is a nice post and I hadn't heard of Sattolo's algorithm before. The proof is a bit long though. The reference linked from Wikipedia: http://algo.inria.fr/seminars/summary/Wilson2004b.pdf proves the correctness of Sattolo's algorithm in three sentences. I found it fairly easy to understand, while I didn't manage to read through the linked post in detail to get the same level of understanding. Let me try to explain the proof I understood, without assuming mathematical background, but instead introducing it. (I'll use 0-based indices as in the post, instead of 1-based indices that mathematicians would use.)

# What is a permutation?

There are (at least) two ways to think of (or define) a permutation:

1. A list (an ordering): a specific order for writing out elements. For example, a permutation of the 10 elements [0, 1, 2, 3, …, 9] means those 10 elements written out in a particular order: one particular permutation of 0123456789 is 7851209463. In a computer, we can represent it by an array:

       i  0 1 2 3 4 5 6 7 8 9
    a[i]  7 8 5 1 2 0 9 4 6 3

2. A reordering. For example, the above permutation can be viewed as "sending" 0 to 7, 1 to 8, and in general i to a[i] for each i.

Instead of describing this reordering by writing down 10 pairs 0→7, 1→8, …, 7→4, 8→6, 9→3, we can save some space by "following" each element until we come back to the beginning: the above becomes a bunch of "cycles":

- 0→7→4→2→5⟲ (as 5→0 again) (note we could also write this as 4→2→5→0→7⟲ etc., only the cyclic order matters)

- 1→8→6→9→3⟲ (as 3→1 again)

You can think of cycles the way you think of circular linked lists. This particular permutation we picked happened to have two cycles.

# What is a cyclic permutation?

A cyclic permutation is a permutation that has only one cycle (rather than two cycles as in the above, or even more cycles). For example, consider the permutation 8302741956:

       i    0 1 2 3 4 5 6 7 8 9
    a[i]    8 3 0 2 7 4 1 9 5 6

If we follow each element as we did above, we get 0→8→5→4→7→9→6→1→3→2⟲ where all 10 elements are in a single cycle. This is a cyclic permutation.

Our goal is to generate a random cyclic permutation (and in fact uniformly at random from among all cyclic permutations).

# Sattolo's algorithm

Note that in a cyclic permutation of [0, ..., n-1] (in our example above, n=10), for the highest index n-1, there will be some smaller j such that a[j]=n-1 (in the example above, a[7]=9). Now if we swap the elements at positions n-1 and j (which in the example above is:

    Before                               After
       i    0 1 2 3 4 5 6 7 8 9             i    0 1 2 3 4 5 6 7 8 9
    a[i]    8 3 0 2 7 4 1 9 5 6          a[i]    8 3 0 2 7 4 1 6 5 9

where we swapped a[7]=9 and a[9]=6 to make a[7]=6 and a[9]=9), then in general we get a[n-1]=n-1, and a[0]…a[n-2] form a cyclic permutation of [0…n-2]. In the above example, in the "after" case, if we ignore i=9 and consider only positions 0 to 8, we have the cycle 0→8→5→4→7→6→1→3→2⟲. (This is our original cycle 0→8→5→4→7→9→6→1→3→2⟲ with 9 "removed", as we'd do when deleting an item from a linked list.)

This holds in reverse too: if we had started with the cyclic permutation of [0, …, 8] that is in the "after" column above, added a[9]=9, and swapped a[9]=9 with a "random" element a[7]=6, we'd get the cyclic permutation of [0, … 9] that is the "before" column.

In general, you can convince yourself that there is a unique way of getting any cyclic permutation on [0, …, n-1] by starting with a cyclic permutation on [0, …, n-2], considering a[n-1]=n-1, picking a particular index j in 0 ≤ j ≤ n-2, and swapping a[n-1] and a[j].

This gives the following algorithm, which we've already proved is correct (or derived, rather):

    def random_cycle(n):
        a = [i for i in range(n)]        # For all i from 0 to n-1 (inclusive), set a[i] = i
        for i in range(1, n):            # For each i in 1 to n-1 (inclusive),
            j = random.randrange(0, i)   # Pick j to be a random index in the range 0 to i-1, inclusive
            a[i], a[j] = a[j], a[i]      # Swap a[i] and a[j]
        return a

In the post linked above, you swap with a random element that is "ahead", instead of one that is "behind"; also you start with a list of length n and shuffle it according to the randomly generated cyclic permutation of [0…(n-1)] instead of simply generating the permutation. From the post:

    def sattolo(a):
        n = len(a)
        for i in range(n - 1):
            j = random.randrange(i+1, n)  # i+1 instead of i
            a[i], a[j] = a[j], a[i]

This is slightly different, but the proof is similar: in fact this is the algorithm (except going downwards) that is proved correct in the linked paper. (And even if it is not obvious to you that the two algorithms are equivalent, you have an algorithm that generates a random cycle and is just as easy to code!)