Hacker News new | past | comments | ask | show | jobs | submit login

Weird, I was actually searching for a function that would do a "permutation of a list with exactly one cycle" an hour ago. I found Sattolo's, but found its use of random() disagreeable - it would depend on external/stdlib code and generate a new sequence every time if the seed is not fixed. Instead I looked into key schedules for cryptographic ciphers and found an easy solution - the key schedule function for RC4 for example. (My use does not care about the flaws/biases). It's a couple of lines of code, generates the same permutation for the same input+key - just use a couple of arbitrary hardcoded bytes as the key and you are done.



If you want a quick, easy and good quality random number generator, you might want to look at xorshifts.

There are a lot of variations, but I particularly like the 128 bit version from the original paper [1].

Here is the entire code

  unsigned long xor128() {
    static unsigned long x=123456789,y=362436069,z=521288629,w=88675123;
    unsigned long t;
    t=(xˆ(x<<11));x=y;y=z;z=w; 
    return( w=(wˆ(w>>19))ˆ(tˆ(t>>8)) );
  }
Note: Do not use for security applications! It is not cryptographically secure. But if you need that, or if you are running Monte Carlo simulations and want nothing but the best, you probably should use specialized libraries anyways.

[1] https://www.jstatsoft.org/index.php/jss/article/view/v008i14...

Edit: And since we are in the subject of picking random numbers for Sattolo's algorithm. Using a simple modulo operator for get a range from a single random number going from 0 to 2^n-1 (like most RNGs produce) can introduce a bias. If the list is short and the numbers are large, it doesn't make that big a difference, but it is something to consider.


In case anyone is wondering how to do an unbiased choice of the numbers 0 to N given an RNG that generates numbers 0 to 2^Y-1, where 2^Y-1 >= N, there are two ways, both rejecting some samples:

   1. Find the smallest Z s.t. 2^Z-1 <= N, generate a random number, bitwise-and with 2^Z-1, accept the result if the result <= N, otherwise repeat.  The expected number of times through the loop depends on Y and N, but is strictly less than 2.

   2. Let W = floor((2^Z-1) / N), generate a random number, if the number < N*W, return the number mod N, otherwise repeat.  The expected number of times through the loop depends on N and Y, but is no more than the first algorithm.  The downside is that division and modular division are much slower than bitwise-and on most processors.
If division or modular division (on many processors, the same instruction) is much slower than bitwise-and, and your PRNG is pretty fast (such as xorshift / xorishiro), you're better off with the first algorithm.

Note that finding the mask from the first algorithm on a 64-bit machine is as simple as:

  mask = N; for( i = 1; i < 64; i *= 2) { mask |= mask >> i; }  // Check if your compiler unrolls this loop


Something like this:

  def sattolo(a, key):
      n = len(a)
      keylength = len(key)
      j = 0
      for i in range(n - 1):
          j = (j + a[i] + key[i % keylength]) % n
          a[i], a[j] = a[j], a[i]
(key needs to be a byte string because it's easier that way)

--edit--

Doesn't work for producing a singlar path:

  >>> x = [0,1,2,3,4,5,6,7,8,9]
  >>> sattolo(x, b'zanzibar')
  >>> x
  [0, 2, 1, 5, 4, 3, 8, 6, 9, 7]


> permutation of a list with exactly one cycle

I'm having trouble parsing the motivation and terminology. When a list has one cycle, does that mean it repeats once? Or is this basically 'i want a random but stable order of events'?


Permutations can be written as the "product" of "cycles", as described here [1].

If you choose an element belonging to a given cycle, then generate new elements by repeatedly applying the permutation, all the elements you generate will belong only to that cycle.

So, if your permutation can be written as the product of more than one cycle, and you choose a random starting element and generate new elements by successive application of the permutation, you will only generate a strict subset of all the possible elements.

On the other hand, if your permutation only has a single cycle, then that procedure will generate all the elements.

[1] https://en.wikipedia.org/wiki/Permutation#Cycle_notation


Hmm. I think my fundamental disconnect here was the idea of permutation as a repeated process. As a formal term it's really ever come up in a discrete mathematics class comparing it with combinations, on the subject of counting. Here we seem to be defining permutation as a function that can be repeatedly applied.

Looks like the first sentence of wikipedia confirms it's an overloaded word:

>In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or process of changing the linear order of an ordered set.


I'm curious what you are doing where you have access to cryptographic key schedules, but not access to a use of "random()"?


Just learning a new language, so the requirement is self-imposed. But the fun thing with RC4 key schedule is that it doesn't have to be provided, one can just implement it based on the Wikipedia code. I have long found RC4 exciting to implement when trying out new languages - I get a quick dopamine hit because I implemented something... real, and it has several other uses besides encryption - especially when just toying. Today I learned I can use it to shuffle arrays.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: