In "The Practice of Programming", Brian Kernighan and Rob Pike use the Markov Chain Algorithm to illustrate a number of lessons about software design and implementation. I used it as my "Hello world" program when learning a new programming language.
In Python, the implementation is surprisingly succinct:
import sys
import random
import collections
MAXGEN = 10000
NONWORD = ""
suffixchain = collections.defaultdict(list)
w1 = w2 = NONWORD
for line in sys.stdin:
for word in line.split():
suffixchain[(w1, w2)].append(word)
w1, w2 = w2, word
suffixchain[(w1, w2)].append(NONWORD)
w1 = w2 = NONWORD
for i in range(MAXGEN):
suffixes = suffixchain[(w1, w2)]
next = random.choice(suffixes)
if next is NONWORD:
break
print next,
w1, w2 = w2, next
This is a big reason why speech recognition works. You have a list of probabilities of sequence of words occuring(language model) and it decides on the next word based on the last few previous words.
So if you have the word Super, odds are the next word is Man and not Microphone.
In Python, the implementation is surprisingly succinct:
To use it, just run: