Simple in theory, but hard enough in practice that companies like Microsoft screw it up from time to time.
Try saving a text file in Windows XP Notepad with the words "Bush hid the facts" and nothing else. Close it and open the file again. WTF Chinese characters! Conspiracy!
That's not Microsoft "screwing it up"; that's you not feeding the detection algorithm enough characters for it to be sure. That short string is below the threshold, but the threshold is surprisingly small: if I remember correctly it's just over 100 bytes, and above that any non-pathological input is identified correctly essentially every time.
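For anyone wondering where the Chinese characters come from, here is a minimal sketch in Python. It assumes the usual explanation, that the 18 ASCII bytes were guessed to be UTF-16LE, and it does not reproduce Notepad's actual detection logic. The helper names are made up for illustration.

```python
def reinterpret_as_utf16le(text: str) -> str:
    """Encode text as ASCII, then decode the same bytes as UTF-16LE.

    This mimics what happens when a detector wrongly guesses UTF-16LE
    for a file that was saved as plain single-byte text.
    """
    raw = text.encode("ascii")      # 18 bytes for the famous string
    return raw.decode("utf-16-le")  # each pair of bytes becomes one code unit


def all_cjk(s: str) -> bool:
    """True if every character is in the CJK Unified Ideographs block."""
    return all(0x4E00 <= ord(ch) <= 0x9FFF for ch in s)


garbled = reinterpret_as_utf16le("Bush hid the facts")
print(len(garbled), all_cjk(garbled))  # 9 True
```

Each pair of ASCII letters happens to form a valid 16-bit value in the CJK range (for example "Bu" is bytes 0x42 0x75, read little-endian as U+7542), so with so few bytes to go on, the wrong guess looks plausible.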