Section 3.1 of Unicode Technical Report #36 describes two similar problems specific to UTF-8: overlong sequences and ill-formed subsequences.
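To make the two problems concrete, here is a minimal Python sketch. The helper names `naive_decode_2byte` and `strict_decode` are hypothetical, invented for this illustration: the first mimics a lenient decoder that never checks for the shortest form, and the second simply defers to Python's built-in strict UTF-8 codec. The byte strings are illustrative inputs, not taken from any particular exploit.

```python
# Overlong sequence: 0xC0 0xAF is a 2-byte encoding of U+002F '/',
# which must only ever be encoded as the single byte 0x2F.
OVERLONG_SLASH = b"\xc0\xaf"

# Ill-formed subsequence: 0xC2 starts a 2-byte sequence, but 0x3E ('>')
# is not a continuation byte, so 0xC2 stands alone and is invalid.
ILL_FORMED = b"A\xc2>B"


def naive_decode_2byte(data: bytes) -> str:
    """Hypothetical lenient decoder for one 2-byte sequence.

    It combines the payload bits without checking that the result
    actually needs two bytes, so it accepts overlong forms.
    """
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes):
    """Return the decoded text, or None if the input is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


# A filter that searches the raw bytes for b"/" sees nothing suspicious,
# yet the lenient decoder turns the same bytes into a slash.
print(b"/" in OVERLONG_SLASH)               # False
print(naive_decode_2byte(OVERLONG_SLASH))   # /
print(strict_decode(OVERLONG_SLASH))        # None

# With replacement enabled, only the bad lead byte is replaced;
# the '>' that follows it is kept rather than swallowed.
print(ILL_FORMED.decode("utf-8", errors="replace"))
```

The risk arises when a sanitizer and the component that finally interprets the text decode differently: one side rejects or replaces the bytes, while the other quietly produces a `/` or drops a `>`.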

Does your software conform to Unicode 3.0 and earlier, 3.1 through 5.1, or 5.2 and later? And do your server and client software agree? If you don't know the answer (and you depend on string sanitization), you may be at risk.