Ah yes, you're quite right! I'm tempted to try a simple binary search across the string, just to see what its performance characteristics look like. I imagine the result would be comparable to a trie lookup (if a bit slower). As mentioned before, though, we'd still have the file size overhead of the uncompressed string.
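For what it's worth, here is a rough sketch of what I have in mind, not a tuned implementation. It assumes the dictionary is a single string of sorted, non-empty words separated by newlines; the names `build_word_string` and `contains` are made up for illustration. The search bisects over character offsets and snaps each midpoint to the boundaries of the word it lands in.

```python
def build_word_string(words, sep="\n"):
    """Join unique words into one sorted, sep-delimited string (assumed layout)."""
    return sep.join(sorted(set(words)))


def contains(words: str, target: str, sep: str = "\n") -> bool:
    """Binary search over character offsets in a sorted, sep-delimited string."""
    lo, hi = 0, len(words)  # lo and hi always sit on a word start (or the end)
    while lo < hi:
        mid = (lo + hi) // 2

        # Snap left to the start of the word containing offset `mid`.
        prev_sep = words.rfind(sep, lo, mid)
        start = lo if prev_sep == -1 else prev_sep + 1

        # Snap right to the end of that word (or the end of the string).
        end = words.find(sep, mid)
        if end == -1:
            end = len(words)

        word = words[start:end]
        if word == target:
            return True
        if word < target:
            lo = end + 1   # continue in the words after this one
        else:
            hi = start     # continue in the words before this one
    return False


dictionary = build_word_string(["date", "apple", "fig", "cherry", "banana"])
print(contains(dictionary, "cherry"))
print(contains(dictionary, "cherr"))
```

Each lookup does roughly log2(number of characters) comparisons, plus a short scan for the delimiters on either side of the midpoint, whereas a trie walks one node per letter of the word. The string itself is still stored uncompressed, so this does nothing for the file size.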