The idea of representing a large corpus in a compressed form and learning both concepts and semantics about that corpus is fascinating to me. A tree like the one described in the article measures semantic distance, much like a character trie does. What data structures support storing or calculating conceptual distances? I've read about word2vec and the mechanics behind the training process, but never about how the model looks in memory or when serialized to disk.

Also, does anyone know whether there is a link between the semantic and conceptual distance between two terms?

It would surprise me if there were such a correlation, because the notion of "near" is different in the semantic and conceptual worlds. Then again, it would be fascinating if there were.