>If I live in London and the distance from me to Paris is 50.000 km, and you liv...

mack73 · on April 4, 2017

Thanks. The idea behind the BK-tree is ingenious.

I'm struggling to find a use case for that data structure, though. Why would you construct a BK-tree that only becomes powerful once it contains millions of words, at which point holding that much data in memory becomes a nuisance and it's not so fast anymore, when you could represent the same data in a compressed form with the same (as well as an extended set of) querying capabilities?

Perhaps BK-trees are for big machines with powerful CPUs? I'm sure there is a setup that would make that tree in fact better than any other tree.

rocho · on April 4, 2017

I don't think the best use case for BK-trees is spell-checking and words. The area in which they are used most successfully is image deduplication. In that case the metric you're going to use is some form of perceptual hashing.
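
To make that concrete, here is a minimal sketch of a BK-tree over perceptual hashes. It assumes the hashes are already computed as plain integers and uses Hamming distance as the metric; the names `BKTree`, `hamming`, `add` and `search` are made up for illustration and don't come from any particular library.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


class BKTree:
    """Illustrative BK-tree; each node is a tuple (item, {edge_distance: child})."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.distance(item, node[0])
            if d == 0:
                return  # identical hash is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (item, {})
                return
            node = child  # descend along the edge labelled d

    def search(self, query, radius):
        """Return sorted (distance, item) pairs within `radius` of `query`."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= radius:
                results.append((d, item))
            # Triangle inequality: only subtrees whose edge label lies in
            # [d - radius, d + radius] can contain a match.
            for edge, child in children.items():
                if d - radius <= edge <= d + radius:
                    stack.append(child)
        return sorted(results)


# Toy 4-bit "hashes" standing in for real perceptual hashes (assumption).
tree = BKTree(hamming)
for h in [0b0000, 0b0001, 0b0011, 0b1111, 0b1000]:
    tree.add(h)

near_duplicates = tree.search(0b0000, 1)
```

With real images you'd feed in 64-bit hashes and pick the radius by experiment; the pruning step is what lets a lookup skip most of the tree.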

thesz · on April 5, 2017

I think you are missing an important case such as image data. Another one is floating-point vectors with a scaled-up and rounded distance.