Slightly off-topic, but I thought this would be a good place to ask. Are there a...

gtani · on July 28, 2016

There are a few projects that use ES/Lucene as a backend/datastore once the feature engineering is done, but I don't see models operating on the native indexes directly. Maybe the format is too different from one-hot (after turning off stemming/stopwords and other info-losing steps).

http://lucene.472066.n3.nabble.com/Where-Search-Meets-Machin...

https://news.ycombinator.com/item?id=11876542
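
To make the format gap concrete, here is a minimal sketch of turning an inverted index (term -> document ids) into one-hot document rows. It assumes the postings have already been exported into a plain Python dict; `postings_to_one_hot` and its arguments are hypothetical names for illustration, not a Lucene or ES API.

```python
def postings_to_one_hot(postings, num_docs):
    """Convert term -> doc-id postings into one-hot rows, one row per document.

    `postings` is an assumed plain dict exported from the index,
    not a real Lucene data structure.
    """
    vocab = sorted(postings)  # fixed column order, one column per term
    rows = [[0] * len(vocab) for _ in range(num_docs)]
    for col, term in enumerate(vocab):
        for doc_id in postings[term]:
            rows[doc_id][col] = 1  # term is present in this document
    return vocab, rows


# Toy index: "lucene" occurs in docs 0 and 2, "search" in docs 0 and 1.
vocab, rows = postings_to_one_hot({"lucene": [0, 2], "search": [0, 1]}, num_docs=3)
```

The index is term-major while the one-hot matrix is document-major, so the conversion is essentially a transpose of the postings; a dense matrix like this only makes sense for small vocabularies.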

SergeyHack · on July 28, 2016

Not quite about creating synonyms, but in the same area there is Semantic Vectors: https://github.com/semanticvectors/semanticvectors

They process a Lucene index and create an embedded representation of it. Then you can search over that representation for "semantic" matches.

Last time I checked, about a year ago, the embedded collection of documents was kept in memory and the search was implemented as a linear scan. So I suspect it can be slow on very large document collections.
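
For a rough idea of what a linear scan over in-memory embeddings looks like, here is a minimal sketch. It is not Semantic Vectors' actual code: the function names, the dict of document vectors, and the choice of cosine similarity are all assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def linear_scan_search(query_vec, doc_vecs, top_k=3):
    """Score every document against the query and return the top_k (doc_id, score) pairs.

    `doc_vecs` is an assumed dict of doc_id -> embedding held entirely in memory.
    """
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Toy 2-dimensional embeddings, made up for the example.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = linear_scan_search([1.0, 0.1], docs, top_k=2)
```

Every query touches every document vector, so the cost grows linearly with collection size times vector dimension, which is why this approach would be expected to slow down on very large collections.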