
BERT is truly amazing. Almost all innovation in NLP uses BERT and transformers somehow. ALBERT will be the next HUGE thing in the coming months, as it shows better results than BERT with a small fraction of the parameters.

We did a "Semantic Similarity search" for some documents, where we represented each document as a vector using BERT and looked for documents close to a reference document.

The results were breathtaking. It really returned semantically similar documents. You can do it now using Elasticsearch (but you really should do it using Vespa.ai, it is much faster: https://github.com/jobergum/dense-vector-ranking-performance )
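
Roughly, the pipeline looks like the sketch below. This is a minimal illustration, assuming Hugging Face transformers and brute-force scoring; the library, checkpoint name, and sample texts are my assumptions, not necessarily what was used here:

    # Sketch: a document vector is the mean of its BERT token embeddings;
    # documents are ranked by cosine similarity to a reference document.
    # Library and checkpoint ("bert-base-uncased") are assumptions.
    import numpy as np
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.eval()

    def embed(text):
        # Mean-pool the last hidden layer into a single document vector.
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
        return hidden.mean(dim=1).squeeze(0).numpy()    # (768,)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    docs = ["first document ...", "second document ...", "third document ..."]
    vectors = [embed(d) for d in docs]
    query = embed("text of the reference document")
    ranked = sorted(range(len(docs)), key=lambda i: cosine(query, vectors[i]), reverse=True)
    print([docs[i] for i in ranked])  # most similar documents first

At scale you would push those vectors into a dense-vector index (Elasticsearch's dense_vector field, or Vespa as in the link above) rather than scoring every pair by brute force.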


The first project I ever put together involving (extremely trivial) ML used BERT, and something about seeing it just work opened my eyes to the ML world and got me excited to work in the space.

If anyone is interested in hacking around with BERT, I work on an open-source project called Cortex that handles model deployment, and we have a full tutorial for deploying a sentiment classifier using BERT quickly and easily: https://github.com/cortexlabs/cortex/tree/master/examples/se...


That's very interesting! If you have the time for it, you should consider experimenting with swapping in SpanBERT [1] instead of BERT for your use case. They train on full-length segments instead of masked half segments (as in BERT). I suspect that this, besides the improvements SpanBERT brings over BERT, should let you feed bigger chunks (more sentences) into the model before the averaging step, leading to fewer vectors to average and, as a result, perhaps better clustering.

[1]: https://arxiv.org/abs/1907.10529
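
If the pipeline is built on Hugging Face transformers, the swap could be as small as changing the checkpoint. A minimal sketch, assuming the SpanBERT weights published on the Hugging Face hub (the exact checkpoint name is an assumption worth verifying):

    # Hypothetical drop-in swap. SpanBERT reuses BERT's cased wordpiece
    # vocabulary, so the stock cased BERT tokenizer pairs with it; the
    # checkpoint name assumes the weights published on the Hugging Face hub.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModel.from_pretrained("SpanBERT/spanbert-base-cased")
    # The mean-pooling embed() from the sketch above works unchanged.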


Thank you, I will read and try it. Looks very interesting!


I agree that it works very well for "more like this" document recommendations! But not great for user queries.


But the question is how much better they are compared to existing similarity measures. E.g. for documents in a domain, even simple cosine is pretty good.


I did not understand. BERT is not a similarity measure. For our use case we did use simple cosine similarity to find the similar documents, but we first had to represent those documents in a vector space. In our tests, representing a document as the MEAN of its BERT embeddings gave very good results, much better than BoW, GloVe, or the Lucene "More like this".
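
For reference, the kind of count-based baseline this is being compared against could be sketched like this (scikit-learn TF-IDF stands in for "BoW" here; the library choice is an assumption):

    # Count-based baseline: TF-IDF vectors ranked by cosine similarity,
    # for comparison with the BERT mean-embedding approach above.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["first document ...", "second document ...", "third document ..."]
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(docs)  # sparse (n_docs, vocab_size)

    query_vec = vectorizer.transform(["text of the reference document"])
    scores = cosine_similarity(query_vec, doc_vectors)[0]  # one score per doc
    ranked = scores.argsort()[::-1]  # most similar documents first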


Dude, you are probably releasing trade secrets ;-)


Nah, lots of people are trying stuff like this :)


Yes, sorry. I meant a more naive, count-based vector representation instead of a BERT embedding.
