It's incredibly impressive for what it does, but the results don't seem very good.
Although I know from experience that it's really difficult to assess search result quality by hand: you can be very close to something great and still return far worse matches than this does.
Yes! The quality probably isn’t as good as Similar Website Finder https://explore2.marginalia.nu/ ;) and I bet using a more recent sentence embedding would lead to better results. I gotta collect more data.
Haha, SWF is indeed unreasonably good (although it does something very different, and with a lot more processing power). I think that's in large part because I do no dimension reduction and just brute-force the cosine similarity calculation with raw 10,000,000-dimension vectors, like the programming caveman I am.
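For anyone curious what "brute force the cosine similarity" looks like in practice, here is a rough Python sketch. It is not SWF's actual code. It assumes the raw vectors are sparse (mostly zeros) and stored as dicts, and the names `cosine_similarity`, `most_similar`, and the `sites` data are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {dimension: weight} dicts."""
    # Iterate over the smaller vector so the dot product touches fewer entries.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b.get(dim, 0.0) for dim, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, vectors, top_k=3):
    """Brute force: score the query against every other vector, then sort."""
    scored = [
        (cosine_similarity(vectors[query], vec), name)
        for name, vec in vectors.items()
        if name != query
    ]
    scored.sort(reverse=True)
    return scored[:top_k]


# Hypothetical toy data: dimension indices can go up to ~10,000,000,
# but each site only has a handful of non-zero entries.
sites = {
    "a.example": {1: 1.0, 2: 1.0},
    "b.example": {1: 1.0, 2: 1.0, 3: 1.0},
    "c.example": {9_999_999: 1.0},
}

if __name__ == "__main__":
    for score, name in most_similar("a.example", sites):
        print(f"{name}: {score:.3f}")
```

No dimension reduction and no index here: every query is compared against every other vector, so the cost grows linearly with the number of sites.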