Context: I'm working on an E2EE alternative to Google Photos[1], where we have to cluster embeddings (for face recognition) and run similarity searches (for semantic search[2]) on device.
[1]: https://ente.io
[2]: https://openai.com/research/clip