It occurred to me that a personal diary or journal is one of the most interesting datasets for a vector embedding / chat context product. People express their hopes, dreams, fears and much more in their journals. A psychiatrist, psychologist, or therapist could draw many insights from reading one.
That is what I'm building at Jumble Journal. As a first feature, we let people chat with their past journal entries.
Technology
For vector embeddings and similarity search, we use ChromaDB. It's open source and has great performance. For something this small in scale, I didn't want to get locked into a hosted vector DB service like Pinecone.
The DB is hosted on an EC2 instance. The backend API is serverless, running on AWS Lambda behind API Gateway. We back up all embeddings to S3 in case of failure.
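To make the "chat with your journal" retrieval step concrete, here is a rough, dependency-free sketch. It is not our actual code and it does not use ChromaDB: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `top_entries`, `build_prompt`, and the sample entries are all made up for illustration. The idea is the same, though: embed the question, rank past entries by cosine similarity, and hand the best matches to the chat model as context.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_entries(question: str, entries: list[dict], k: int = 2) -> list[dict]:
    """Return the k journal entries most similar to the question."""
    q = embed(question)
    ranked = sorted(entries, key=lambda e: cosine(q, embed(e["text"])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, entries: list[dict], k: int = 2) -> str:
    """Assemble the retrieved entries into context for a chat model."""
    context = "\n".join(f"[{e['date']}] {e['text']}" for e in top_entries(question, entries, k))
    return f"Journal excerpts:\n{context}\n\nQuestion: {question}"


# Hypothetical sample data, not real user journals.
entries = [
    {"date": "2023-01-05", "text": "I am nervous about the new job and afraid I will fail."},
    {"date": "2023-02-11", "text": "Went hiking with Sam, the weather was perfect."},
    {"date": "2023-03-02", "text": "The job is going better, I feel less afraid now."},
]

print(build_prompt("Why was I afraid about my job?", entries))
```

In a real setup the vector database takes over the embedding storage and the ranking, so the app code mostly shrinks to "add entries" and "query with the question".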
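For anyone curious what the serverless side of a setup like this can look like, below is a minimal sketch of a Lambda handler for an API Gateway proxy request. It is an illustration under assumptions, not our production code: the `question` field in the request body and the `retrieve_context` helper (which would call the vector DB on the EC2 instance) are hypothetical names.

```python
import json


def retrieve_context(question: str) -> list[str]:
    """Hypothetical placeholder for the similarity search against the vector DB."""
    return []


def respond(status: int, payload: dict) -> dict:
    """Build a response in the shape API Gateway proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Lambda entry point: parse the JSON body and return retrieved excerpts."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return respond(400, {"error": "invalid JSON"})

    if not isinstance(body, dict):
        return respond(400, {"error": "expected a JSON object"})

    question = str(body.get("question", "")).strip()
    if not question:
        return respond(400, {"error": "missing 'question'"})

    return respond(200, {"question": question, "excerpts": retrieve_context(question)})
```

Keeping the handler stateless like this is what makes the split work: Lambda scales the API on demand, while the long-lived state stays on the EC2-hosted DB.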
I'm happy to discuss the technology stack and the feature itself.
Links
The web app is here: https://jumblejournal.org
The DB used is here: https://www.trychroma.com/