Thanks! Definitely, migration guides are coming. In the meantime, feel free to ping us on our community slack and we'll walk you through a migration process.
- Got a couple of Raspberry Pis and connected them to Wi-Fi.
- Installed https://k3s.io/ on them. One master node and one agent/worker.
- Started adding workloads.
- Now I have PiHole, Tailscale, Prometheus, Grafana, Mimir, Homebridge, a custom Go app that I use to create a bunch of metrics (e.g. from my solar power system; sketched below), etc.
- Also added another agent/worker over time.
So start small and easy, then go as fancy as you want depending on what you learn/play with.
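The custom metrics app is just a small Prometheus exporter. Mine happens to be written in Go, but the same idea in Python with the prometheus_client library looks roughly like this; the metric name and the solar reading are made-up placeholders:

```python
# Minimal Prometheus exporter sketch (my real app is in Go).
# read_solar_watts is a stand-in for whatever your inverter/API exposes.
import random
import time

from prometheus_client import Gauge, start_http_server

solar_watts = Gauge("solar_power_watts", "Current solar panel output in watts")

def read_solar_watts() -> float:
    return random.uniform(0, 5000)  # placeholder reading

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://<host>:8000/metrics
    while True:
        solar_watts.set(read_solar_watts())
        time.sleep(15)
```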
You can use the Wikipedia embeddings released by Cohere to build something pretty easily: https://huggingface.co/Cohere.
If you want a completely offline version, you'd run one of the open-source LLMs locally. Otherwise, put the embeddings in a vector DB, query it for the relevant context, and send that context to one of the available completion APIs (OpenAI, etc.)
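To make that concrete, here's a minimal sketch of the flow, assuming the embeddings are already loaded into a Pinecone index; the index name, model names, and key are placeholders, and the query-embedding model has to match whatever produced the stored vectors (for the Cohere Wikipedia set that would be Cohere's embed model):

```python
# Sketch: embed the question, fetch nearest chunks from the vector DB,
# then hand them to a completion API as context.
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()                    # assumes OPENAI_API_KEY is set
pc = Pinecone(api_key="YOUR_PINECONE_KEY")  # placeholder key
index = pc.Index("wikipedia-embeddings")    # placeholder index name

question = "Who designed the Sydney Opera House?"

# 1. Embed the query with the same model family used for the stored vectors.
query_vec = openai_client.embeddings.create(
    model="text-embedding-3-small", input=question
).data[0].embedding

# 2. Pull the most relevant chunks as context.
res = index.query(vector=query_vec, top_k=5, include_metadata=True)
context = "\n".join(m.metadata["text"] for m in res.matches)

# 3. Send context + question to a completion API.
answer = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```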
You can only have one index in a pod, but you can have multiple namespaces in an index, and a vector search is constrained to one namespace at a time: https://docs.pinecone.io/docs/namespaces
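Concretely, the partitioning looks like this (a sketch; the index name, namespaces, and vector dimension are made up):

```python
# Sketch: namespaces partition a single Pinecone index,
# and each query is scoped to exactly one namespace.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_PINECONE_KEY")  # placeholder
index = pc.Index("my-index")                # placeholder

# Same index, different logical partitions.
index.upsert(vectors=[("doc-1", [0.1] * 1536)], namespace="product-docs")
index.upsert(vectors=[("faq-1", [0.2] * 1536)], namespace="faqs")

# This search only sees vectors in "product-docs".
res = index.query(vector=[0.1] * 1536, top_k=3, namespace="product-docs")
```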
Depends on the complexity of the requirements. I have built a couple:
- https://crimson-glade-1527.section.app/: to talk to the www.section.io docs. I used straight-up Python and Gradio. The chatbot doesn't have memory and just uses a CSV for embeddings. It has limited functionality but does its job well. https://github.com/manibatra/sectiongpt - consider this toy code that was hacked together in a few hours.
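The CSV approach boils down to something like this; a sketch rather than the actual repo code, with placeholder file and column names:

```python
# Sketch of "CSV as a vector store": load precomputed embeddings,
# rank rows by cosine similarity to the query embedding.
import ast

import numpy as np
import pandas as pd

df = pd.read_csv("embeddings.csv")  # placeholder; columns: "text", "embedding"
vectors = np.array([ast.literal_eval(e) for e in df["embedding"]])

def top_k_chunks(query_vec: np.ndarray, k: int = 3) -> list[str]:
    # Cosine similarity between the query and every stored embedding.
    sims = vectors @ query_vec / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(query_vec)
    )
    return df["text"].iloc[np.argsort(sims)[::-1][:k]].tolist()
```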
- Also building https://www.everbility.com, using Langchain and Pinecone. I often found Langchain to be a bit of an overkill, with its many abstractions, if your use case is just sending one-off prompts. It is powerful, though, for document ingestion and Agents (which is something I plan to use it more for). When using Langchain you will often have to debug a bunch of edge cases, which is understandable given how quickly the library is evolving.
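The ingestion side is where it earns its keep. A sketch against the 2023-era langchain and pinecone-client APIs, with placeholder keys, file path, chunk sizes, and index name:

```python
# Sketch: Langchain's ingestion pipeline (load -> split -> embed -> store).
import pinecone  # 2023-era pinecone-client
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Pinecone

pinecone.init(api_key="YOUR_KEY", environment="YOUR_ENV")  # placeholders

docs = TextLoader("docs/guide.txt").load()  # placeholder path
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Embed the chunks and push them into an existing Pinecone index.
store = Pinecone.from_documents(
    chunks, OpenAIEmbeddings(), index_name="everbility-docs"  # placeholder
)
retriever = store.as_retriever(search_kwargs={"k": 4})
```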
A random hack I found that improved answer quality by a big factor: creating embeddings for entire documents and also for sections of each document. That made search over the vector space very accurate for both big-picture and granular questions.
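The hack itself is just indexing each document at two granularities. A sketch, where embed() stands in for whatever embedding model you already use and the section split is deliberately naive:

```python
# Sketch: index a document at two granularities so broad queries match the
# whole-document vector and specific queries match section vectors.
def embed(text: str) -> list[float]:
    ...  # placeholder: call your embedding model here

def index_document(index, doc_id: str, text: str) -> None:
    # One vector for the whole document (big-picture questions).
    index.upsert(vectors=[(f"{doc_id}#full", embed(text), {"text": text})])
    # One vector per section (granular questions).
    for i, section in enumerate(text.split("\n\n")):  # naive section split
        index.upsert(
            vectors=[(f"{doc_id}#sec-{i}", embed(section), {"text": section})]
        )
```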
Is that the inertial scrolling in the extension? I agree the scrolling could be smoother, and it's on my list to fix. I don't use scrolling much since I rely on fuzzy search, hence it hasn't made the top of the list. Thanks a bunch for the feedback. Appreciate it!
We are already at a stage where AI is touching hearts.