Show HN: I made an app to use local AI as daily driver (recurse.chat)
637 points by xyc 6 months ago | 237 comments
Hi Hackers,

Excited to share a macOS app I've been working on: https://recurse.chat/ for chatting with local AI. While it's amazing that you can run AI models locally quite easily these days (through llama.cpp / llamafile / ollama / llm CLI etc.), I missed a feature-complete chat interface. Tools like LMStudio are super powerful, but there's a learning curve to them. I'd like to hit a middle ground of simplicity and customizability for advanced users.

Here's what sets RecurseChat apart from similar apps:

- UX designed for you to use local AI as a daily driver. Zero config setup, supports multi-modal chat, chat with multiple models in the same session, link your own gguf file.

- Import ChatGPT history. This is probably my favorite feature. Import your hundreds of messages, search them, and even continue previous chats using local AI offline.

- Full text search. Search for hundreds of messages and see results instantly.

- Private and capable of working completely offline.

Thanks to the amazing work of @ggerganov on llama.cpp which made this possible. If there is anything that you wish to exist in an ideal local AI app, I'd love to hear about it.




> Thanks to the amazing work of @ggerganov on llama.cpp which made this possible. If there is anything that you wish to exist in an ideal local AI app, I'd love to hear about it.

The app looks great! Likewise, if you have any requests or ideas for improving llama.cpp, please don't hesitate to open an issue / discussion in the repo


Oh wow it's the goat himself, love how your work has democratized AI. Thanks so much for the encouragement. I'm mostly a UI/app engineer, total beginner when it comes to llama.cpp, would love to learn more and help along the way.


Wow I've been following your work for a while, incredible stuff! Keep up the hard work, I check llama.cpp's commits and PRs very frequently and always see something interesting in the works (the alternative quantization methods and Flash Attention have been interesting).


Did not expect to see the Georgi Gerganov here :) How is GGML going?

Поздрави! (Cheers!)


So far it's going great! Good community, having fun. Many ideas to explore :-)


Nothing to add except that your work is tremendous


> Full Text Search. Blazingly fast search over thousands of messages.

Natural language processing has come full circle and just reinvented Ctrl+F.

I had to double check that a regular '90s search function was actually the thing being advertised here, and sure enough, there is a gif demonstrating exactly that.


Ctrl+F only gets you so far. It doesn't allow you to perform semantic searches, for example. If you don't happen to know a unique word (or set of words) to search for, you're out of luck.

Just the other day, I was able to find a song by typing the phonetic pronunciation (well, as best I could) into ChatGPT, and it knew which song I was talking about right away. No way a regular search engine would've helped me there.


No. Your own data only gets you so far. And this is exactly the issue. No local model will make sense because the dataset it's given is so small compared to what you are referring to - ChatGPT.

It's useless locally.


I’ve been using 7b models to work with large text volumes (like entire books) with nothing short of phenomenal results. It has cut the time I need to accomplish many tasks by >90% and often offers insights I might easily have missed. My methodology requires a bit of time and compute to prepare a new subject matter expert system, but the results are absolutely worth it.


Would you care to share how you are technically doing this? Please


If you only need to query your own data, and can't upload that data to ChatGPT for compliance/security reasons, I think local LLMs are far from useless.


I have been downvoted here, which is fine. But no one took the time to elaborate on where I am wrong? Tell me where I am wrong, please.


Yes, but this is missing from most chatbot UIs (ChatGPT, Gemini, Claude, etc.) and is therefore very useful. Machato has it but it is very laggy.


I'm a big fan of ctrl+f, but semantic search is a life saver that conventional search simply cannot compare to.


Yeah, I think the callout here is specifically because the ChatGPT interface doesn't have a search feature (on web). Interestingly, on their iOS app, you can search.

I often find myself opening the app on my phone if I want to find a previous conversation, even if I'm at my desk.


The search feature on ChatGPT for Android only works for a tiny number of recent chats.


I'm more interested in how to perform semantic search over messages efficiently, i.e. to receive a reference back to the original message. Is it creating an LLM response with the potential content, and how does it find the original message? Is it performing a TF-IDF + cosine search after that, or how?


and yet ChatGPT doesn't support it.


I will totally pay for something like this if it answers from my local documents, bookmarks, browser history etc.


There are already several RAG chat open source solutions available. Two that immediately come to mind are:

Danswer

https://github.com/danswer-ai/danswer

Khoj

https://github.com/khoj-ai/khoj


Stupid question but what does RAG stand for?


Retrieval-augmented generation. In short, you use an LLM to classify your documents (or chunks from them) up front. Then, when you want to ask the LLM a question, you pull the most relevant ones back to feed it as additional context.


I don't get it. To my understanding, it takes huge amounts of data to build any form of RAG, simply because it enlarges the statistical model you later prompt. If the model is not big enough, how would you expect it to answer you in a qualified manner? It simply can't.

So I don't really buy it, and I have yet to see it work better than any RDBMS search index.

Tell me I am wrong; I would like to see a local model based on my own docs give me quality answers to quality prompts.


RAG doesn't require much data or involve any training, it is a fancy name for "automatically paste some relevant context into the prompt"

Basically, if you have a database of three emails and ask when Biff wanted to meet for lunch, a RAG system would select the most relevant email based on any kind of search (embeddings are most fashionable) and create a prompt like

"""Given this document: <your email>, answer the question "When does Biff want to meet for lunch?"""


That's not how RAG works. What you're describing is something closer to prompt optimization.

Sibling comment from discordance has a more accurate description of RAG. There's a longer description from Nvidia here: https://blogs.nvidia.com/blog/what-is-retrieval-augmented-ge...


Right, you read something nebulous about how "the LLM combines the retrieved words and its own response to the query into a final answer it presents to the user", and you think there is some magic going on, and then you click one link deeper and read at https://ai.meta.com/blog/retrieval-augmented-generation-stre... :

> Given the prompt “When did the first mammal appear on Earth?” for instance, RAG might surface documents for “Mammal,” “History of Earth,” and “Evolution of Mammals.” These supporting documents are then concatenated as context with the original input and fed to the [...] model

Finding the relevant context to put in the prompt is a search problem. Nearest neighbour search on embeddings is one basic way to do it, but the singular focus on "vector databases" is a bit of a hype phenomenon IMO - a real-world product should factor a lot more than just pure textual content into the relevancy score. Or is your personal AI assistant going to treat emails from yesterday as equally relevant as emails from a year ago?
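
As an illustration of factoring recency in, here's a rough Python sketch; the 30-day half-life and the 0.8/0.2 weights are arbitrary assumptions, and the similarity argument is whatever embedding (or other) score you already compute:

    import time

    def relevance(similarity, doc_timestamp, now=None, half_life_days=30.0):
        # similarity: e.g. cosine similarity between query and document embeddings.
        now = now if now is not None else time.time()
        age_days = (now - doc_timestamp) / 86400.0
        recency = 0.5 ** (age_days / half_life_days)  # weight halves every 30 days
        return 0.8 * similarity + 0.2 * recency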


Legit explanation, that's how it works AFAIK.


RAG:

1. First you create embeddings from your documents

2. Store that in a vector db

3. Ask what the user wants and do a search in the vector db (cosine similarity etc)

4. Feed the relevant search results (the retrieved chunks of the documents) to your LLM as context and generate the response as usual (see the sketch below)
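
To make those steps concrete, here's a minimal sketch in Python. The embed() and chat() callables are hypothetical stand-ins for whatever embedding model and local LLM you run, and the "vector db" is just a list, which is enough to show the idea:

    import math

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def answer(question, docs, embed, chat, k=3):
        # Steps 1-2: embed the documents and keep them in a (vector, text) "store".
        index = [(embed(d), d) for d in docs]
        # Step 3: rank documents by cosine similarity to the question.
        q = embed(question)
        top = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)[:k]
        # Step 4: paste the retrieved chunks into the prompt and generate as usual.
        context = "\n---\n".join(text for _, text in top)
        prompt = f"Given these documents:\n{context}\n\nAnswer the question: {question}"
        return chat(prompt)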


Although RAG is often implemented via vector databases to find 'relevant' content, I'm not sure that's a necessary component. I've been doing what I call RAG by finding 'relevant' content for the current prompt context via a number of different algorithms that don't use vectors.

Would you define RAG only as 'prompt optimisation that involves embeddings'?


Sure thing, your RAG approach sounds intriguing, especially since you're sidestepping vector databases. But doesn't the input context length cap affect it? (ChatGPT Plus at 32K [0] or GPT-4 via the OpenAI API at 128K [1].) Seems like those cases would be pretty rare though.

[0]: https://openai.com/chatgpt/pricing#:~:text=8K-,32K,-32K

[1]: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turb...


Yes, context window is a limiting factor, but that's true however you identify the content to augment generation.


You're misunderstanding. Imagine your query is matched against chunks of text from the database, where the relevance of the information is evaluated for the window each time it slides. The n most relevant chunks are then collected and included in the prompt, so the LLM can provide answers from the source documents verbatim. This is useful for cases where precise and exact answers are needed - for example, searching the docs of some package for the right API to call. You don't want a name that's close to right; it has to be correct to the character.
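
A rough Python sketch of that sliding-window idea, where score() stands in for whatever relevance measure you use (embedding similarity, BM25, ...); the window sizes are illustrative:

    def top_chunks(text, query, score, window=1000, overlap=200, n=3):
        # Slide a fixed-size window over the text so the exact wording is preserved.
        step = window - overlap
        chunks = [text[start:start + window]
                  for start in range(0, max(len(text) - overlap, 1), step)]
        # Keep the n chunks most relevant to the query; these go into the prompt verbatim.
        return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:n]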


Ahh ok I see. It's basically what MS CoPilot 365 does too, with its "Grounding" step.


Yes.


This. There was a post on HN last week, IIRC, referring to just such a solution called Zenfetch (?). I would have adopted it in a heartbeat, but they don't currently have a means of exporting the source data you feed to it (should you elect it as your sole means of bookmarking, etc.)


Hey there,

This is Gabe, the founder of Zenfetch. Thanks for sharing. We're putting together an export option where you can download all your saved data as a CSV and should get that out by end of week.


Seems like this would be a good tool to build lessons on - if you could share a "class" and export a link for others to then copy the class and expand on the lesson/class/topic into their own AI. But as a separate "class" and not fully integrated into my regular history blob?

I want the ability to search all my downloaded files and organize them based on context within. Have it create a category table, and allow me to "put all pics of my cat in this folder, and upload them to a gallery on imgur."


We're working on the ability to share folders of your knowledge so that others can search/chat across them.

We've been thinking of this as a "subscription" to the creator's folder. Similar to how you might subscribe to a Spotify playlist


Consider using tar files for this. There's lots of tooling (versioning, hashing, storage) around them already, and Docker layers come to mind.


Or an RSS feed?


Yes, this would be the next big focus. Personal data connectivity is where I see local AI excelling, despite differences in model power.


I have doubts about that. Most personal data actually lives in the cloud these days. If you need your Gmail emails, you'll need to use their API, which is guarded behind a $50k certification fee or so. I think there is a simpler version for personal use, but you still need to get the API key. Who's going to teach their mom about API keys? So I think for a lot of these data sources you'll end up with enterprise AIs integrating them first for a seamless experience.


Why wouldn't you be able to use IMAP over the Gmail API? IMAP returns the text and headers of all your emails, which is what you'd want the LLM to ingest anyway.
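
For what it's worth, a bare-bones sketch with Python's standard imaplib, no Gmail API involved; the address and credentials are placeholders, and for Gmail you'd authenticate with an app password (or OAuth, as discussed below):

    import email
    import imaplib

    def fetch_recent(user, app_password, n=10):
        imap = imaplib.IMAP4_SSL("imap.gmail.com")
        imap.login(user, app_password)
        imap.select("INBOX")
        _, data = imap.search(None, "ALL")
        ids = data[0].split()[-n:]  # the newest n message ids
        messages = []
        for msg_id in ids:
            _, parts = imap.fetch(msg_id, "(RFC822)")
            msg = email.message_from_bytes(parts[0][1])
            messages.append((msg["Subject"], msg["From"]))  # headers (and body) to ingest
        imap.logout()
        return messages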


Seconding a sibling question: What $50k API fee? To access your Gmail? I've been using Gmail since 2008 or so without ever touching their web/app interface or getting an API key. You just use it as an IMAP server.


To use Google's sensitive APIs in production you have to certify your product, and that costs tens of thousands. To be honest, I didn't think about IMAP at first, but it looks like that could be getting tougher soon too: https://support.google.com/a/answer/14114704?hl=en. Soon they will require OAuth for IMAP, and with OAuth you'll need the certification: https://developers.google.com/gmail/imap/xoauth2-protocol. If it's for personal use, you might be able to get by with just some warnings in the login flow, but it won't be easy to get the OAuth flow set up in the first place.


Yeah, Thunderbird integrated OAuth in the last few releases, mainly to keep up with the Gmail and Hotmail requirements. They made it very user-friendly to set up in the GUI right within T-bird. I don't see this being a major obstacle.

I'm not sure I can imagine a scenario in production where Google would, or should, allow API access to individual gmail accounts. What's that for? So you can read all your employees' mail without running your own email server?


I'm not sure what you mean.

> You will no longer use a password for access (with the exception of app passwords)

I'm not seeing anywhere that I'd need to pay money to use OAuth via an app like Thunderbird or another email client. That app would either need to support using OAuth to let the user auth and get credentials, or use an app password.


Right, but Thunderbird had to pay up and set themselves as a middleman to allow this. My point is that local LLMs might not have that many advantages for personal data, because most of that data doesn't live locally on your computer to begin with. I guess an argument could be made that running them locally prevents an AI provider from gobbling up ALL of your data. On the other hand, Google already has most of my data: emails, YouTube, Gmail, etc.


I think this is a good take. While there's a big enough niche for personal data stored locally, I'd love it if there were a way to solve for email/cloud data requiring API keys.


Ideally, though, a sufficiently smart LLM shouldn't need API access. It could navigate to your social media login page, supply your credentials, and scrape what it sees. Better yet, it should just reverse-engineer the API ;)


What?

I manage both gmail and protonmail via thunderbird - where I have better search and sort using IMAP.


Good to know there's a market for that. Currently building out something. Integrating from numerous sources, processing and then utilizing those.

nice.


Yeah, we’re getting closer to “Her”


I would even let it have longer processing times for queries to apply against each document in my system, allow it to specialize/train itself on a daily basis…

Use all the resources you want if you save me brainpower


Agree, there's a non real-time angle to this.


"give me a summary of the news around this topic each morning for my daily read"

Help me plan for upcoming meetings whereby if I put something in calendar, it will build a little dossier for the event, and include relevant info based on the type of event or meeting, mostly scheduling reminders or prompting you with updates or changes to the event etc.


“filter out baby pictures from my family text threads”



Next version of MacOS will probably have that.


As long as you use Safari for browsing, Notes for note taking, iCloud for mail …


https://news.ycombinator.com/item?id=38787892 ("Show HN: Rem: Remember Everything (open source)") ?

https://github.com/jasonjmcghee/rem


Plus one. I would love to configure a folder of markdown/txt (and eventually image and PDF) files that this can have access to. Ideally it could RAG over them in a sensible way. Would love to help support this!


Thank you! I'd love to learn more about your use cases. Would you mind sending an email to feedback@recurse.chat or DM me on https://x.com/chxy to get the conversation started?


I use paperless-ngx for that.


Thank you for the work.

Please take this in a nice way: I can't see why I would use this over ChatbotUI+Ollama https://github.com/mckaywrigley/chatbot-ui

Seems the only advantage is having it as a macOS native app, and the only real distinction is maybe fast import and search - I've yet to try that though.

ChatbotUI (and other similar stuff) are cross-platform, customizable, private, debuggable. I'm easily able to see what it's trying to do.


Not everyone is a dev


HN users keep forgetting that


Dangerously close to the old famous Dropbox comment to the original ShowHN post


Thanks for sharing ChatbotUI. While I'm not an author, I use it extensively and contribute to it. Thanks to the permissive license, I could offer ChatbotUI as a hosted solution with our API keys. https://labs.writingmate.ai.


Hey, I bought it, nice work!

A few things:

* The main thing that makes ChatGPT's UI useful to me is the ability to change any of my prompts in the conversation & it will then go back to that part of the conversation and regenerate, while removing the rest of the conversation after that point.

Such a chat ui is not usable for me without this feature.

* The feedback button does nothing for me, just changes focus to chrome.

* The LLaVA model tells me that it can not generate images since it is a text based AI model. My prompts were "Generate an image of ..."


> * The main thing that makes ChatGPT's UI useful to me is the ability to change any of my prompts in the conversation & it will then go back to that part of the conversation and regenerate, while removing the rest of the conversation after that point.

Agreed, but what I would also really like (from this and ChatGPT) would be branching: take a conversation in two different ways from some point and retain the separate and shared history.

I'm not sure what the UI should be. Threads? (like mail or Usenet)


ChatGPT does this. You just click an arrow and it will show you other branches.


I have ChatGPT 4, and I have no idea what arrow you are talking about. Could you be more specific? I see no arrow on any of my previous messages or current ones.


By George, ItsMattyG is right! After editing a question (with the "stylus"/pen icon), the revision number counter that appears (e.g. "1 / 2") has arrows next to it that allow forward and backward navigation through the new branches.

This was surprisingly undiscoverable. I wonder if it's documented. I couldn't find anything from a quick look at help.openai.com .


Careful what you trust on help.openai.com. You used to be able to share conversations; now it's login-walled when you share, and the docs don't reflect this. (If someone can recommend a frontend that has this functionality, for quick sharing of conversations with others via a link, I'm taking recommendations - thank you in advance.)


I have a very simple UI with threading. It's really unpolished though.

https://eimi.cns.wtf/

https://github.com/python273/eimi


Nice suggestion! Threading / branching won't be too crazy to support. I'll explore ChatGPT style branch or threads and see what'll work better.


1000 upvotes for you. My brain can't compute why someone hasn't made this, along with embeddings-based search that doesn't suck.


They did make it, in 2021. https://generative.ink/posts/loom-interface-to-the-multivers... (click through to the GitHub repo and check the commit history, the bulk of commits is at least 3 years old)


I bet UI and UX innovation will follow, but model quality is the most important thing.

If I were OpenAI, I would put 95% of resources into ChatGPT 5 and 5% into UX.

Once the dust settles, if humanity still exists, and human customers are still economically relevant, AI companies will shift more resources to UX.


I understand your point, but my take is that when we talk about AI and its impact, we're talking about the entire system: the model, and what is buildable with the model. To me, the gains available from doing innovative stuff w/ what we're colloquially calling "UI" exceed, by a bunch, what the next model will unlock. But perhaps the main issue is that whatever this amazing UI might provide, it's not protectable in the way the model is. So maybe that's the answer.


Thank you for the support and the valuable feedback! Sorry about the response time; I hadn't expected the incoming volume of requests.

* For changing the prompt in the middle - I'll take a crack at it this week. It's at the top of my post-launch list.

* Feedback button: Thanks for reporting this. The button was supposed to open the default email client to email feedback@recurse.chat

* LLaVA model: I'll add more documentation. You are right, LLaVA cannot generate images. It can only describe images (similar to GPT-4V). Image generation is not supported in the app, and while I don't have immediate plans for it, check out these projects for local image generation.

- https://diffusionbee.com/

- https://github.com/comfyanonymous/ComfyUI

- https://github.com/AUTOMATIC1111/stable-diffusion-webui


> The LLaVA model tells me that it can not generate images since it is a text based AI model.

Because it can't generate images, it can only describe images provided by the user.


So there are a few questions that leap out at me:

  * What are you using for image generation? Is that local as well (stable diffusion?) Does it have integrated prompt generation? 


  * You mention the ability to import ChatGPT history, are you able to import other documents?

  * How many "agent"-style capabilities does it have? Can it search the web? Use other APIs? Prompt itself? 

  * Does it have a plugin framework? you mention that it is "customizable" but that can mean almost anything. 

  * What is the license? what assurances do users have that their usage is private? I mean, we all know how many "local" apps exfiltrate a ton of data.


> What are you using for image generation?

It doesn't look like it supports image generation unfortunately. If it did then I would definitely adopt this as my daily driver.


Cool, instant buy for me. A few little suggestions:

- Make the system font (San Francisco) an option for the UI. Maybe even SF Mono as an option as well?

- A little more help about which model to use for beginners would be nice. Maybe just an intro screen telling you how to get going.

- Would be great if Command-comma opened settings, like most Mac apps.

- Would be great if clicking web links opened Safari (or my preferred browser), rather than a small window that loads nothing!


Thank you! and thanks so much for the feature suggestions:

- Make the system font (San Francisco) an option for the UI. Maybe even SF Mono as an option as well?

Reasonable request! Won't be too hard to add

- A little more help about which model to use for beginners would be nice. Maybe just an intro screen telling you how to get going.

Yes, a better onboarding wizard would definitely make this easier for beginners. I don't have much capacity right now, but I'll keep this in mind.

- Would be great if Command-comma opened settings, like most Mac apps.

Nice suggestion. Will probably get to this when I add some keyboard shortcuts like new chat / search etc.

- Would be great if clicking web links opened Safari (or my preferred browser), rather than a small window that loads nothing!

Ah, that's odd - it's supposed to open the link. Which link do you have, if you don't mind sharing? (Feel free to email support@recurse.chat)


What are the macOS and hardware requirements? How does it perform on a slightly older, lower-powered Mac? I wish I could test this to see how it would perform, and while it's only $10, I don't want to spend that just to realize it won't work on my older, underpowered Mac mini.


Good question, I'll put some system requirements on the website. It only supports Macs with Apple Silicon now, if that's helpful.


Instant buy, great work and the price point is exactly right. Good luck!


Appreciate your support. Thank you so much!


Possibly a strange question, but do you have plans to add online models to the app? Local models just aren't at the same level, but I would certainly appreciate a consistent chat interface that lets me switch between GPT/Claude/local models.


You could try out Prompta [1], which I made for this use case. Initially created to use OpenAI as a desktop app, but can use any compatible API including Ollama if you want local completions.

[1]: https://github.com/iansinnott/prompta


This one doesn't seem to support system prompts, which are absolutely essential for getting useful output from LLMs.


You can update the system prompt in the settings. Admittedly this is not mentioned in the README, but is customizable.


> the system prompt

There isn't a singular system prompt. It really does matter!

Copy the OpenAI playground, you'll thank yourself later


Fair point, and it's not implemented that way currently. It's more like "custom instructions" but thanks for pointing that out. I haven't used multiple system prompts in the OpenAI playground either, so I hadn't given it much thought.


You use multiple system prompts in a single chat? What for?


I've run into the same problem with deploying Gemini locally; it does not seem to support system prompts. I've cheated around this by auto-prepending the system prompt to the user prompt, and then deleting it from the user-displayed prompt again.
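
A tiny sketch of that workaround, assuming OpenAI-style message dicts; the display layer would keep showing only user_prompt:

    def build_messages(system_prompt, user_prompt, supports_system_role):
        if supports_system_role:
            return [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ]
        # Fallback: fold the instructions into the first user turn.
        return [{"role": "user", "content": f"{system_prompt}\n\n{user_prompt}"}]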


Can you speak more to this? I get useful output from LLMs all the time, but never use system prompts. What am I missing?


Sure, I use one system prompt template to make ChatGPT be more concise. Compare these two: https://sharegpt.com/c/fEZKMIy vs https://sharegpt.com/c/S2lyYON

I use similar ones to get ChatGPT to be more thorough or diligent as well. From my limited experience with local models, this type of system prompting is even more important than with ChatGPT 4.


Is there a difference in using a system prompt and just pasting the "system prompt" part at the beginning of your message?


Haven't tested, but having it built-in is more convenient, and convenience is why I'm using these tools in the first place (as a replacement for StackOverflow, for example).


Not strange at all! It's a very valid ask. The focus is local AI, but GPT-3.5/GPT-4 are actually included in the app (bring your own key), although customization is limited. Planning to expose some more customizability there including API base urls / model names.


Shameless plug: if you need multiple AI Service Provider, give BoltAI[0] a try. It’s native (not Electron), and supports multiple services: OpenAI, Azure OpenAI, OpenRouter, Mistral, Ollama…

It also allows you to interact with LLMs via multiple different interfaces: a Chat UI, a context-aware feature called AI Command, and an Inline mode.

[0]: https://boltai.com



...how did you highlight a specific sentence like that?


Looks like a Chromium-specific feature: https://web.dev/articles/text-fragments

Pretty cool. Doesn't work on Firefox.


It just worked on Safari on iOS. That’s pretty impressive.



I read the website for 30 seconds and instantly bought it.

It's clean, easy to use, and works really well! Easy local server hosting was cool, too. I've used the other LLM apps, and this feels like those, but simplified. It just feels good to use. I like it a lot!

I'm gonna test drive it for a while, and if I keep using it regularly, I'll definitely be sending in some feedback. Other users have made a lot of really great recommendations already, I'm excited to see how this evolves!


Thanks so much for the kind words and giving it a spin!

Feel free to send feedback, issues, feature suggestion as you use it more, I'm all ears. My twitter DM is also open: https://x.com/chxy.


Any chance to see it available on other operating systems as well?


Unfortunately not now. If you are interested in email updates: https://tally.so/r/wzDvLM


There's another one someone made for blind users like themselves and me, called VOLlama (they use a Mac, so VoiceOver + Llama). It's really good. I haven't tested many others for accessibility, but it has RAG and uses Ollama as the backend, so it works very well for me.

https://github.com/chigkim/VOLlama/


It's very nice that something like that exists. I am the author of one of the similar apps [1] someone listed in a different thread. I was hoping I could get in touch with someone like you who could give me some feedback on how to make my app more accessible for users like you. I really want it to be an "LLM for all" kind of app, but despite my best efforts and intentions, I suck at it. Any chance of getting in touch with you to get some feedback? Only if you want and have time, no pressure at all.

[1] https://msty.app


Sure, I'll probably join the discord tomorrow morning, but a few notes:

* For apps like this, using live regions to speak updates may be helpful. Either that or change the buttons, like from "download local AI" to "configuring." Maybe a live region would be best for that one, since sighted people would probably be looking near the bottom for the status bar, but anyway...

* Using live regions for chats is pretty important, because otherwise we don't know when a message is ready to read, and it makes reading those messages much simpler. The user types the message, presses Enter, and the screen reader reads the message to them. So, making a live region and then sending the finished message, or a finished part of a message, to that live region would be really helpful.

* Now on to the UI. At the top, we have "index /text-chat-sessions". I guess that should just say "chats"? Below that, we have a list, with a button saying the same thing. After that list with one item is a button that says "index /local-ai". That should probably just be "local AI". Afterwards, there is "index /settings", which should just be "settings." Then, there is an unlabeled button. I'm guessing this is styled to look like a menu bar across the top of the window, so it'd be the item on the right side. Now, there's a button below that that says "New Chat^N". I, being a technical user, am pretty sure the "^N" means "Control + N", but almost no one else knows that. So, maybe change that text label. Between that and the Recent Chats menu button are two unlabeled buttons. I'm not sure why a region landmark was used for the recent chats list, but after the chat name ("hello" in this case), where I can rename the chat, there is an unlabeled button. The button after the model chooser is unlabeled as well. After the user input in the conversation, there are three unlabeled buttons. After the response, there is a menu button with (oh, that's cool) items to transform the response into bullets, a table, etc., but that menu button was unlabeled, so I had to open it to see what's inside. After that, all other buttons, like for adding instructions to refine this message, are also unlabeled.

So, live regions for speaking chat messages and state changes like "loading" or "ready" or whatever (keep them short), and label controls, and you should be good to go.

Live regions: https://developer.mozilla.org/en-US/docs/Web/Accessibility/A...


Wow! This is already very helpful and was the kind of feedback I was looking for. Thank you!


Hi, I just use msty. Could it use an already downloaded gguf file?


Not right now, but that's something we plan to support soon. Support for Ollama-downloaded models is getting released either today or tomorrow; gguf support might go into the next release. Would love to chat with you to learn more about your use case. Mind saying hi on our Discord?


Hey. I'm sorry about your condition. I feel I'm approaching blindness eventually. This is very random, but perhaps you could share any resources I could learn from to prepare for this, so I could continue using the web when/if it happens.


I'll try. To get things started, if you have an iPhone, check out AppleVis:

https://applevis.com/

If you have Android:

https://blindandroidusers.com/

I believe Hadley is still a good resource: https://hadleyhelps.org/welcome-hadley

I hope this helps get you started.


Honest question - can it be used for programming? Or can anyone recommend a local-first development LLM which would take in a whole project (Python / Angular) and write code based on the full repo, not only the active window as with Copilot or JetBrains AI?


Check out the Continue dev plugin (available for VS Code and JetBrains). You can attach it to OpenAI or local models and it can consider files in your codebase. It has a @Codebase keyword, but so far I get better results by specifically pointing to the needed files.


Have you tried using Copilot's @workspace command in the chat?


Looks promising, but after looking at the website I'm yearning to learn more about it! How does it compare to alternatives? What's the performance like? There isn't enough to push me to stop using ChatGPT and use this instead. Offline is good, but to get users at scale there has to be a compelling reason to shift. I don't think that offline capabilities are going to be enough to get a significant number of users.

Another tip: I try out a new chat interface to LLMs almost every week, and they're free to use initially. There isn't a compelling reason for me to spend $10 from the get-go for a use case that I'm not sure about yet.


The compelling reason to shift to local/decentralized AI is that all of compute will soon be AI and that means your entire existence will go into it. The question you should ask yourself is do you want everything about you being handled by Sam Altman, Google, Microsoft, etc? Do you want all of your compute dependent on them always being up and do you want to trust their security team with your life? Do you want to still be using closed/centralized/hosted AI when truly open AI surpasses all of them in performance and capability. If you have children or family, do you want them putting their entire lives in the hands of those folks.

Decentralized AI will eventually become p2p and swarmed and then the true power of agents and collaboration will soar via AI.

Anyway, excuse the soap box, but there are zero valid reasons for supporting and paying centralized keepers of AI that rarely share, collaborate or give back to the community that made what they have possible.


> when truly open AI surpasses all of them in performance and capability.

Is this true? I've tried llama last year and it was not very helpful. GPT4 is already full of problems and I have to keep circumventing them, so using something less capable doesn't get me too excited.


Maybe this isn't for everyone, just the people who place a high value on privacy.


If your ultimate goal is privacy, then you should only be looking at open source chat UI front ends:

https://github.com/mckaywrigley/chatbot-ui

https://github.com/oobabooga/text-generation-webui

https://github.com/mudler/LocalAI

And then connecting them to offline model servers:

- Ollama

- llama.cpp

And you should avoid closed source frontends:

- Recurse

- LM Studio

And closed source models

- ChatGPT

- Gemini


Are you implying Claude is an open source model?


I don't think the list was meant to be exhaustive.


But how can I guarantee this app is private?

I'm assuming I cannot block internet access to the app because it needs to verify App Store entitlement.


I mean, ok, then how do you distinguish yourself from LM Studio (Free)


Looks great! Does it support different sized models, i.e. can I run llama 70B and 7B, and is there a way to specify which model to chat with? Are there plans to allow users to ingest their own models through this UI?


If you have a gguf file you can link it. For ingesting new models - I'm thinking about adding some CRUD UIs to it, but I'd like to keep a very small set of default models.


Thanks, it's a great project.


Love this! Just purchased. I am constantly harping on decentralized AI and love seeing power in simplicity.

Are you on Twitter, Threads, Farcast? Would like to tag you when I add you to my decentralized AI threads.


Thank you so much for the support! Simplicity is power indeed. I'm on twitter: https://x.com/chxy


Found your Twitter account in a previous post. Just tagged you.


Awesome, thanks for the tag!


What's your farcaster?


For an app like this, I would really like a spoken interface. Any possibility of adding text-to-speech and speech-to-text so that users can not only type but also talk with it?


Yes, I wish it could talk. It's behind other priorities, but I might try something experimental.


There are a lot of tools listed in this thread, but I am not seeing the thing I want which is:

- Ability to use local and OpenAI models (ideally it has defaults for common local models)

- Chat UX

- Where I can point it to my JS/TS codebase

- It indexes the whole thing including dependencies for RAG. Ideally indexing has some form of awareness of model context length.

- I can use it for codegen / debugging.

The closest I have found has been aider, but it's python and I get into general python hell every time I try and run it.

Would appreciate a suggestion.


You will sell more if instead of telling us it's for "chatting with local AI" you tell us what we can accomplish by chatting with local AI. I don't need to chat, I need to get certain tasks done. What tasks can it do? (Don't answer me, put it on your landing page and app store listing)


Wow, I did not expect at all that this would end up on the front page. Thank you for all the enthusiasm. I'll try to get to more questions later today, but if there's something I missed, my X/Twitter DM is open: https://x.com/chxy


It seems "local" is all you need :)


You can’t buy these apps from the Apple store without providing identity to Apple in the form of a phone number (required for an Apple ID) and linking it to a hardware serial number (the app store app transmits this).

It would be nice to be able to buy the app directly from you, instead of putting a surveillance company in the loop. I don’t use an Apple ID on a macintosh.

(I would like to avoid using one on an iPhone/iPad/Vision Pro/AppleTV, but it is impossible to install apps on those without one. Please do not bring this terrible circumstance to be the default on macs, too.)


It would be cool to have the option to use the OpenAI API as well in the same interface. http://jan.ai does this, so that's what I'm using at the moment.


I've found great utility with `llm` https://llm.datasette.io, a CLI to interact with LLMs. It has plugins for remote and local models.


Good to know. I've learned lots of things from Simon Willison's blog (datasette's author), so can't imagine llm being unuseful.


Not a bit of open code, while I'm 100% sure they use some that require it. If you're using AI + your data without insight into how it's used, you're a fool. 2 cents


Congrats! Plans on Windows support?


Thanks! Sorry no immediate plan. People have recommended Chat with RTX so it might be worth checking out. https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generat...


It looks amazing, OP! I'm sad I'm missing out as a Windows user.


You can try https://curiosity.ai, supports Windows and macOS


Is the haiku example a real Haiku?

I think it gives you 4, 7, and 9 syllables in the lines.

I bet you can coax it to give you a better example, if you tinker a bit.


This is awesome. I currently use Ollama with OpenWebUI but am a big fan of native apps so this is right up my alley.


It looks like an Electron app, and not a native app.

https://imgur.com/a/pz0kzJ1


Thank you!


Won't work on my Intel Macbook :-(


This is disappointing. Anything similar available for Intel Macs?


The app is great but honestly I'm impressed with the home page! Can you go into more details on how you made the home page? What did you use to make the screenshots, and are you using any tools to generate the HTML/CSS/etc?


Thanks! Honestly it's a quick hack compared to the app. The screenshots are from screen.studio. The website is built with https://astro.build


Seriously? It grinds my phone to a near halt just trying to scroll from top to bottom. Worse in Firefox but still pretty bad in chrome.


Interesting. I was using my PC to view it and it was fast and beautiful.


How big is the local model? What is the Mac spec requirement? I don't want to download it and find out it won't work on my computer. It seems like the first question everyone would ask, and it should be addressed on the website.


Appreciate the feedback! It works on Macs with Apple Silicon only. I'll put some system requirements on the website.


It uses ollama which is based on llama.cpp, and adds a model library with dozens of models in all quant sizes.


No, this doesn't use Ollama; it's just based on llama.cpp.


I want something that starts as a simple manager for my reminders, something that tells me what to do next. And then, as features are being added, grows into a full-blown personal assistant that can book flights for me.


This looks fantastic on macos. I like the project.

What does this have that is better than https://github.com/open-webui/open-webui ?


Without Apple Shortcuts support I can't pay for this. I get pretty much the same experience from GPT4All. Hoping you add support for a CLI, Shortcuts, or something along those lines.


Thanks for the suggestion! I play with Apple Shortcuts sometimes. It's a prime example of how easy end-user programming could be. Will keep this in mind.


The headline had me thinking you had a DIY self-driving car for a moment there. Didn't initially register that this was just the common metaphor. Looks like a great app.


How different is this compared to Jan.ai for example?


As I understand it, jan.ai is more focused on enterprise / platform, while where I'd see RecurseChat going is more like "obsidian.md" but as your personal AI.


Obsidian has add-ons which do much of this.


People are treating Obsidian like it's the next Emacs


Hey! This is awesome! How hard would it be to plug it into something like Raindrop.io (bookmark manager) to train on all bookmarks collected?


Haven't tried Raindrop.io, looks neat! Saw some other posts mentioning bookmarks as well. I'll keep this in mind, but will have to try it out first to find out.


Appreciate it, thank you.


Out of curiosity – how is this app built?:-)

There is a demo clip with a vertical scroll bar which does not fade out as it would do in a native mac app:)


Scroll bars don't fade out if you're using a mouse (as opposed to just a trackpad) or if you've set Mac OS Settings > Appearance > Show scroll bars to "Always".


I see! I've not used a mouse on a Mac:-o

Anyway, the UI doesn't look Mac-native. I'm interested in what it is:-)


Yeah I am curious what the app is built with. I saw someone mention it's using Electron, so that's a start.



No iPhone app? Assuming it looks to connect to a local server, or are you actually downloading the LLMs locally to the device?


Will this work on an M1 MacBook Air? Looking for an offline solution like this but wary of hardware requirements.


This looks interesting -- might implement it. I'm curious how to ensure that it is local only?


I'll give it a shot. Appreciate the effort on keeping it local.


I am very glad to see that kind of app. Well done!


Will it work fine on Macbook Air M2 16GB ?


I wonder how much space it takes.


Any censorship?

(Can't try MacOS Apps)


any plans on supporting ollama integration?


Sadly I can't try this because I'm on Windows or Linux.

Was testing apps like this if anyone is interested:

Best / Easy to use:

- https://lmstudio.ai

- https://msty.app

- https://jan.ai

More complex / Unpolished UI:

- https://gpt4all.io

- https://pinokio.computer

- https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generat...

- https://github.com/LostRuins/koboldcpp

Misc:

- https://faraday.dev (AI Characters)

No UI / Command line (not for me):

- https://ollama.com

- https://privategpt.dev

- https://serge.chat

- https://github.com/Mozilla-Ocho/llamafile

Pending to check:

- https://recurse.chat

Feel free to recommend more!


LM Studio is using a dark pattern I really hate. Don't have a GitHub logo on your webpage if your software is not source-available. It just takes you to GitHub, to some random config repos they have. This is a poor choice in my opinion.


We call that stolen valor.


Since I couldn't find it in your list, I'd like to plug my own macOS (and iOS) app: Private LLM [1]. Unlike almost every other app in the space, it isn't based on llama.cpp (we use mlc-llm) or naive RTN quantized models (we use OmniQuant). Also, the app has deep integrations with macOS and iOS (Shortcuts, Siri, macOS Services, etc).

Incidentally, it currently runs the Mixtral 8x7B Instruct [2] and Mistral [3] models faster than any other macOS app. The comparison videos are with Ollama, but it generalizes well to almost every other macOS app I've seen that uses llama.cpp for inference. :)

nb: Mixtral 8x7B Instruct requires an Apple Silicon Mac with at least 32GB of RAM.

[1]: https://privatellm.app/

[2]: https://www.youtube.com/watch?v=CdbxM3rkxtc

[3]: https://www.youtube.com/watch?v=UIKOjE9NJU4


What's the performance like in tokens/s?


You can see ms/token in a tiny font on the top of the screen, once the text generation completes in both the videos I'd linked to. Performance will vary by machine. On my 64GB M2 Mac Studio Max, I get ~47 tokens/s (21.06ms/token) with Mistral Instruct v0.2 and ~33 tokens/s (30.14ms/token) with Mixtral Instruct v0.1.
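
(For conversion: tokens/s is just the reciprocal of ms/token, so 1000 / 21.06 ≈ 47.5 tokens/s and 1000 / 30.14 ≈ 33.2 tokens/s.)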


Interesting! What's the prompt eval processing speed like compared to llama.cpp and kin?


I haven't run any specific low-level benchmarks lately. But chunked prefilling and TVM auto-tuned Metal kernels from mlc-llm seemed to make a big difference, the last time I checked. Also, compared to stock mlc-llm, I use a newer version of Metal (3.0) and have a few modifications to make models have a slightly smaller memory and disk footprint, and also slightly faster execution. Because unlike the mlc-llm folks, I only care about compatibility with Apple platforms. They support so much more than that in their upstream project.


thanks, I'll give it a crack


MacGPT is way handy because of a global keyboard shortcut which opens a Spotlight-like prompt. I would love to have a local equivalent.


I am the author of the Msty app mentioned here. So humbled to see an app that is just about a month old, and that I mostly wrote for my wife and some friends to begin with (who got overwhelmed with everything that was going on in the LLM world), at the top of your list. Thank you!


Looks interesting, but can't see what it is doing. Any link to the source code?


One bit of feedback: there's nowhere to put system messages. These can be much more influential than user prompts when it comes to shaping the tone and style of the response.


That's at the top of our list. It got pushed back because we want to support creating a character/profile (basically select a model and apply some defaults, including a system prompt). But I feel like it was a mistake to wait for that. Regardless, it is getting added in the next release (the one after something that is dropping in a day or 2, which is a big release in itself).


1) What are the mac system requirements? Does it need a specific OS version?

2) If you're privacy-first, many would feel a lot more comfortable if this was released as an app in the App Store so it would be sandboxed. This is important because it's not open source, so we have no idea what is happening in the background. Alternatively, open source it, which many here have requested.


If you need help for testing the Linux version let me know, I’d be happy to help


I was actually looking for one! What's the best way to reach you? Mind jumping on our Discord so that I can share the installer with you soon?



Add Open-WebUI (used to be Ollama-WebUI)

https://github.com/open-webui/open-webui

a well featured UI with very active team


Try this one: https://uneven-macaw-bef2.hiku.app/app/

It loads the LLM in the browser, using webgpu, so it works offline after the first load, it's also PWA you can install. It should work on chrome > 113 on desktop and chrome > 121 on mobile.


Oh thanks! I didn't know there were quite a few local ChatGPT alternatives. I was wondering what users they are targeting - engineers or average users? I guess average users will likely choose ChatGPT and Perplexity over local apps for more recent knowledge of the world.


Hi. I'm the author of the Msty app, 2nd on the list above. You are right about average users likely choosing ChatGPT over local models. My wife was the first and the biggest user of my app - a software engineer by profession and training, but she prefers not to worry about the LLM world and just to use it as a tool that makes you more productive. As soon as she took Msty for a ride, I realized that some users, despite their background, care about online models. This actually led me to add support for online models right away. However, she really likes to make use of the parallel chat feature and uses both Mistral and ChatGPT models to give the same prompt, then compares the output and chooses the best answer (or sometimes makes a hybrid choice). She says that being able to compare multiple outputs like that is tremendously helpful. But that's the extent of local LLMs for her. So far my effort has been to target a bit higher than the average user while making it approachable for more advanced users as well.


Looks great, though the fact that you have to ignore your anti-virus warning during installation, and the fact that it phones home (to insights.msty.app) directly after launch despite the line in the FAQ about not collecting any data, make me a little skittish.


I’m looking for a ChatGPT client alternative, i.e. I can use my own OpenAI API key in some other client.

Offline isn’t important for me, only that $20 is a lot of money, when I’d wager most months my usage is a lot less. However, I’d still want access to completion, DALL-E, etc.

Would Msty be a good option for me?


Give it a try and see how you feel. "Yes, it will" would be a dishonest answer, to be completely honest, at least at this point. The app has been out for just about a month and I am still working on it. I would love for a user like you to give it a try and give me some feedback (please). I am very active on our Discord if you want to get in touch (just mention your HN username and I will wave).


Thank you so much, I’m excited to give this a try in the next few days.


Thanks for the list. Tried Jan just now as it is both easy and open source. It is a bit buggy I think, but the concept is ace: the quick install, telling you which models work on your machine, one-click download, and then a ChatGPT-style interface. Mistral 7B running on my low-spec laptop at 6 tokens/s and making some damn sense is amazing. The bugs are at inference time. Could be hardware issues though, not sure. YMMV


Do any of these let you dump in a bunch of your own documents to use as a corpus and then query and summarize them?


Author of Msty here. Not yet, but I am already working on the design for it to be added in the very near future. I am happy to chat more with you to understand your needs and what you are looking for in such apps. Please hop on the Discord if you don't mind :)


Some of my use cases would be summarizing a PDF report, analyzing JSON/CSV data, uploading a dev project to write a function or feature or build a UI, renaming image files, categorizing images, etc.



Yes, GPT4All has RAG-like features. Basically you configure some directories and then have it load docs from whatever folders you have enabled for the model you're currently using. I haven't used it a ton, but I have used it to review long documents and it's worked well depending on the model.


Open-WebUI has support for doing that, it works using #tags for each document so you can ask questions about multiple specific documents.


The new one straight from Nvidia does I believe.


Khoj was one of the first 'low-touch' solutions out there I think. It's ok, but still under active development, like all of them really.

https://khoj.dev/


What about https://github.com/open-webui/open-webui ?

Seems to have more features than all of them


We just added local LLM support to our curiosity.ai app too - if anyone wants to try we're looking for feedback there!


Just FYI, llamafile includes a web-based chat UI. It fires up automatically.


have you seen llamafile[0]?

[0] https://github.com/Mozilla-Ocho/llamafile



