
I've been looking around at open source offerings that can run on "affordable" hardware. Right now you can run something like GPT-J on 4 RTX 3090s, which will cost you anywhere between $6k and $8k, not including the rest of the PC.
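For reference, the stock way to load GPT-J locally is through Hugging Face transformers. A minimal sketch (assuming you have enough combined GPU memory and the accelerate library installed for device_map; the prompt and sampling settings are just for illustration):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
    model = AutoModelForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B",
        torch_dtype=torch.float16,  # fp16: roughly 12 GB of weights instead of ~24 GB in fp32
        device_map="auto",          # needs `accelerate`; spreads layers across available GPUs
    )

    prompt = "Open source language models are"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

Even in fp16 the 6B-parameter weights alone are around 12 GB, which is why people end up splitting the model across multiple consumer cards.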

Once functional models get small enough, I think we'll see an absolute explosion in AI as anyone with a laptop will be able to freely tinker with their models.




You can run GPT-J on an M1 or a Pi if you want; you'll just have to settle for a pruned model. No self-hosted option (besides maybe Facebook's leaked LLaMA weights) can stand up to ChatGPT's consistency or size. I've also been playing with this and have had great results for non-interactive use on even the smallest models (125M params, ~2 GB RAM). The problem as I see it won't be inference time/acceleration so much as having enough memory to load the model, and settling for lower-quality answers. ChatGPT is already pretty delusional, and pruning a model doesn't make it any smarter. You can practically feel the missing connections in the quality of the responses.
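To make the "smallest model" case concrete, here's a rough sketch of non-interactive CPU generation. I'm using the off-the-shelf GPT-Neo 125M as a stand-in for a tiny/pruned model; the model choice, prompt, and settings are purely illustrative:

    from transformers import pipeline

    # ~125M params, loads in roughly 1-2 GB of RAM; device=-1 means CPU only
    generator = pipeline(
        "text-generation",
        model="EleutherAI/gpt-neo-125M",
        device=-1,
    )

    result = generator(
        "Summarize in one sentence: the mitochondria is the powerhouse of the cell.",
        max_new_tokens=40,
        do_sample=False,   # deterministic output suits batch / non-interactive jobs
    )
    print(result[0]["generated_text"])

On a laptop this runs fast enough for batch jobs; the ceiling is answer quality, not speed.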

So, big takeaway: AI text generation is legible and quick with smaller models, but "intelligence" à la ChatGPT seems to scale directly with memory.


Could we use two models at different levels of abstraction? One for "ideas" and one for compressing those ideas into words? I've been thinking that smaller, specialized networks might boost space efficiency.

I've no clue how these would be wired together, but I do have some ideas.
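Purely as a sketch of one possible wiring (the model choices and prompt formats here are made up for illustration, not an established architecture): chain two generation pipelines, a small one that drafts a terse outline and a second one that expands it into prose.

    from transformers import pipeline

    # Stage 1 model: small, drafts a terse outline ("ideas")
    idea_model = pipeline("text-generation", model="EleutherAI/gpt-neo-125M", device=-1)
    # Stage 2 model: a bit larger, turns the outline into prose ("words")
    wording_model = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B", device=-1)

    topic = "why small language models are useful"

    # Stage 1: compress the task into a short bullet outline
    outline = idea_model(
        f"Write three short bullet points about {topic}:\n-",
        max_new_tokens=40, do_sample=True, temperature=0.7,
    )[0]["generated_text"]

    # Stage 2: expand the outline into readable prose
    prose = wording_model(
        f"Turn these notes into a paragraph:\n{outline}\n\nParagraph:",
        max_new_tokens=80, do_sample=True, temperature=0.7,
    )[0]["generated_text"]

    print(prose)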


Have you seen the latest Llama and Alpaca models?





