Dalai: Automatically install, run, and play with LLaMA on your computer (cocktailpeanut.github.io)
848 points by cocktailpeanut on March 12, 2023 | 282 comments



Hey guys, I was so inspired by the llama.cpp project that I spent all day today building a weekend side project.

Basically it lets you one-click install LLaMA on your machine with no bullshit. All you need to do is run "npx dalai llama".

I see that the #1 post today is a whole long blog post walking through how to compile the cpp code, download files, and all that to finally run LLaMA on your machine, but I have basically 100% automated this with a simple NPM package/application.

On top of that, the whole thing is a single NPM package and was built with hackability in mind. With just a one-line JS function call you can use LLaMA from YOUR app.

Lastly, EVEN IF you don't use JavaScript, Dalai exposes a socket.io API, so you can use whatever language you want to interact with Dalai programmatically.
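
For example, a socket.io client could look something along these lines (the port and the 'request'/'result' event names below are illustrative and may differ from the actual API, so check the README for the real ones):

  // Minimal sketch: assumes a Dalai socket.io server running on localhost:3000
  // and 'request' / 'result' event names (assumptions -- see the README).
  const { io } = require("socket.io-client")
  const socket = io("ws://localhost:3000")

  socket.emit("request", { prompt: "The universe is", n_predict: 128 })
  socket.on("result", (data) => process.stdout.write(String(data)))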

I discussed a bit more about this on a Twitter thread. Check it out: https://twitter.com/cocktailpeanut/status/163504032247148953...

It should "just work". Have fun!


UPDATE:

Thanks for all the feedback! I went outside to take a walk after posting this and just came back, and went through them to summarize what needs to be improved.

Basically looks like it comes down to the following:

  - *customize features:* Should not be difficult (will add flag features)
    - *path:* customize the home directory (instead of automatically storing to $HOME)
    - *python:* some people are having issues with the python binary (since the package is essentially calling these shell commands). Maybe add a flag to specify the exact name of the python binary (such as "--python python3")
    - *avoid downloading files:* I have this issue too when I just want to install the code instead of downloading the full model which takes a long time. Might add a flag to avoid downloading models in case you already have them (EDIT: actually upon thinking about it, it's better to just set the source model folder, something like --model)
    - *other flags:* The rest of the flags natively supported by the llama.cpp project, such as top_k, top_p, temp, batch_size, threads, seed, n_predict, etc. (They are already in the code but just weren't exposed via the CLI or documented)
    
  - *documentation*
    - document the machine spec
    - document the storage spec: how much space is used?
    - node version: which version of node.js is required?
    - python version: which version of python doesn't work?
Am I missing anything? Feel free to leave comments, will try to roll out some updates as soon as I can. To stay updated, feel free to follow me on twitter https://twitter.com/cocktailpeanut (or you could create issues on GitHub too!)


I tried to run your NPX commands from the examples on a fresh WSL install of Ubuntu 20.04, but if you don't have build tools installed, they both just silently fail.

I only realized what was happening after trying to go the other route and use it in a package, where I then noticed the NPM install will give a node-gyp error about make missing.


I'm on NixOS, where you have to explicitly state dependencies (which is a good thing, except when... this happens)

Besides make (which I can quickly make available via a project environment), what other deps do you think it uses but doesn't declare or state? ;)


The other one I noticed is pip! A lot of the script fails without pip, and it takes until after the fairly long downloads finish to let you know it was needed.


so it needs make/gcc, python AND node available... what versions, I wonder?


I successfully used the latest version of node LTS (via NVM) and the latest versions of python3-pip and build-essential from the Canonical apt repo, if that helps.


I don’t understand why it’s downloading at all, that shouldn’t be default behavior.

It should have default instructions to load a file from a default place, and then arguments/flags to load from a specific path, and then MAYBE a prompt to download the models after it can’t find them on the paths, plural


UPDATE 2:

Thanks to all the pull requests, we've managed to solve most of these issues in an optimal manner.

Version 0.1.0 released: https://news.ycombinator.com/item?id=35143171


I followed the initial instructions and the 7B model worked just fine.

I tried the supplementary instructions to download some of the models (7B, 13B, and 30B), and it didn't seem to work. The prompt returned nothing after waiting for several minutes.

Is there a way to run just one of the larger models?


I am going to test this out today and roll this out as soon as I can, hopefully tomorrow. stay tuned.


What's the minimum spec GPU required? NVIDIA only? Any differences between Debian and Fedora Linuxes? RAM required?


This app is CPU only and gets good speeds on even mobile phone CPUs. Minimum RAM required is 5GB.


Oh wow, any way to do this on Android yet? That would be fun to tinker with, even if it's just the smaller model. Even my older Note 9 has 6GB.


Yes. Starting with the Facebook versions of LLaMA-7B you just quantize the model to 4bit on your desktop (since it takes 14GB of RAM) and then move it to your phone and follow the Android instructions in the repo. https://github.com/ggerganov/llama.cpp/#android

I've seen dozens of screenshots of it running in termux on androids by now at completely usable speeds.


Thank you for the link! Insane that this can run on a phone.

As my current potato computer has 8GB of RAM, I'll ask a friend to do it :-)


What distro and PC specs do you have success with?


I ran this on my intel i7-7700k with 32 gig ram. It ran very slow. Almost 1 word per second slow. Not sure if I did something wrong. Distro Ubuntu 22.04


It would be great to also understand how one can finetune this model. Thanks for the awesome work!


you may be able to use pyenv to increase compatibility across Linux distributions


My biggest concern about these LLMs was the corporate sequestration and the potential socioeconomic imbalances it would create. The work you are doing here is part of some amazing work to check that back. In summary: Bruhhhhhh. THANK YOU!


This is something to keep an eye on, really. The solution for making that sequestration impossible is twofold:

1. to know how to architect and create LLMs (including training data readiness)

2. have them produced in hardware that is acquirable at reasonable cost for a normal citizen


Wow that's so incredible. Thanks for putting this together!

Do you have any machine specs associated with this? Can an old-ish Macbook Pro run this service?

I'm also curious, since I'm new to all this — is it possible to run something like this on Fly.io or does it take up way too much space?


7B is the default. If it's quantized to 4 bits, that's a 3.9 GB file.


How powerful of a computer does this need? It would be useful to see, for one thing, minimum RAM requirements for these models.


llama.cpp needs 40GB for the 65B model (due to int4 quantization)

RamNeeded(other_size) ~= 40GB * other_size/65B
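
Plugging the other sizes into that rule of thumb (rough numbers only, since overhead isn't perfectly proportional):

  RamNeeded(7B)  ~= 40GB * 7/65  ~= 4.3GB
  RamNeeded(13B) ~= 40GB * 13/65 ~= 8GB
  RamNeeded(30B) ~= 40GB * 30/65 ~= 18.5GB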


Add something like this to your instructions: "Make sure you have Node.js installed on your computer."


One step install after the steps that lead up to it.


Yea not a nodejs/javascript dev at all but this is failing to install on Fedora. I don't have time to dig into it at the moment but if anybody has any well known gotchas that could be the issue that would be helpful :)

Edit: I do have nodejs and npx installed


Maybe make, python and pip. From what I gather this is a node wrapper; it's actually python that runs the model.


Does anyone know how to avoid downloading the model weights when doing `npx dalai llama`, and instead telling the install process where they are on my drive?


you could clone the repo and comment out https://github.com/cocktailpeanut/dalai/blob/main/index.js#L... i.e. the specific synchronous download call..?


Does this use the GPU? If not why? Aren't GPUs much faster than CPUs at AI?


It is usable without a GPU... it'll output data a bit faster than most people type.


I think that's exactly the point: so everyone can run it on their PCs with no GPU.


Or without a beefy GPU. I've got 8GB VRAM, which is great for Stable Diffusion but not useful for any of the language models released so far.

I think the 4-bit 7B LLaMA would work, but the 7B is pretty fast anyway without GPU.


I'm installing it here too. How's the 7B model working out for you so far?


Haha, I just finished ordering 32GB of additional memory for my PC so I can run the 65B model, if that tells you anything. I'm upgrading from 32GB -> 64GB.

7B is fine, 13B is better. Both are fun toys and almost make sense most of the time, but even with a lot of parameter tuning they're often incoherent. You can tell that they have encoded fewer relationships between concepts than the higher-parameter models we've gotten used to--it's much closer to GPT-2 than GPT-3.

They're good enough to whet my appetite and give me a lot of ideas of what I want to do, they're just not quite good enough to make those applications reliably useful. Based on the reports I'm hearing here of just how much better the 65B model is than the 7B, I decided it was worth $80 for a few new sticks of RAM to be able to use the full model. Still way cheaper than buying a graphics card capable of handling it.


Heh, you just made me upgrade as well. After originally paying 130 € for 32 GB, it’s nice that I only had to pay 70 € to double it ;) Not sure if I want to run LLMs (or if my Ryzen 5 3600 is even powerful enough), but I’ve wanted some more RAM for a while.


If I was running in a server context, would the 50gb of ram be required to respond to one request, or can it be used to respond to multiple requests simultaneously?


I'm very late to this question, but I believe that that amount is only required once, but the context tensor will need to be created per request. I haven't confirmed that, though.


I'd assume that all the calculations used for 1 request would already eat up that amount of memory, but I could be wrong!


I'm still holding on to a small bit of hope that the GPU market will normalise this year. I don't think I'm the only one looking to get something highly capable but for a fair price.


> I’m still holding on to a small bit of hope that the GPU market will normalize this year.

I suspect all the people hoping it will (b/c of Stable Diffusion, etc.) are exactly the reason it won’t.


Me too. But for third-world countries it's madly overpriced.


It's expensive for first-world countries too. Just look at the 4090 - it's insane that it costs 2k EUR... it's literally double the fair price (which itself is high).


Very nice. Any way to add an option to install elsewhere other than ~/ ?


I ran "npx dalai llama" and it's just... sitting there (after I hit "y" to confirm). I checked btop++ and there's barely any downloading or CPU activity occurring, so not sure what it's doing... but does "pip3 install torch torchvision torchaudio sentencepiece numpy" take a while?

If it's actually downloading the 3.9GB of model weights or whatever, it would be pretty cool if it showed a progress bar of some sort. Stretch goal, for sure, but a very nice touch for users.

anyway, I'll leave it be and check on it to see when it's complete. Super cool if this works!!


Made a comment on the other thread: why can’t we have a one click install thing and here it is. Nice!


Well that's pretty wild. I was wondering whether I wanted to build LLaMA tomorrow but you upended my plans in the space of 2 minutes. 10/10 well done.


There's an elephant in the room, or is it just me?

Is your script making users violate the original license agreement(§)?

For the record, I don't think Meta will go after you or anyone else. But they may decide not to make their future models available after what is happening with the LLaMA weights.

I realize that some people are of the opinion that AI models (weights) cannot be copyrighted at all.

--

§ the license agreement is at https://forms.gle/jk851eBVbX1m5TAv5


Yes, you are right, every project that distributes LLaMA right now is violating Meta's agreement.


I've got a weird, probably untrue conspiracy theory about this.

Hugging Face releases Stable Diffusion. It goes viral and vastly outpaces the competition in the blink of an eye. Then they get sued.

Meta sees both of these things go down. Meta needs a leg up on ChatGPT, but worries about legal repercussions similar to Stable Diffusion's.

Whoops, it leaked! Hey, we didn't say those dastardly devs could use it.


>But they may decide not to make their future models available after what is happening with the Llama weights.

I think that ship has probably sailed, in that no one is going to release weights in this way again. Either they will publish them outright (like Whisper) or they will keep them (almost) completely closed.


This is awesome! I've wanted to try llama.cpp and you just reduced my to-do list significantly on my Sunday :) Thanks!


Looks great! Does it work on Windows please?


For Windows:

1. Binary build https://github.com/jaykrell/llama.cpp/releases/tag/1

2. Quantized model (7B/13B/30B) https://mega.nz/folder/UjAUES6Z#bGhKkyiZX3eRrn9HcxVVfA

3. main.exe -m ggml-model-q4_0.bin -t 8 -n 128


Thanks. Initial test:

main.exe -m ggml-model-q4_0.bin -t 8 -n 128 -p "The Drake equation is nonsense because"

The Drake equation is nonsense because it takes parameters that can only be known AFTER the conclusion is reached. It would be like saying "I'm going to prove a theorem by starting from the conclusion, then making up the proof. The Drake equation uses the existence of extraterrestrial intelligence as the conclusion and then making up the parameters. It is nonsense.


Nice, main.exe seems to work just fine with the 7B quantized model - generates a token every 400ms on an AMD Ryzen 5 2600!

But, quantize.exe doesn't seem to work - any valid command (such as below) pauses for a split second, then returns with no output?

$ quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2


In case this helps anyone else: I built it myself on Windows with CMake, and then everything just works.


Do you mind sharing the binaries?


Sure! https://filetransfer.io/data-package/8hxKAiaH#link

I wasn't sure where to upload them, and that link is only good for 50 downloads. Can put them somewhere else if you know a better location that doesn't require signup.


Thank you.

llama.exe is basically main.exe?

I actually learned how to compile this code via CMake/VS2019. It's sure a whole lot more complicated than it was 25 years ago when I was writing C.


Yes, llama.exe is actually the name the project produces - the other poster must have renamed it to main.exe.

I just did `scoop install cmake`, then built from the command line, was a doddle!


I actually am installing in windows via WSL/Ubuntu fwiw


My attempt does not work, and now I'm trying to figure out where the 35+ GB of data and files that were added to my hard drive are located so I can clean it all off.


I got it to work with WSL/Ubuntu in case you want to try it that way.


If it makes common unix-ish assumptions like “Python 3 executables have a ‘3’ appended to their name”, which other comments here seem to suggest it does, it won't work on Windows, even if you have the required version of Python installed.


So, I actually got it working on Windows, pretty easily!

The provided `main.exe` binary worked as-is, but `quantize.exe` did not - I built it myself with CMake, and `quantize.exe` started working too.


Curious too. Let me know if you try it out. Technically I think it should work.


I tried it, doesn't work. Trying the sibling post from @buzzier.


You, sir or madam, are a hero.


When I run this command: npx dalai llama

I get the following output / errors?

What exactly do I need to install prior to running that command?

---------------------------- >> npx dalai llama

exec: git clone https://github.com/ggerganov/llama.cpp.git /Users/rickg/llama.cpp in undefined
fatal: destination path '/Users/rickg/llama.cpp' already exists and is not an empty directory.

exec: git pull in /Users/rickg/llama.cpp
Already up to date.

exec: python3 -m venv /Users/rickg/llama.cpp/venv in undefined

exec: /Users/rickg/llama.cpp/venv/bin/pip install torch torchvision torchaudio sentencepiece numpy in undefined
Requirement already satisfied: torch in ./llama.cpp/venv/lib/python3.10/site-packages (1.13.1)
Requirement already satisfied: torchvision in ./llama.cpp/venv/lib/python3.10/site-packages (0.14.1)
Requirement already satisfied: torchaudio in ./llama.cpp/venv/lib/python3.10/site-packages (0.13.1)
Requirement already satisfied: sentencepiece in ./llama.cpp/venv/lib/python3.10/site-packages (0.1.97)
Requirement already satisfied: numpy in ./llama.cpp/venv/lib/python3.10/site-packages (1.24.2)
(remaining transitive requirements already satisfied)
[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: python3 -m pip install --upgrade pip

exec: make in /Users/rickg/llama.cpp
I llama.cpp build info:
I UNAME_S: Darwin
I UNAME_P: arm
I UNAME_M: arm64
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS: -framework Accelerate
I CC: Apple clang version 12.0.5 (clang-1205.0.22.9)
I CXX: Apple clang version 12.0.5 (clang-1205.0.22.9)

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE -c ggml.c -o ggml.o
ggml.c:1364:25: error: implicit declaration of function 'vdotq_s32' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
ggml.c:1364:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
ggml.c:1365:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
ggml.c:1367:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
ggml.c:1368:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
5 errors generated.
make: *** [ggml.o] Error 1

/Users/rickg/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153
    throw new Error("running 'make' failed")
    ^

Error: running 'make' failed
    at Dalai.install (/Users/rickg/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153:13)


seeing this too. did you find a solution?


updating xcode did the trick


Where does it say I need Xcode installed?

Is there a list of prerequisites?

Hey thanks, after installing Xcode, that did resolve the issue.


It seems the only reason all of these competitive models are getting released is because you have a number of big players probably freaking out that somebody else is going to break out into a huge lead. So while the floodgates are open, people should be quickly figuring out how to do as much stuff as possible without any centralized company controlling it. I would imagine everybody assumed the models released these days will be obsolete before long, so it's low risk. But this is like the early internet days... except this time we should assume all of the centralized servers are user hostile and figure out how to work around them as quickly as they roll them out. The author and others are doing great work to prevent this stuff from being locked away behind costly APIs and censorship.


If the barrier for entry is low enough for several players to enter the field this fast - I wonder what could raise the barrier? The models getting bigger I suppose.


A few months (weeks?) ago I would've said that this already was the case for language models. It's absolutely mind-blowing to me what is happening here - same with stable diffusion. Once Dall-E was out, I was sure that there was no way that anything like this could be run on consumer hardware. I'm very happy to be proven wrong.

In a way, things are still moving in this direction, though. 8 or so years ago it was more or less possible to train those models yourself to a certain degree of usefulness, as well, and I think we've currently moved way past any feasibility for that.


LLaMA can be fine tuned in hours on a consumer GPU or in a free Colab with just 12GB of VRAM, and soon 6GB in 4bit training, using PEFT.

https://github.com/zphang/minimal-llama#peft-fine-tuning-wit...


Fortunately, there still are some possibilities to improve training efficiency and reduce model size by doing more guided attentional learning.

This will make it feasible to train models at least as good as the current batch (though probably the big players will use those same optimizations to create much better large models).


Soon you'll need a government license to purchase serious compute.


Our saving grace seems to be the insatiable push by the gaming industry for better graphics at higher resolutions. Their vision for real-time path traced graphics can’t happen without considerable ML horsepower on consumer level graphics cards.


They can just slow down certain algorithms on gaming cards via firmware. I think they already did this for crypto mining on some gaming cards.


FW locks aren’t effective. Most of those locked cards have jailbreaks to allow full speed crypto mining.


Yesterday's "serious compute" is today's mid-range PC.


The Vice Chairman of Microsoft has already said that he is open to regulation. The EU also is working on plans to regulate AI. So you probably will only be allowed to use AI in the future if it's approved by something like the FD(A)A.


Maybe I'm having a looped view of this, but I fail to see how regulation wouldn't do more harm than good here. The truly dangerous actors wouldn't care or would be based in some other country. Having a large diversity of actors seems like the best way to ensure resilience against whatever threats might arise from this.


What about the models that are out already? Will men with guns raid my home and confiscate my computer?


As an AI doomer, it would actually be pretty great if we could get this stuff locked away behind costly APIs and censorship. Some fat monopoly rent-extracting too. We are moving way too fast on this tech, and the competitive race dynamics are a big reason why. I want LLMs to end up with Microsoft IE6 level of progress. Preferably we could make Firefox (SD/GPT-J) illegal too. (The GPU scarcity is a good start, but maybe China could attack Taiwan as well and thus torpedo everybody's chipbuilding for a decade or so?)

If LLMs keep going at their current pace and spread, the world is seriously going to end in a few years. This technology is already unsafe, and it's unsafe in exactly the ways that it'll be seriously unsafe as it scales up further - it doesn't understand and doesn't execute human ethics, and nobody has any working plan how to change that.


To me it's like the American gun ownership situation: if you make guns illegal now, criminals and governments will still keep them, but your average joe won't get them. A very unequal playing field.

LLMs will be used against us: let's at least have our own, and learn how to defend against them?

I say this as devil's advocate, with serious reservations about where all of this is going.


It'll eventually get broad use, sure, but this is more about playing for time. The issue is the very uneven progress between capabilities and safety.

I don't want only government to use them because I trust the government but because I know the government to be sclerotic and uninnovative.

If only governments use them, they'll progress a lot more slowly.


> As an AI doomer, it would actually be pretty great if we could get this stuff locked away behind costly APIs and censorship.

Yes, because the only people with access to advanced AI tech being the people whose motive is using and training it for domination over others (whether megacorps or megagovernments) is absolutely a great way to prevent any “AI doom” scenarios.


If one party could use LLMs to reliably dominate others, the alignment problem would be basically solved. Right now, one of the biggest corporations of the planet cannot get LLMs to reliably avoid telling people to commit suicide despite months (years?) of actively trying.


>but maybe China could attack Taiwan as well

Speaking of things that would be terrible for the world...


> If LLMs keep going at their current pace and spread, the world is seriously going to end in a few years

Why?


I thought this article by the NY Time's Ezra Klein was pretty good:

https://www.nytimes.com/2023/03/12/opinion/chatbots-artifici...

> “The broader intellectual world seems to wildly overestimate how long it will take A.I. systems to go from ‘large impact on the world’ to ‘unrecognizably transformed world,’” Paul Christiano, a key member of OpenAI who left to found the Alignment Research Center, wrote last year. “This is more likely to be years than decades, and there’s a real chance that it’s months.”

...

> In a 2022 survey, A.I. experts were asked, “What probability do you put on human inability to control future advanced A.I. systems causing human extinction or similarly permanent and severe disempowerment of the human species?” The median reply was 10 percent.

> I find that hard to fathom, even though I have spoken to many who put that probability even higher. Would you work on a technology you thought had a 10 percent chance of wiping out humanity?


It's kinda irrelevant on a geologic or evolutionary time scale how long it takes for AI to mature. How long did it take for us to go from Homo Erectus to Homo Sapiens? A few million years and change? If it takes 100 years, that's still ridiculously, ludicrously fast for something that can change the nature of intelligent life (or, if you're a skeptic of AGI, still such a massive augmentation of human intelligence).


I strongly recommend the book Normal Accidents. It was written in the '80s and the central argument is that some systems are so complex that even the people using them don't really understand what's happening, and serious accidents are inevitable. I wish the author were still around to opine on LLMs.


We currently live in a world that has been “unrecognizably transformed” by the industrial revolution and yet here we are.


And the result of the industrial revolution has been a reduction of about 85% of all wild animals, and threatened calamity for the rest in the next few decades. That can hardly be summarized as "yet here we are."


Given a choice between pre-industrial life and our current lifestyle, the choice is obvious.


> “This is more likely to be years than decades, and there’s a real chance that it’s months.”

Months is definitely wrong, but years is possible.


Months starts looking more plausible when considering that we have no idea what experiments DM/OA have running internally. I think it's unlikely, but not off the table.


I agree what they have internally might be transformative, but my point is that society literally cannot transform over the course of months. It's literally impossible.

Even if they release AGI, people will not have confidence in that claim for at least a year, and only then will rate of adoption will rapidly increase to transformative levels. Pretty much nobody is going to be fired in that first year, so a true transformation of society is still going to take years, at least.


I mean, if you believe that AGI=ASI (ie. short timelines/hard takeoff/foom), the transformation will happen regardless of the social system's ability to catch up.


It's not a matter of any social system, it's a matter of hard physical limits. There is literally no hard takeoff scenario where any AI, no matter how intelligent, will be able to transform the world in any appreciable way in a matter of months.


i would take a world transformed by ai over a world with nuclear weapons.


Yeah, but what you will actually get is the world transformed by AI with use of nuclear weapons (or whatever method AGI employs to get rid of absolutely unnecessary legacy parasitic substance that raised it aka humanity).


If you read the words right after the part you quoted, you have your answer


Well, from my perspective, making claims about the world ending requires some substantial backing, which I didn't find in OP's comment.

But now I understand that perhaps this is self-evident and/or due to a lack of reading comprehension on my part, thank you. I hope that when our new AI overlords come they appreciate people capable of self-reflection.


You could assume that the commenter didn't read the whole line, or you could try to understand that what they are asking is why you think that the lack of ethics enforcement in a text-generating model means that the world is ending.


Personally, my take is that the lack of ethics enforcement demonstrates that whatever methods of controlling or guiding a LLM we have break down even at the current level. OA have been grinding on adversarial examples for like half a year at this point and there's still jailbreak prompts coming out. Whatever they thought they had for safety, it clearly doesn't work, so why would we expect it to work better as AIs get smarter and more reflective?

I don't think the prompt moralizing that companies are trying to do right now is in any sense critical to safety. However, the fact that these companies, no matter what they try, cannot avoid painfully embarrassing themselves, speaks against the success of attempts to scale these methods to bigger models, if they can't even control what they have right now.

LLMs right now have a significant "power overhang" vs control, and focusing on bigger, better models will only exacerbate it. That's the safety issue.


Could’ve said the same for any major technological advance. Luddism is not a solution. If these models are easily run on a laptop then yes some people are going to hurt themselves or others but we already have laws that deal with people doing bad things. The world is not going to end though. Your Taiwan scenario has a much higher probability of ending the world than this yet you seem unconcerned about that.


Big Tech on its own will already push this technology very far and they don't give a damn about safety, only the optics of it.

I'm not convinced that small actors will do much damage even if they access to capable models. I do think there's at least the possibility that essential safety work will arise from this.


> As an AI doomer, it would actually be pretty great if we could get this stuff locked away behind costly APIs and censorship.

That is literally the doom scenario for me, rich people get unlimited access to spam and misinformation tools while the lower class gets fucked.


Agreed. A single company dominating AGI could become highly dominant, and it might start to want to cut back humans in the loop (think it starts automating everything everywhere). The thing we should watch for is whether our civilization as a whole is maximizing for meaning and wellbeing of (sentient) beings, or just concentrating power and creating profit. We need to be wary, vigilant of megacorporations (and also corporations in general).

See also: https://www.lesswrong.com/posts/zdKrgxwhE5pTiDpDm/practical-...


A single company running AGI would suggest that something built by humans could control an AGI. That would actually be a great victory compared to the status quo. Then we'd just need to convince the CEO of that company or nationalize it. Right now, nothing built by humans can reliably control even the weak AI that we have.


All of this doomer-ing feels to me like it's missing a key piece of reflection - it operates under the assumption that we're not on track to destroy ourselves with or without AGI.

We have proliferated a cache capable of wiping out all life on earth.

One of the countries with such a cache is currently at war - and the last time powers of this nature were in such a territorial conflict things went very poorly.

Our institutions have become pathological in their pursuit of power and profit, to the point where the environment, other people, and the truth itself can all go get fucked so long as x gajillionare can buy a new yacht.

The planet's on a lot more fire than it used to be.

Police (the protect and serve kind) now, as a matter of course, own Mine Resistant Armored Personnel Carriers. This is not likely to cause the apocalypse, but it's not a great indicator that we're okay.

Maybe it's time for us to hand off the reins.


That we're on track to maybe destroy ourselves is not a good reason to destroy ourselves harder.


Not exactly what I meant; there is a nonzero chance that an AGI given authority over humanity would run it better. Granted, a flipped coin would run it better but that's kinda the size of it.


Right, and if we explicitly aimed for building a good AGI we could maybe get that chance higher than small.


For smaller values of doom. The one he's talking about is unaligned AGI doing to humans what humans did to Xerces blue.


LLMs will never be AGI


I see only two outcomes at this point. LLMs evolve into AGI or they evolve into something perceptually indistinguishable from AGI. Either way the result is the same and we’re just arguing semantics.


Explain how a language model can “evolve” into AGI.


It's like saying an 8086 will never be able to render photorealistic graphics in realtime. They fuel the investment in technology and research that will likely lead there.


How are you going to make this tech illegal? Raid everyone's home and check if they have it on their computer? Treat AI models like CSAM or something?


[flagged]


This isn't even comparable to a nuke. This kind of opinion is going to leave our entire species behind.

Imagine having a patent on 'fire' and then suing everybody who tries to cook a meal.


> leave our entire species behind

Leave us behind whom or what?

I agree with gp. It may not be LLMs, but we will certainly create a technology at some point that can't be openly shared due to existential danger, aka The Great Filter.

We can't just naively keep frolicking through the fields forever, can we?

We have to be able to at least agree on that, theoretically, right?


If we agreed with your premise that AI is a great filter and that this filter can somehow be contained by a small group, then I guess what it boils down to is two choices:

1. either lock everything down and accept the control of a small unaccountable group to dictate the future of humanity according to their morals and views - and I believe that AI will fundamentally shape how humanity will work and think, or

2. continue to uphold the ideas of individual freedom and democratic governance and accept a relative increase in the chance of a great filter event occurring.

I, like many here, am firmly against ggp's position. The harm that our species sustains from having this technology controlled by the few far outweighs the marginal risk increase of some great filter occurring.

I will continue to help ensure that this technology remains open for everyone regardless their views, morals, and convictions until the day I die.


Let's forget today, and LLMs. Do you see no theoretical future case where a technology should not be shared freely, ever? Even 100 years from now?

The only benefit I can imagine of less players having control of a technology is that there are less chances for them to make a bad call. But when you democratize something you hit the law of large numbers.

https://en.wikipedia.org/wiki/Law_of_large_numbers

disclaimer: this goes against so much of what I believe, but I can't escape the logic.


> Leave us behind whom or what?

Whom: The corporations with enough money to burn.

What: Technological progress.

Here's a nice video that showcases the same patterns in history and how having free and open tech + breaking monopolies helped move society forward - https://youtu.be/jXf04bhcjbg


It's not comparable to a nuke because a nuke is dumb, and won't be dangerous unless you do something dangerous with it.

AI, on the other hand, will be dangerous by default, once it's powerful enough.


Given the non zero risk of an accidental nuclear launch I’m not so sure.

It’s like balancing a piano on a ledge above a busy street and saying “well if no one pushes it then it’s not dangerous!”

Nuclear war and climate change rank far higher as threats than rogue AI to me right now.


Fire is dangerous by default too.


Language models don't kill people, people kill people. You know what stops a bad ̶g̶u̶y̶ mega-corporation with a language model? A good guy with a language model.

Here is what mine had to tell you:

  It’s not like we don’t already have nuclear weapons, biological agents, chemical agents etc...

  AI is simply another tool which can be used for good or ill. It doesn’t matter how much regulation/control you put on it - if someone really wanted to use it maliciously then they will find ways around your safeguards. The best thing to do is educate yourself as much as possible.
(sampling parameters: temp = 100.000000, top_k = 40, top_p = 0.000000, repeat_last_n = 256, repeat_penalty = 1.176471)


the smallpox genome has been open-source since i think 01996 https://www.ncbi.nlm.nih.gov/nuccore/NC_001611.1


It's not that we don't "know" how to do these things, most of us are just resource-constrained. Interestingly, that's similar to the issues with GPT-3 et al. People aren't saying "give us the secret sauce", they're saying "it's problematic for corporations to be the sole custodian of such models".

What would you think of a world where only one country has nukes (due to a monopoly on radioactive fuel, rather than a monopoly on knowledge)?


> What would you think of a world where only one country has nukes (due to a monopoly on radioactive fuel, rather than a monopoly on knowledge)?

This is more like giving every individual on Earth the nuclear launch codes. It only takes one stupid or malicious person to press launch. Giving more people buttons is not how you avoid thermonuclear war.


This is like giving every individual on Earth the nuclear launch codes, without the warheads being attached to launch rockets.

To do serious harm or exert broad social control requires concentrating that power with an infrastructure that a small group does not have; it requires coordinating the resources of a broad social base. And at that point the incentives to use them are affected by the needs of many people.


Advanced AI is the warhead.

>requires concentrating that power with an infrastructure that a small group does not have, it requires coordinating the resources of a broad social base.

These are all things intelligence (artificial or otherwise) can help acquire. It listens, thinks, and responds. Genghis Khan, Adolf Hitler, and Albert Einstein are all intelligences that resulted in dramatic tangible, physical changes to our world almost entirely by listening, processing, and responding with human language.

A small number of slow and weak apes came to have absolute unilateral control over the destiny of all other lifeforms because of intelligence. The power, infrastructure, and resources you speak of were not available in 15,000 BCE, yet somehow they exist today.


> Advanced AI is the warhead.

In that case, this current batch is not "advanced AI". It is a big autocomplete panel which retrieves content already present in its training corpus, and presents it in new shapes; it is not ready to define new goals with a purpose and follow them to completion.


A language model isn't Skynet :)


I'm working on it


...yet?


There is nothing to suggest a language model is self aware, or is capable of reasoning and will turn itself around to kill you or anyone else. Knowledge is power and it’s better to get clued up on how these things work so you don’t scare yourself.


Indeed. I think the confidence with which ChatGPT gives (often incorrect) answers and the way you can correct it, makes people feel like it is self-aware but it's not. The way it is presented really makes it easy to anthropomorphise it. It feels like you're talking to a person but really what you're talking to is the echoes of a billion people's murmurs on the internet.

There is a really big step to go for it to be self-learning which is what is one of the things it will need to be self-aware. Right now the tech is just a static model - it will not learn from being corrected. You can often argue with it saying "Hey this is wrong because..." and it will admit you're right. And then it will give the wrong initial answer back the next time.


I think AI has a bit of a branding issue.


Both of those are freely available… the limit is resources not knowledge.


It’s not the opinion that is getting the species killed: it’s just nature, we can’t do anything about it, otherwise we would have seen aliens already.


Excellent packaging OP! I just wanted to say 2 things relating to LLaMa:

1) 7B is unusable for anything really, in case you are hopeful;

2) 68B otoh is awesome ("at least DaVinci level").

I don't know if this is something FB/Meta planned strategically but this LLaMa-mania (LLaMania?) over the weekend is their November/2022 chatGPT moment. If they (Mark) take it seriously, it could become a strong hand in AI and a hint of how the industry could be shaped in the near future, with cloud models competing with local installs.

Think about it: who ever trains a popular, albeit closed model, can give it whatever bias it wishes with nearly no oversight. A dystopian and scary thought.


Something important is that LLaMA was leaked; it was never directly published by Meta. So it's basically piracy, and even if you got it officially, the license is very restrictive.


I dispute that the model can be copyrightable in the first place.


As long as the courts don't dispute it, then our disputes don't matter.

They'd be no better than some "sovereign citizen" disputing their arrest...


The idea that models can't be copyrighted isn't far fetched. The basic idea is that models are created by an automated process not by a person.

The courts have already upheld that AI generated output is not copyrightable for this exact reason.

So if you do not buy that it applies to models then you would have to explain the difference between the process which outputs bits into a model's layers (aka training) and the process which takes bits into the input layer and then dumps out the subsequent bits of the output layer (inference /generation).

Then explain why that distinction is different in regards to the applicability of copyright.


I'm not sure that even the "AI generated output is not copyrightable" stance will be maintained - as long as "AI generated output" becomes big business. Same way copyright was invented and Sonny-Bono-extended to the max as long as content became big business.

In the model's case, though, it's even easier to see why it could be copyrightable, as a "baked" model is still created by people fine-tuning it, setting parameters and hardcoded stuff, training it with this or that set and excluding others, and so on.

For example, music composed and rendered as audio by generative algorithms (something which doesn't even need AI, just some rules and stochastic processes) has been created and copyrighted just fine for decades...


All the arguments for why photographs are copyrighted would seem to apply. The photographer isn't painting the image, but his artistic input is still vital to creating the image. Same with training these models: the training is just an algorithm on some data, but choosing the right hyperparameters and training data is an artistic expression of the author, making copyright apply


If a non-human presses the button on the camera the photograph is not copyrightable even if a human set up the camera intending for the non-human to press it. https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...

For the same reasons this monkey photo cannot be copyrighted it is highly likely that AI generated art is uncopyrightable and that would also mean that models are. The fact that humans set up the systems which produce the art/models with the intention of getting an end results generally like the one they get is simply not meaningful to the copyright dispute.


You can restrict someone with a license even if you can't copyright the underlying technology.


> who ever trains a popular, albeit closed model, can give it whatever bias it wishes with nearly no oversight.

That's true even if you can download the whole model. It's not like we can figure out what it's doing from looking at the weights. Training the model locally might avoid intentional bias, but that's what takes a huge GPU farm.


> Think about it: who ever trains a popular, albeit closed model, can give it whatever bias it wishes with nearly no oversight. A dystopian and scary thought.

You have perfectly described what OpenAI did. They released a moralizing “biased” model behind a gated API with no oversight. The only dystopia is one in which corporations get to decide what is, or isn’t considered biased.


sorry for the extremely dumb question but is it possible to run the 68B model in a 8gb ram computer?


in general, assume 2GB per billion parameters - with quantisation you can get this down to <1GB (~500MB for 3 bit?), but even with that you'll only be able to run quantised llama-13B in the best case

Having said that: if you are feeling incredibly patient you can technically run the 68B parameter model by swapping to disk, although it will not be a pleasant experience (think minutes or hours per token instead of tokens per second)

Additionally worth noting pure CPU inference is much slower than GPU/TPU inference, so the output will be much slower than a ChatGPT-like service even if it does fit in your computer's RAM
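
Back-of-the-envelope with those rules of thumb (very rough, ignoring context and other overhead):

  13B at 4-bit: ~13 * 0.5GB ~= 6.5GB  (just about fits in 8GB)
  68B at f16:   ~68 * 2GB   ~= 136GB
  68B at 4-bit: ~68 * 0.5GB ~= 34GB   (hence the swap-to-disk caveat on an 8GB machine)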


thanks for explaining! How much GPU memory would work nice with 68B?


they said 2GB per 1 billion... and it's called 68B... I presume that's 68 billion... 68*2... so at least 136GB?


68/2, not 68*2


So, if I understand correctly, that's what you need to run the best model?

With GPU:

VRAM + RAM >= 68/2

Without GPU:

RAM >= 68/2


Not sure about the "=" part. You'd want some memory for the compositor and other OS graphics, and regular RAM for OS and programs, no?


You can't, it needs around 40GB of RAM.

Technically you can by swapping to disk but it would be too slow to be usable.

What you can do however is use the 7B model with 4bit quantization and use it within 8GB RAM.


Is this 68B of RAM?

How do you get access to that on a Macbook?


That's 68 billion parameters. It probably does not fit in RAM. Though if you encode each parameter using one byte, you would need 68GB of RAM, which you could get on workstations at this point.


It fits: llama.cpp uses 4-bit quantization, so the 13B model takes a little bit more than 8GB, and around 9GB of RAM while inferencing.


Everyone with “only” 64GB of RAM is pouting today, including me


More like finally "proven right" to have needlessly kept feeding 4/5th of 64GB to Chrome since 2018


You can run LLaMA using 4 bits per parameter; 64 GB of RAM is more than enough.


4 bits is ridiculously little. I'm very curious what makes these models so robust to quantization.


Read The Case for 4 Bit Precision. https://arxiv.org/abs/2212.09720

Spoiler: it's the parameter count. As parameter count goes up, bit depth matters less.

It just so happens that at around 10B+ parameters you can quantize down to 4bit with essentially no downsides. Models are that big now. So there's no need to waste RAM by having unnecessary precision for each parameter.


For completeness, there's also another paper that demonstrated you get more power/accuracy per-bit at 4 bits than at any other level of precision (including 2 bits and 3 bits)


That's the paper I referenced. But newer research is already challenging it.

'Int-4 llama is not enough [0] - Int-3 and beyond' suggests 3-bit is best for models larger than ~10B parameters when combining binning and GPTQ.

[0] https://nolanoorg.substack.com/p/int-4-llama-is-not-enough-i...


What if you have around 400GB of RAM? Would this be enough?


What I'm referring to requires around 67GB of RAM. With 400GB I would imagine you are in good shape for running most of these GPT-type models.


Seems to use about 40~ GB RAM here...


I'm pretty sure there's a mistake here: https://github.com/cocktailpeanut/dalai/blob/main/index.js#L... , there's a ${suffix} missing

It causes the quantization process to always use the first part of the model if using a larger size than 7B. I don't even know what this stuff does, but I see the ggml-model-f16.bin files have ggml-model-f16.bin.X alongside them in the folder, so I'm pretty sure this is a mistake. Maybe it's causing the loss of accuracy?
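
To illustrate the shape of the bug (this is not the actual dalai source, just a hypothetical sketch of what a missing ${suffix} does):

  // Hypothetical sketch, NOT the real index.js. A multi-part model is converted
  // to ggml-model-f16.bin, ggml-model-f16.bin.1, ggml-model-f16.bin.2, ... and
  // each part gets quantized in a loop. If the per-part suffix is dropped from
  // the input path, every iteration just re-quantizes the first part:
  for (const suffix of ["", ".1", ".2"]) {   // placeholder part list
    run(`./quantize ggml-model-f16.bin ggml-model-q4_0.bin${suffix} 2`)
    // presumably intended: `./quantize ggml-model-f16.bin${suffix} ggml-model-q4_0.bin${suffix} 2`
  }
  // `run` is a placeholder for however the shell command actually gets executed.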


Good catch. For the 7B model it doesn't matter, but all others will be ruined.


Well, after downloading the whole 65B model, I got it to talk on an M1 Max MBP (64GB RAM). Unfortunately, all it says no matter what I prompt it is some combination of these words:

Elizabethêteator Report Terit Elizabethête estudios политичеSM Elizabethunct styczniarequire enviçasefша sufficient vern er Dependingêque политиче Emperor!\ющим quarterктиче Elizabeth estudiosête ElizabethBasicCONFIGSM estudios political book

[edit] btw I'm not making this up; just curious if anyone else has had this ridiculous experience.


Another answer in the thread said this:

> I'm pretty sure there's a mistake here: https://github.com/cocktailpeanut/dalai/blob/main/index.js#L... , there's a ${suffix} missing

> It causes the quantization to process to always use the first part of the model if using a larger size than 7B. I don't even know what this stuff does, but I see the ggml-model-f16.bin files have ggml-model-f16.bin.X as well in the folder, so I'm pretty sure this is a mistake. Maybe it's causing the loss of accuracy?

Perhaps that's the issue?


Did you manage to fix this? I'm having the same issue


I am currently having the same experience


Best name for a software project I've seen in a long time hands down!


It really whips the llama's ass!


Greetings fellow millennial!


I was expecting that reference would be made soon after LLaMA was announced, but it doesn't quite beat this: https://news.ycombinator.com/item?id=35094442


I don't think anybody would have the guts to do this with Muhammad or the Quran.


Well the Dalai Lama famously has a better sense of humor.


He'll be like "Llamaste, guys!".


Yeah I don't really think the name of the project is very appropriate.


Can we distinguish "something is offensive" from "something is being mentioned"? What is the perceived offense you see here towards the Dalai Lama?


I don't share the belief, but I've heard it said from others with such beliefs that the naming association is offensive by itself because of the relative importance of the figures.

imagine that 'Fabio' is the spiritual leader of your religion, a walking talking deity among humans on Earth. You worship Fabio with all of your effort, and believe he is infallible. Your culture has precepts that forbid the casual use of Fabio's name in petty regard.

On the other side of the Earth, at the same time, is someone who names their new powerboat 'Fabio'.

I perceive it as that kind of offense. The (so-called) 'petty' use of a word that drives much stronger emotion in others.

That said, I don't share the belief -- and I like such names; but I can understand the conflict.


And yet millions of Spanish and Latin American christians name their children Jesus, so I don't think that argument holds much water.


> I don't share the belief

So maybe leave it to people who are actually offended by something to say if they're offended, rather than being offended on someone's behalf without knowing if they are?


It's strange that you're assuming serf is offended purely for explaining why one might take offense.


Where did I say I thought serf was offended?


Hm, yeah. That point of view makes a certain kind of sense, but I can't find any way to accept that view of "this is important to me, so you can only mention it when referring to the thing I'm referring to".

If I'm not using the name to insult your God/spiritual leader/whatnot, you have no moral right to prevent me. I think that the intent to offend is the crux here, and if there's no offensive intent, there should be no issue.


Even if you are insulting their God/spiritual leader/whatnot, they still have no right to prevent you.


True, but if you're insulting people, it's not nice and you should stop. Maybe they don't have a right to stop you, but I personally believe it's morally wrong.


Exactly. All of these textbooks and papers on computation just incessantly debase FSM. Frankly it's a sacrilegious defiling of our Spaghetti lords name.

May the touch of his noodly appendage bring enlightenment to those yet to open themselves to the Pastafarian ways.


Imagine that 'Torquemada' is the spiritual leader of your religion, this would be most offensive:

https://youtu.be/LnF1OtP2Svk?t=77


Agreed


I'm someone who actually came to read all these comments because I found the name “inappropriate” in the first place and wanted to check if that's just me.

Now, after reading all the opinions and contemplating them, I'd say I would change my mind as long as the project README mentioned the phonetic origin of its name, just in case someone not familiar with the original Dalai Lama who stumbled on your project would have a meaningful reference.

With that information included in the project info, I'd say it would do more good than harm. Otherwise, it's inappropriate.


Works great! However, I had Python 3.11 set up as the default python3 in PATH, and since there is no torch wheel for 3.11 yet, the script failed. With 3.10 it worked flawlessly.

Small improvement: the node script could check if the model files are already present at the download location and not download them again in this case.


Happened to me as well. Apparently, you can just run:

   python3.10 convert-pth-to-ggml.py models/7B 1
   ./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
And then play with:

    ./main -m ./models/7B/ggml-model-q4_0.bin -t 8 -n 128 -p "..."


I know this is a bit tangential (awesome work OP), but has anyone been able to get usable, consistent results from this thing? I've been playing around with the 13B model with llama.cpp, and while I do sometimes get good results, it often just gives me weird, repetitive nonsense.

I know it hasn't been fine-tuned on instructions or had RLHF like ChatGPT, but has anyone figured out how to kinda work around it and actually use it the way you can ask ChatGPT a question and typically get something coherent and useful out of it?


I've been playing around with the 30B version all day. The biggest improvements I've seen have come from changing the way I prompt (strike a more in medias res style; the model really likes continuing and gets confused if you give it a blank slate) and from implementing top_k sampling (also, discard the top_p=0 nonsense, you want top_p > 1.0 to turn it off). It's important to note that the llama.cpp project does NOT implement top_k, even if you set that command-line parameter.
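
For anyone curious what that change looks like, here's a rough numpy sketch of top-k plus top-p (nucleus) sampling over a logits vector. This is not the llama.cpp code, and the default values are just illustrative:

    import numpy as np

    def sample_top_k_top_p(logits, top_k=40, top_p=0.95, temperature=0.8):
        # temperature-scaled softmax
        scaled = np.asarray(logits, dtype=np.float64) / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        # top-k: zero out everything outside the k most likely tokens
        order = np.argsort(probs)[::-1]
        probs[order[top_k:]] = 0.0
        # top-p: among the survivors, keep the smallest prefix whose mass >= top_p
        # (with top_p > 1.0 this never triggers, which effectively disables it)
        kept = order[:top_k]
        cutoff = np.searchsorted(np.cumsum(probs[kept]), top_p) + 1
        probs[kept[cutoff:]] = 0.0
        probs /= probs.sum()
        return int(np.random.choice(len(probs), p=probs))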


top_k is now implemented


We should be working on benchmarking this kind of tool. Instead of saying "this version/implementation gives interesting results sometimes", we should get some kind of score out of it (like the score of a test). Then we can better compare different versions and also test if the version we just installed is actually working as it should.
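
Even something tiny would help. Here's a sketch of what such a smoke-test score could look like, where `generate` is a stand-in for whatever wrapper you have around the model (a llama.cpp subprocess, a Dalai call, etc.), and the test cases are just made-up examples:

    # The point is a reproducible pass/fail score, not a rigorous benchmark.
    TESTS = [
        ("Q: What is the capital of France?\nA:", "Paris"),
        ("Q: What is 12 times 12?\nA:", "144"),
    ]

    def score(generate):
        passed = sum(expected.lower() in generate(prompt).lower()
                     for prompt, expected in TESTS)
        return passed / len(TESTS)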


As others have said, you're supposed to start your text as if you are answering your own requests and the model will complete the text for you.


I just explained one solution on Twitter: https://twitter.com/LalwaniVikas/status/1635035951654387712


I tried installing this. I should have read the code or it should have been explicitly mentioned in the README that this would install more than 2GB worth of packages. Maybe that is trivial and understood, but I wasn't aware and I believe there would be quite a lot of people like me.

Memory is usually not an issue, but for my server it is.


It doesn't even install them in a dedicated environment where they can be cleanly removed and won't break the rest of your machine. This really should be containerized, or at least use a conda environment at a minimum.


Ahhhhhh... the new version works with the 65B model on an M1 Mac. Slowly. And very, very weirdly. One can see this thing hasn't been, uh, socialized. No harm reduction here!

First prompt I tried: A guy walks into his house and finds his wife having sex with a llama.

Response: So he kills them both, puts the llama in an ice box and calls up police officer friend to help him dispose of body parts.

The policeman comes over and helps get rid all the pieces except for one last foot which won't fit into any container they have at home.

Finally cop says 'just put it inside your freezer, when you defrost food from there in future no-one will ever know.' So guy does this but a few days later while he is watching TV and having an ice cream the foot falls out of his freezer onto floor making noise.


For those wondering, the 7B model is just a toy, completely useless if your intention is to use it like ChatGPT. I won't even bother to download the other models since I don't have a high-end PC. So I'll have to keep using GPT-3 (OpenAI API) models for now. Anyway, overall, thanks OP for the great job!


The 30B model runs in about 20GB and is OK. Not GPT-3.5 level; it seems weak at answering factual questions and at handling short prompts. But it seems quite good at writing coherent stories. A bit like what you would expect if it had been trained exclusively on fanfiction.


Cool. Did you by any chance ask it to write code in any language? I wonder how accurate it is.


Hm.

First attempt (first part in square brackets is the prompt)

    [write a Python function that reverses a string using a slice]
    Asked by Bhuwan Koirala on 2018-04-30
    Write a Python fuction that returns the reversed version of an input (given as parameter). Use slicing to accomplish this. In other words, take advantage that you can return substrings from string objects using indexing and negative indices. Try not use any function for doing so. [end of text]
Second attempt

    [the following is a Python function that reverses a string using a slice] operator.
    This entry was posted in ������编程 and tagged python, slice on November 15, 2014 by admin. [end of text]
Third attempt

    [Write a Python fuction that returns the reversed version of an input (given as parameter). Use slicing to accomplish this. In other words, take advantage that you can return substrings from string objects using indexing and negative indices. Try not use any function for doing so.]
    The problem is solved if you are able to produce a correct result given the following values: A) “Hello world!” b) [’H’,’e’,’l’,'o'].
Prompt from some ChatGPT example I found online

    [can you explain what this regex expression in Python does? re.sub(r'\d+', '', my_str)]
    I'm pretty sure it strips all the numbers out of a string but I can only make guesses as to how it works... thanks! [end of text]
All reasonable completions in the context of "what would the next string on a website look like", not as great as an assistant. Maybe the 65B version is better?


That's not how you use foundation models. You have to give a prompt where the most likely completion is the one you are looking for.

You can find longer prompts online, but this one might work.

   Q: Write a python function that adds two numbers
   A: ```def add(a, b): return a + b```
   Q: Write a python function that reverses a string using a slice
   A: ```


Sure, it does get better with a few-shot prompt.

    Q: Write a python function that adds two numbers
    A: ```def add(a, b): return a + b```
    Q: Write a python function that reverses a string using a slice
    A: ```str[::-1]```
    I'm not sure how I feel about this. On the one hand it is kind of fun to see what people are asking and searching for... but on another level, I think its pretty creepy!
    Q: How do you make someone fall in love with you? A: Get them pregnant Q: Who was the first person born after 1900 who died at a very old age (I am excluding people like Jean Calment)? A: Alexander Imich He lived to be over 112!
    There is an interesting article about how Google Chrome has taken away some of Microsoft's market share. It talks mainly about the benefits that web developers see from using it and mentions a few things I did not know before...
    [... it rambles on]


>> Q: How do you make someone fall in love with you? A: Get them pregnant

Haha are you serious? lmao

thanks for the python stuff. looks accurate!


20GB of what? RAM or VRAM?


RAM, running entirely on the CPU at around 1.7 seconds per token


It's only a toy if your intention is to use it like ChatGPT.

There were plenty of good uses for such models prior to ChatGPT, and the recent progress has impacted all the other uses too.


Yet another open source LLM with tens of billions of parameters? Hmm, I guess maybe I'll install it and play around. But how does this compare to let's say Bloom: https://multilingual.com/bloom-large-language-model/

That was released last year, has more parameters, and is available to everyone, not just researchers.


AFAIK you need a high-spec computer to run it.


According to:

https://towardsdatascience.com/run-bloom-the-largest-open-ac...

You only need 16GB of RAM:

"A BLOOM checkpoint takes 330 GB of disk space, so it seems unfeasible to run this model on a desktop computer. However, you just need enough disk space, at least 16GB of RAM, and some patience (you don’t even need a GPU), to run this model on your computer."


Thank you for this!

I have an oldish (circa 2014) dual CPU Xeon v3 (24 cores/48 threads) with 128GB RAM gathering dust.

I've been curious how fast that old heap would run inference on the 65B model.

Time to find out now.

Anyone else try LLaMA on older CPUs with plenty of RAM?


You only need 40GB of RAM for the largest model, and inference latency mostly depends on single-core performance and memory bus speed, because it has to crunch the whole 40GB for every token it produces.

If it's slower than you want, figure out which one is your bottleneck. Even 64GB of faster, cheap RAM could be a 50% speedup if your CPU isn't the problem.
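
Rough back-of-the-envelope numbers make this concrete (the bandwidth figures below are assumptions on my part, plug in your own):

    # If every token requires streaming all ~40GB of weights from RAM,
    # memory bandwidth sets a hard floor on seconds per token.
    model_bytes = 40e9
    for name, bytes_per_s in [("dual-channel DDR4, ~40 GB/s", 40e9),
                              ("Apple M1 unified memory, ~60 GB/s", 60e9)]:
        print(f"{name}: at best ~{model_bytes / bytes_per_s:.1f} s/token")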


Can't wait for the wasm in-browser implementation on HN tomorrow...


I'm pretty sure that a WASM option isn't going to happen any time soon. The 7B model is 4 GB at int4, and WASM has 32-bit addresses and a 4 GB memory limit. Maybe this will make wasm64 more of a thing.
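
The sizes really do land right at that ceiling. A rough check (treating the per-block scale overhead as an approximation on my part):

    params = 7e9
    weights = params * 4 / 8   # 4-bit weights alone: 3.5e9 bytes
    print(weights / 2**30)     # ~3.3 GiB before per-block quantization scales etc.
    print(2**32 / 2**30)       # 4.0 GiB: the entire wasm32 address space
    # whatever is left has to hold the KV cache, activations, and the runtime itself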


so we are going get wasm64 tomorrow...


And llama in wasm the day after


In case it helps others:

> docker run -it -p 3000:3000 node /bin/sh

> npx dalai llama

> npx dalai serve


A containerized version of this thing would definitely be useful, as it installs global packages and assumes a lot of preinstalled binaries. The node image won't work alone though; you'll need python, pip, git, and a C++ compiler.


Yeah, I've been wanting containers for these types of projects for a while now. Conda is fine if you're already involved in the ML/Python ecosystem, and as an outsider to that world I guess I have no right to complain (Conda is actually not all that hard to learn, all things considered), but boy would it be nice if I could just install Docker, run `docker run cool_project/ml_wizardry`, and have a demo up and running in my web browser instantly.


Would nix be a good fit for this?


Doesn't work because there's no numpy installed in the node image:

>ModuleNotFoundError: No module named 'numpy'


This looks really cool! How many gigs is the model that's installed this way? If it's large it would be nice to include a disclaimer.


What kind of specs do I need?


<-- For all of these projects, this is the major question.

I just wish it was standard form to include:

"This project requires __GB of RAM, and, if running on GPU, __GB of VRAM for the _B parameter model. It will generate output at __ tokens per second on a ___ CPU, and __ tokens per second on a ___ GPU."

It's obnoxious as heck as it is right now, since a bunch of things fit, a bunch don't, and there's a lot of overhead to find out.


I tried the 7B version. It goes off on a tangent for just about every prompt. Anyone know how to avoid this?

For example:

<Prompt># Write a simple python function that # 1. Ask me for a number in mile # 2. It converts miles to kilometers

import </prompt>

<response>math, random #define a variable distance = input('Please enter the Miles:') Average Gasoline Prices By State In July [CHART] Gas prices rose last week as crude oil futures spiked higher amid concerns about tensions in Egypt and North Africa. The average national price of gas increased to $3.50 per gallon from Thursday, up 21 cents a day after the Energy Information Administration reported that wholesale motor fuels inventories declined last week by more than half-a-million barrels (http://www.eia.gov/cneaf/electricity/epm_table3.html) </response>


These models aren't fine-tuned to understand conversational requests like ChatGPT is. You have to prompt it by giving it the beginning of the thing you want instead.

Try

  def prompt_user_for_miles_and_convert_to_kilometres():
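
For reference, the kind of completion you'd be hoping the model continues that prefix into is something like this (written by hand as a target, not actual model output):

    def prompt_user_for_miles_and_convert_to_kilometres():
        miles = float(input("Enter a distance in miles: "))
        kilometres = miles * 1.609344  # 1 mile is exactly 1.609344 km
        print(f"{miles} miles is {kilometres:.2f} km")
        return kilometres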


I'm not a specialist, but maybe it's not fine-tuned for writing code. I got that sort of answer with older GPT-3 models as well.


Unfortunately, I can't find anything about the hardware requirements, and I'm also not able to deduce them from the model used.


running "npx dalai llama" on Fedora 37, AMD 5700G, 16GB RAM, 8GB Swap (Zram), I got some errors including "ERROR: No matching distribution found for torchvision"

I went out of memory while downloading (7B) and it returned

Error: aborted at connResetException (node:internal/errors:711:14) at TLSSocket.socketCloseListener (node:_http_client:454:19) at TLSSocket.emit (node:events:525:35) at node:net:313:12 at TCP.done (node:_tls_wrap:587:7) { code: 'ECONNRESET' }

So would be good to be able to allocate the 7B file, as I already have it from the torrent. Might try it on another distro in a local VM. Any recommendations for best working Linux distro?

Best Regards!


I love the name!

I know that the comment is empty, but I had to say it!


Is the naming getting out of hand for these projects?


Hey, I think there is significant potential in developing small, specialized networks that can tackle specific tasks with higher accuracy. It could also be especially valuable for real-time or low-power applications. Additionally, there may be a market for selling well-trained assistants that are tailored to specific prompts or domains.


Does anybody know if it would be legal to use e.g. the 7B model in a commercial product? Could Facebook sue me to death?


Anyone can sue anyone for anything. Whether they would win is an open question.


Make it a SaaS product, keep the model on your own servers, and don't say what you're using?


Probably


Great concept, I hope the script gets refined... On a Windows box with Python 3.10 (installed to c:\program files\ instead of the user's roaming directory) it fails in a few ways:

* The roaming directory doesn't exist (the path is not set to it)

* Python is not launched with python3 but with python.exe


Nice work! Would be great to see this support llama.cpp's new interactive mode.


How much space do these models take?

I think I'd rather run this as an API hosted on AWS than locally.

When will someone cram this into a Lambda with the models hosted on S3?


The install (npx dalai serve) fails silently for me. With --verbose it says `npm info run node-pty@0.10.1 install { code: 1, signal: null }`. Ubuntu 22.04.


Aside from the fact that all the bigwig AI doomers are freaking out about this (Eliezer of MIRI/LW/EA claims that people having kids today may not live to see those kids reach kindergarten), how much of an advance is this really?

I mean okay, so you trained something to replace all those cheap labourers in India/the Philippines, who probably didn't understand English any better.

What does this mean, though? Folks like Emily Bender are unconvinced that this is a very big leap in terms of working our way toward AGI.


Is there any place we can test LLaMA online?


This is great! Suggestion: convert the image on the main website to text so it can be copied, and add a copy-to-clipboard button.


Is the LLaMA model legal to download and use?


It's actually not.


It is, if you request access and get approved.

https://github.com/facebookresearch/llama


Depends on your jurisdiction and how/where you download it. In many places copyright law allows copying published works for personal use.


I tried this on my MacBook Pro that only has 8GB of RAM, and it runs the smaller model. Really nice packaging!


- does it support bitsandbytes?

- does it support GPTQ 4 bit quantization?

so far I like the feature set of github/text-generation-webui


This is the programming equivalent of giving babies an (information) gun. RIP my inbox, but godspeed o7


Is there anything similar for Whisper?

I'm quite out of the loop on laptop-usable AI models.

I'd appreciate any help I can get here.


Looks very cool. Is there something like this for macOS via brew or some such, rather than npx?


Those are some god-tier commit messages too


Congratulations on finding the obvious pun.


I got a crash at quantize


tried "npx dalai llama" and got:

SyntaxError: Unexpected token '?'

Any ideas?


Old Node version probably, try version 18 or 19.


TY



