
Tin foil hat: has anyone run it with Wireshark to confirm it doesn't make external requests (unless it has to, like a browser agent)?



It’s just numbers. You can use other open source software to run it.


It's not even "can". You have to use your own software to run it. DeepSeek hasn't published anything other than the model weights.


The model is. How it is packaged is a different matter entirely. There is a good reason we saw a shift towards the safetensors format.

https://arjancodes.com/blog/python-pickle-module-security-ri...
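To make the pickle risk concrete, here's a minimal sketch (nothing to do with DeepSeek specifically, just the standard library) of how loading pickle-based data can execute arbitrary code:

    import pickle

    # A minimal malicious object: unpickling it calls os.system instead
    # of just reconstructing data.
    class Payload:
        def __reduce__(self):
            import os
            return (os.system, ("echo arbitrary code ran during load",))

    blob = pickle.dumps(Payload())

    # Anything that ends up calling pickle.loads -- which torch.load on
    # an old-style .pt/.bin checkpoint does -- hands the file's author
    # code execution on your machine.
    pickle.loads(blob)

safetensors avoids this entirely: loading it never deserializes Python objects, only raw tensor bytes described by a JSON header.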


It's been confirmed to run on a machine with no internet access. So it isn't reliant on external requests, though it could still be trying to make them.


I've done this (not thoroughly by any means) with OpenSnitch on the Ubuntu machine where I have Ollama running the 32B R1 weights. No network traffic.
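If anyone wants a quick-and-dirty check without setting up OpenSnitch, something like this works as a rough sketch (assuming the psutil package is installed and the server process is named "ollama"); run it while a prompt is generating:

    import psutil

    # Print non-loopback connections for any process named like "ollama".
    # Loopback sockets (your own client talking to the local API) are
    # expected; a remote address would be the thing to investigate.
    # This only sees sockets open at snapshot time, so run it during a
    # long generation or in a loop.
    for proc in psutil.process_iter(["name"]):
        if "ollama" not in (proc.info["name"] or "").lower():
            continue
        try:
            for conn in proc.connections(kind="inet"):
                if conn.raddr and not conn.raddr.ip.startswith(("127.", "::1")):
                    print(proc.pid, conn.laddr, "->", conn.raddr, conn.status)
        except psutil.AccessDenied:
            print("need more privileges to inspect pid", proc.pid)

Wireshark or OpenSnitch are still better for catching short-lived requests; this just confirms a snapshot.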

I'm not entirely sure whether that kind of code execution is possible in just the weights themselves; maybe someone who knows more about this can weigh in.
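For what it's worth, my understanding is that plain tensor data can't execute anything by itself; the risk lives in the container format. A minimal sketch of the safe path, assuming the safetensors and torch packages are installed (file name is just illustrative):

    import torch
    from safetensors.torch import load_file, save_file

    # safetensors is raw tensor bytes plus a JSON header (names, shapes,
    # dtypes); loading it never unpickles Python objects, so there is no
    # code-execution path the way there is with pickle-based .pt/.bin files.
    save_file({"layer.weight": torch.randn(4, 4)}, "demo.safetensors")
    weights = load_file("demo.safetensors")   # dict of name -> torch.Tensor
    print(weights["layer.weight"].shape)      # torch.Size([4, 4])

    # If you do have to open an untrusted pickle checkpoint, recent
    # PyTorch supports torch.load(path, weights_only=True) to restrict
    # what can be unpickled.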


Isn't DeepSeek simple/small enough you can run it locally?


No, the full R1 model is ~650GB. There are quantized versions that bring it down to ~150GB.

What you can run locally are the distilled models, which are actually Llama and Qwen weights fine-tuned on R1's output.


You'd need at least a TB of VRAM to load it in fp16. They distilled it into smaller models, which don't perform as well but can run on a single GPU. Full R1 is big, though.
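Rough numbers, as a back-of-envelope sketch (assuming ~671B total parameters and ignoring activations/KV cache):

    # Approximate weight memory for a ~671B-parameter model at different
    # precisions; actual sizes vary with the quantization scheme.
    params = 671e9
    for name, bytes_per_param in [("fp16", 2), ("fp8", 1), ("int4", 0.5)]:
        print(f"{name}: ~{params * bytes_per_param / 1e9:,.0f} GB")
    # fp16: ~1,342 GB    fp8: ~671 GB    int4: ~336 GB

That lines up with the ~650GB figure for the released fp8 weights and "at least a TB" for fp16; getting down to ~150GB works out to well under 2 bits per weight on average.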


fp16? I thought it was trained at fp8.

Yeah, I'd want to double-check that they're using safe serialization methods, at the very least, before using the weights from any model released by a Chinese entity.



