Ask HN: 16 yo Nephew, in E. Africa, wants to train an LLM with on disk Wikipedia
14 points by a_w 15 days ago | 17 comments
Hello HN!

My 16 year old nephew lives in an East African nation where there is practically no internet access.

Last week he asked me for advice on how to go about training an open-source LLM using an on-disk Wikipedia (~80 GB).

Any suggestions? Thanks!




In addition to the other great suggestions, point him to Karpathy's YouTube channel[1]. Karpathy has an approachable communication style.

Here's his "1 hour intro to LLMs" video: https://www.youtube.com/watch?v=zjkBMFhNj_g

1. https://www.youtube.com/c/AndrejKarpathy


Thanks!

I will try to download it and send it to him.


Not an expert, but maybe using RAG/embeddings over the on-disk Wikipedia would be better than fine-tuning on it?

Most decent LLMs probably were already trained on wikipedia, that doesn't stop them from hallucinating when asked questions about it.
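To make the RAG idea concrete: a minimal, dependency-free sketch of retrieval over local text, assuming the Wikipedia dump has been split into plain-text passages. It uses crude bag-of-words cosine similarity in place of a neural embedding model (a real setup would swap in something like sentence-transformers); the passages and query here are made up for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Crude 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Stand-in for passages extracted from the on-disk Wikipedia dump.
passages = [
    "Nairobi is the capital and largest city of Kenya.",
    "The Nile is a major river flowing through northeastern Africa.",
    "Python is a high-level programming language.",
]
index = [(p, embed(p)) for p in passages]

def retrieve(query, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return [p for p, v in sorted(index, key=lambda pv: -cosine(q, pv[1]))[:k]]

print(retrieve("What is the capital of Kenya?"))
```

The retrieved passage would then be pasted into the LLM's prompt as context, which is what keeps the model grounded in the local Wikipedia instead of relying on (possibly hallucinated) memorized facts.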


Thanks for the suggestion! I will look into this.

^ This is the way

Use a model already trained on Wikipedia, via llamafile.

You can download llamafile and several models, put them on a USB drive or hard drive, then send the drive to him via DHL.


That is a great suggestion, thank you!

I think he wants to tinker, and learn more about how they work. What I neglected to mention is that he's already learned to program (developing Android apps, and he's also learned Python). He is a very bright and curious kid.


Have him check out:

LLM training in simple, raw C/CUDA

----------------------------------

https://github.com/karpathy/llm.c

It is only ~1,000 lines of easy-to-read C code. There is also Python reference code.
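For a feel of what "training" means before diving into llm.c (which trains GPT-2 in C/CUDA): here is a toy, dependency-free sketch of the same train-then-sample loop, boiled down to a character-level bigram model. The corpus string is just a placeholder; the real exercise would count over the Wikipedia text.

```python
import random
from collections import defaultdict

text = "hello world, hello wikipedia"

# "Training": count how often each character follows each other character.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

def generate(start, n=10, seed=0):
    """Sample n characters, each drawn from the counts for the previous one."""
    rng = random.Random(seed)
    out = start
    for _ in range(n):
        nxt = counts.get(out[-1])
        if not nxt:
            break  # no continuation ever observed for this character
        chars, weights = zip(*nxt.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

print(generate("he"))
```

A real LLM replaces the count table with a neural network and the single-character context with thousands of tokens, but the loop shape (fit a next-token distribution, then sample from it) is the same one llm.c implements.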


Btw, I support some Kenyan high school students and am looking at supplying a few schools with llamafile+models on flash drives for their computer science curricula.

That's interesting. Could you expand on this a bit more? Which models, and I am curious about how the CS teachers/students will be using this?

I'm reviewing models at the moment. Model selection will depend greatly on the hardware capabilities at each school. Phi-3 could be a good starting point.

The project is an idea at the moment. My contact in Kenya has direct access to the Principals of the schools that our supported students attend.

My thought is that the teachers would not have to do much. Many of the students already know Python and could do self-learning individually or in groups.

A flash drive with llamafile+models and documentation might be all that it would take to get them started - even offline.

Bonus: with llamafile, the same binary works on macOS, Linux, and Windows.


Thanks for the detailed response.

I wasn't aware of Phi-3 - I will look into it.


Would it be possible to ship him a Starlink terminal? Internet access could do wonders for a young interested guy like that... And he could share that connectivity with people around him too.

Unfortunately, it doesn't look like Starlink is an option.

https://news.ycombinator.com/item?id=40246021


I have been thinking about that, but I haven't gotten around to researching its availability in the country yet.

I will do some research over the weekend. Thanks for mentioning it!


What kind of GPUs does he have?


I believe he has a laptop with an Intel i5 with integrated graphics.



