YaFSDP: a sharded data parallelism framework, faster for pre-training LLMs (github.com/yandex)
135 points by wiradikusuma 6 months ago | 16 comments



I was surprised to see that the "Ya" part means "Yet another". I've seen that in many acronyms before, but it's pretty tongue-in-cheek of them to do it here, since one would expect the "Ya" to be there simply because it was made by Yandex.


I was expecting it to be a Russian acronym starting with the letter Я which is pronounced Ya. It acquired its backward R glyph when it was changed from an old Slavic letter I cannot draw.


What do you mean that you "cannot draw" it? This is a digital medium, and both variants (well, one; the other is only half supported) are valid Unicode characters:

ꙗ / Ѧ

Or do you mean you literally can't draw them?


I could not reproduce it by hand without a reference, I don't have a keyboard layout installed that offers it as a symbol, and I wasn't going to look it up just to add spice to a shower-thought-tier HN comment.

I see you have provided it, making it more accessible for my future use, at least for as long as this thread stays in my recent HN activity.


It's an idiomatic expression in Slavic languages, indicating that the shape is particularly complex.


I'm pretty sure that it's an idiomatic phrase in any language for that.


Doesn't Yandex itself come from "Yet Another Indexer"?


Ah, so it does as well! I only knew that it was a portmanteau of “Я” and “index”. As in “I index”. Which it also is.


There's a third explanation: "Яndex" stands for "языковой индекс", i.e. "language-aware index". The Russian language has a complicated morphology, with three genders and six grammatical cases, somewhat similar to Latin. Searching by exact word match almost never gives good results, and neither Yahoo nor AltaVista could offer anything better in 1997, hence Yandex was built.
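To make the exact-match problem concrete, here is a toy sketch of my own (not how Yandex actually works). The `LEMMAS` table is a hypothetical, hand-written stand-in for a real morphological analyzer, covering only the singular case forms of "книга" (book), and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical hand-written lemma table: singular case forms of "книга" (book).
# A real system would use a morphological analyzer instead of a lookup table.
LEMMAS = {
    "книга": "книга",   # nominative
    "книги": "книга",   # genitive
    "книге": "книга",   # dative and prepositional
    "книгу": "книга",   # accusative
    "книгой": "книга",  # instrumental
}


def normalize(word):
    """Map a word form to its dictionary form, falling back to the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)


def build_index(docs, key=str.lower):
    """Build an inverted index from key(word) to the set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[key(word)].add(doc_id)
    return index


def search(index, query, key=str.lower):
    """Return the sorted ids of documents whose index entry matches key(query)."""
    return sorted(index.get(key(query), set()))


DOCS = {
    1: "я читаю книгу",          # "I am reading a book" (accusative)
    2: "в книге много страниц",  # "there are many pages in the book" (prepositional)
    3: "новая книга",            # "a new book" (nominative)
}

exact_index = build_index(DOCS)                 # exact word match
lemma_index = build_index(DOCS, key=normalize)  # morphology-aware match

print(search(exact_index, "книга"))                 # only the nominative form matches
print(search(lemma_index, "книга", key=normalize))  # every case form matches
```

With the exact index, a query for "книга" misses the two documents that use other case forms; normalizing both the documents and the query to a lemma finds all three.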


You mean, Yet Another Human-Organized Ontology?


Any idea what the main tricks are for achieving gains over FSDP?


The blog post seems to contain more details and the core ideas: https://medium.com/yandex/yafsdp-a-tool-for-faster-llm-train...


Odd that they don’t expand on this:

In Yandex’s pre-trainings, the implementation of YaFSDP along with other memory optimization strategies resulted in a speed gain of 45%.


[flagged]


Bulgarians are just as smart - it's an EU country and the language is easier to learn than Russian. Здравейте! Many other benefits to hiring Bulgarians also.


[flagged]


This is true. Some of them could also be good people, just like absolutely anywhere else in the world. Generalization is the great cow’s middle state.


If the code is good and open source why should I care about their personal life? Ship it.



