
For a century, transformer meant a very different thing. Power systems people are justifiably amused.



And it means something else in Hollywood. But we are discussing language models here, aren’t we?


And it fits the definition, doesn't it, since it tokenizes inputs to compute them against pre-trained ones, rather than being based on rules/lookups or arbitrary logic/algorithms?

Even in CSS a matrix "transform" is the same concept - the word "transform" is not unique to language models; it's more a reference to how one set of data becomes another by way of computation.

Same with tile engines / game dev. Say I wanted to rotate a map, this could be a simple 2D tic-tac-toe board or a 3D MMO tile map, anything in between:

Input

[
  [0, 0, 1],
  [0, 0, 0],
  [0, 0, 0]
]

Output

[
  [0, 0, 0],
  [0, 0, 0],
  [0, 0, 1]
]

The method that takes the input and gives that output is called a "transformer" because it is not looking up some rule that says where to put the new values, it's performing math on the data structure whose result determines the new values.
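
A minimal sketch of that idea in Python, assuming the rotation meant here is 90 degrees clockwise (which is consistent with the input and output above). The name rotate_cw is hypothetical and not taken from any engine; the point is only that the new position of each cell comes from index arithmetic rather than a lookup table.

```python
def rotate_cw(grid):
    """Rotate a square grid 90 degrees clockwise using index math only."""
    n = len(grid)
    rotated = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            # The cell at (row r, col c) lands at (row c, col n - 1 - r).
            rotated[c][n - 1 - r] = grid[r][c]
    return rotated


board = [
    [0, 0, 1],
    [0, 0, 0],
    [0, 0, 0],
]
rotated_board = rotate_cw(board)
```

The same function works for any square tile map, since nothing in it depends on the board being 3x3.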

It's not unique to language models. If anything, vector word embeddings came to this concept much later than math and game dev.

An example of the word "Transformer" used outside language models, in JavaScript, is Three.js' https://threejs.org/docs/#examples/en/controls/TransformCont...

I used Three.js to build https://www.playshadowvane.com/ - I built the engine from scratch and recall working with vectors (e.g. THREE.Vector3 for XYZ stuff) years before LLMs popularized them.


Wait, do you really not know what a transformer is in the context of ML? It’s been dominating the field for 7 years now.


Can't read? I just explained thoroughly what it is in the comment above. Do you understand what matrix transformations are?

Do you know that a vector in LLMs for word embeddings is the same thing as a vector in 3D game dev libraries like Three.js?

Sounds like you two are the only ones who don't get it.


Please do yourself a favor and google “transformer paper”. Open the very first result and read the PDF. Hopefully it will become clear what people mean when they say “transformer” in an ML context, and you will finally realize how silly you look in this thread.


You guys both broke the site guidelines badly in this thread. We have to ban accounts that post like this, so please don't.

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.


You still don't get it. For LLMs a "transformer architecture" only means one that:

- Tokenizes sequences

- Converts tokens to vectors

- Performs vector/matrix transformations

- Converts back to tokens

The matrix transformation part is why it's called a "transformer". Do some reading yourself: https://en.wikipedia.org/wiki/Transformer_(deep_learning_arc...
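
The four steps listed above can be sketched as a toy pipeline in Python. This is only an illustration of the steps as this comment describes them, not a real transformer: the vocabulary, the 2-D embeddings, the fixed 2x2 weight matrix, and all function names are made up for the example, whereas real models learn their weights and stack attention layers, none of which is modeled here.

```python
# Toy vocabulary with made-up 2-D embeddings (assumption, for illustration only).
VOCAB = ["north", "east", "south", "west"]
EMBED = {
    "north": [0.0, 1.0],
    "east": [1.0, 0.0],
    "south": [0.0, -1.0],
    "west": [-1.0, 0.0],
}

# Fixed matrix standing in for learned weights: maps (x, y) to (y, -x),
# i.e. a 90-degree clockwise rotation of each vector.
WEIGHTS = [
    [0.0, 1.0],
    [-1.0, 0.0],
]


def tokenize(text):
    """Step 1: split the input sequence into tokens."""
    return text.lower().split()


def embed(tokens):
    """Step 2: convert each token to its vector."""
    return [EMBED[token] for token in tokens]


def transform(vectors):
    """Step 3: multiply each vector by the weight matrix."""
    return [
        [sum(w * x for w, x in zip(row, vec)) for row in WEIGHTS]
        for vec in vectors
    ]


def decode(vectors):
    """Step 4: map each vector back to the token with the largest dot product."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return [max(VOCAB, key=lambda tok: dot(EMBED[tok], vec)) for vec in vectors]


def run(text):
    return decode(transform(embed(tokenize(text))))


result = run("north east")
```

With these made-up values, each direction word is mapped to the next one clockwise, and the output is computed by arithmetic on the vectors rather than read from a rule table.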

> how silly you look

You'll look twice as silly after thinking vectors are unique to LLMs, or that the word "transformer" has anything to do with LLMs rather than lower-level array math.

Consider that a "vector database" is a very specific technology - yet the word "vector" is not off limits in other database related libraries, especially if dealing with vectors.

In any case - if you think I'm trying to pass it off as something else, what I call "transformer" does tokenize lots of text (breaks it down by ~word, ~pixel) and derives semantic values (AKA trains) to produce real-time completions to inputs by way of math, not lookups. It fits the definition even in that sense where "transformer" meant something more abstract than the mathematical term.


You guys both broke the site guidelines badly in this thread. We have to ban accounts that post like this, so please don't.

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.


I didn't know it was that strict. No offense to the other poster; it was just a little disagreement :)



