
I'll go with an example to demonstrate why that's not always enough. Many people are quite keen to know what this function is actually doing:

  float InvSqrt(float x){
      float xhalf = 0.5f * x;
      int i = *(int*)&x;              // reinterpret the float's bits as an int
      i = 0x5f3759df - (i >> 1);      // magic constant minus half the bit pattern: rough guess at 1/sqrt(x)
      x = *(float*)&i;                // reinterpret the bits back as a float
      x = x*(1.5f - xhalf*x*x);       // one Newton-Raphson step to refine the estimate
      return x;
  }
From https://betterexplained.com/articles/understanding-quakes-fa...
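To see what it actually computes, here's a small test harness (my own addition, not from the article) comparing InvSqrt against the plain 1.0f / sqrtf(x):

  #include <math.h>
  #include <stdio.h>

  // Assumes the InvSqrt definition above is in the same file; link with -lm.
  int main(void) {
      for (float x = 1.0f; x <= 16.0f; x *= 2.0f)
          printf("x = %4.1f  InvSqrt = %.6f  1/sqrt = %.6f\n",
                 x, InvSqrt(x), 1.0f / sqrtf(x));
      return 0;
  }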

In my case I don't have a huge amount of time to chase down every rabbit hole, but I'd love to accelerate my intuition for LLMs. Multiple points of intuition or comparison really help. I'm also not a Python expert - what you see and what I see from a line of code will be quite different.




The author is attempting to build an explicit mental model of what a bunch of weights are "doing". That's not really the same thing: the weights are just whatever minimizes the loss function.

People try to (and often do) build intuition for which architectures will work given the layout of the data. But the reason models are so big now is that trying to understand what the model is "doing" in a way humans can follow didn't work out so well.
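To make "minimizing the loss function" concrete, here's a toy sketch (my own illustration, not anything from the article): a single weight fit to a single data point by gradient descent. Nothing in the loop builds an explanation of the weight; it just follows the gradient downhill.

  #include <stdio.h>

  // Toy model: prediction = w * x, loss = (w*x - y)^2.
  int main(void) {
      float x = 2.0f, y = 6.0f;   // one training example, so the "right" w is 3.0
      float w = 0.0f;             // the weight being learned
      float lr = 0.05f;           // learning rate
      for (int step = 0; step < 200; step++) {
          float err  = w * x - y;        // prediction error
          float grad = 2.0f * err * x;   // d(loss)/dw
          w -= lr * grad;                // step downhill on the loss
      }
      printf("learned w = %f\n", w);     // ends up at ~3.0
      return 0;
  }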



