obviously there's also a multimodality gap to be overcome before intimately understanding the $100M screw to turn, but i suspect most reasoning that matters has already been translated into and embedded in words. i wouldn't underestimate the amount of useful knowledge that exists in embedding advanced texts into an LLM. the challenge is contextually hierarchizing it (a matter of reasoning) and decoding it back into reality (words are dimensionally squished encodings of reality).
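
to make the "squished encoding" point concrete, here's a minimal sketch (assuming the sentence-transformers library and the all-MiniLM-L6-v2 checkpoint, both just stand-ins for any text-embedding model): rich text goes in, a fixed-length vector comes out, and the decoding direction is where the loss shows up.

```python
# a minimal sketch, not anyone's actual pipeline: embed two descriptions of the
# same physical act and see what survives the squish.
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed dependency

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical choice of model

texts = [
    "Torque the M8 fastener to 25 Nm in a star pattern to avoid warping the flange.",
    "Tighten the bolt carefully so the plate does not bend.",
]

# each string, regardless of length or nuance, becomes one 384-dimensional
# vector: a lossy, dimensionally squished encoding of the reality it described.
vectors = model.encode(texts)
print(vectors.shape)  # (2, 384)

# the encoding keeps enough structure to compare meanings...
sim = np.dot(vectors[0], vectors[1]) / (
    np.linalg.norm(vectors[0]) * np.linalg.norm(vectors[1])
)
print(f"cosine similarity: {sim:.2f}")

# ...but it is not invertible: nothing recovers the original torque spec from
# the vector, which is the "decoding back into reality" problem.
```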