
I wonder if in an application you could branch on something more abstract than tokens. While there might be 50k possible next tokens, with perhaps 1k of reasonable likelihood, those probably cluster into a few themes you could branch off of. For example, "he ordered a …" [burger, hot dog, sandwich: food] or [coke, coffee, water: drinks] or [tennis racket, bowling ball, etc.: goods].
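A minimal sketch of the idea: take the top-k candidate tokens with their probabilities, group them into themes by embedding similarity, and branch on themes rather than individual tokens. Everything here is hypothetical — the probabilities and embeddings are toy values; a real system would use the model's own token embeddings and a proper clustering method.

```python
# Sketch: collapse top-k next-token candidates into semantic "themes"
# before branching. Toy values throughout; a real system would use the
# model's own token embeddings.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def cluster_candidates(candidates, threshold=0.9):
    """Greedy single-pass clustering: each candidate joins the first
    theme whose anchor embedding is similar enough, else starts a new
    theme. Theme mass is the summed probability of its members."""
    themes = []  # each: {"members": [...], "anchor": [...], "mass": p}
    for token, prob, emb in candidates:
        for theme in themes:
            if cosine(emb, theme["anchor"]) >= threshold:
                theme["members"].append(token)
                theme["mass"] += prob
                break
        else:
            themes.append({"members": [token], "anchor": emb, "mass": prob})
    return sorted(themes, key=lambda t: -t["mass"])

# Toy candidates for "he ordered a ...": (token, probability, fake embedding)
candidates = [
    ("burger",        0.20, [1.0, 0.1, 0.0]),
    ("hot dog",       0.15, [0.9, 0.2, 0.0]),
    ("sandwich",      0.10, [0.95, 0.15, 0.0]),
    ("coke",          0.20, [0.1, 1.0, 0.0]),
    ("coffee",        0.15, [0.2, 0.9, 0.1]),
    ("water",         0.10, [0.15, 0.95, 0.0]),
    ("tennis racket", 0.10, [0.0, 0.1, 1.0]),
]

themes = cluster_candidates(candidates)
for t in themes:
    print(t["members"], round(t["mass"], 2))
```

With these toy vectors the food, drink, and goods tokens fall into three themes, so the branching factor drops from seven candidates to three, each carrying the summed probability mass of its members.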





I was thinking along the same lines, and where I end up is realizing that searching through the space of possible token sequences isn't the way to do it, because the text output space is too far removed from the ground truth. Text is already a biased human projection of reality, and now we're searching through an LLM's crude estimate of that biased projection.

I think a deeper architectural change is required, one that searches internally in the model's latent space. That way we could search in "thought space" instead of "text space", converting the output to text only after the search.
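A heavily simplified toy of "search in thought space, decode to text last": here "thoughts" are just 2D vectors, the search is a greedy walk toward a goal vector, and decoding is a nearest-neighbor lookup into a tiny hand-made phrase table. All of it is invented for illustration — a real model would search over its own latent states and decode with its actual language head.

```python
# Toy sketch: search happens entirely in a latent vector space; text is
# produced only at the end by decoding each visited latent state.
# Phrase table, vectors, and step size are all hypothetical.

def dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

# Hypothetical mapping from latent vectors to text snippets.
phrase_table = {
    (0.0, 0.0): "start",
    (0.5, 0.2): "gather facts",
    (0.9, 0.4): "weigh options",
    (1.0, 1.0): "conclusion",
}

def decode(v):
    # Decode a latent state as its nearest phrase-table entry.
    nearest = min(phrase_table, key=lambda k: dist(k, v))
    return phrase_table[nearest]

def neighbors(v, step):
    # Candidate moves in latent space (axis-aligned and diagonal).
    return [(v[0] + dx, v[1] + dy)
            for dx in (0, step) for dy in (0, step)
            if (dx, dy) != (0, 0)]

def latent_search(start, goal, step=0.25, max_steps=10):
    # Greedy walk in latent space toward the goal; no text involved yet.
    v, path = start, [start]
    while dist(v, goal) > 1e-6 and len(path) <= max_steps:
        v = min(neighbors(v, step), key=lambda c: dist(c, goal))
        path.append(v)
    return path

path = latent_search((0.0, 0.0), (1.0, 1.0))
decoded = [decode(v) for v in path]
print(decoded)
```

The point of the toy is only the ordering: all search decisions are made on vectors, and the phrase table is consulted once, at the end, to render the trajectory as text.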



