I see what you mean, and it's indeed quite likely that texts containing such hypothetical scenarios were included in the dataset.
Nonetheless, the implication is that the model was able to extract the conditional being expressed, recognize when its antecedent was in fact met (or at least asserted: "The queen died."), and then derive the entailed conclusion.
To me that demonstrates a reasoning capability, even if, for example, it memorized/encoded entire Quora threads in its weights (which seems unlikely).
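The inference pattern being attributed to the model here is essentially modus ponens applied to a stored conditional. A toy sketch of that pattern (purely illustrative; the rule table, its contents, and the `infer` function are hypothetical, not anything the model literally contains):

```python
# Toy illustration of modus ponens over an extracted conditional.
# The rule below stands in for knowledge like
# "If the queen dies, then Charles becomes king."
rules = {"the queen died": "Charles is now king"}

def infer(assertion: str):
    """Given an asserted fact, apply any matching conditional."""
    # Normalize the assertion before lookup (lowercase, drop the period).
    return rules.get(assertion.lower().strip("."))

print(infer("The queen died."))  # → Charles is now king
```

The point of the sketch is only the three-step structure: a conditional is stored, an assertion is matched against its antecedent, and the consequent is produced.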
If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.