Hacker News
new
|
past
|
comments
|
ask
|
show
|
jobs
|
submit
login
QuadmasterXLII
5 months ago
|
parent
|
context
|
favorite
| on:
“Emergent” abilities in LLMs actually develop grad...
The tokens are the structure over which the attention mechanism is permutation equivariant. This structure permeates the forward pass, its important at every layer and will be until we find something better than attention.
Guidelines
|
FAQ
|
Lists
|
API
|
Security
|
Legal
|
Apply to YC
|
Contact
Search: