Hacker News
How Do Language Models Put Attention Weights over Long Context? (yaofu.notion.site)
2 points by jxmorris12 6 months ago
Towards 100x Speedup: Full Stack Transformer Inference Optimization (yaofu.notion.site)
1 point by magoghm on Dec 10, 2023
A Stage Review of Instruction Tuning (yaofu.notion.site)
2 points by panabee on June 29, 2023
Towards Complex Reasoning: The Polaris of Large Language Models (yaofu.notion.site)
2 points by bluehat974 on May 31, 2023
Towards Complex Reasoning: The Polaris of Large Language Models (yaofu.notion.site)
2 points by tim_sw on May 3, 2023
How Does GPT Obtain Its Ability? Tracing Emergent Abilities of Language Models (yaofu.notion.site)
4 points by johnthewise on April 18, 2023
How does GPT obtain its ability? Tracing emergent abilities of language models (yaofu.notion.site)
414 points by headalgorithm on Dec 14, 2022 | 192 comments
How Does GPT Obtain Its Ability? Tracing Abilities of LMs to Their Sources (yaofu.notion.site)
2 points by magoghm on Dec 13, 2022
