Hacker News
fredliu on Sept 14, 2023 | on: Efficient Memory Management for Large Language Mod...
I might be wrong, but it looks like this could help with speculative decoding, which can already vastly improve inference speed?