Hacker News
vLLM v0.6.0: 2.7x Throughput Improvement and 5x Latency Reduction (vllm.ai)
3 points by xmo 5 days ago | past | discuss
vLLM automatic prefix / prompt caching (vllm.ai)
2 points by danielhanchen 16 days ago | past | 1 comment
vLLM hosts local LLMs easily (vllm.ai)
2 points by myprotegeai 37 days ago | past
Llama 3.1 Support in vLLM (vllm.ai)
2 points by e12e 49 days ago | past
vLLM (vllm.ai)
2 points by jonbaer 4 months ago | past
vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (vllm.ai)
2 points by udev4096 8 months ago | past
Notes on vLLM vs. DeepSpeed-FastGen (vllm.ai)
3 points by Palmik 10 months ago | past
vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (vllm.ai)
295 points by wskwon on June 20, 2023 | past | 42 comments

