Hacker News
Exllamav2: Inference library for running LLMs locally on consumer-class GPUs (github.com/turboderp)
322 points by Palmik on Sept 13, 2023 | past | 125 comments
ExLlama: Memory efficient way to run Llama (github.com/turboderp)
3 points by Palmik on Aug 16, 2023 | past
