Submissions from github.com/huawei-csl

		KVarN: Native vLLM backend for KV-cache quantization by Huawei (github.com/huawei-csl)
		115 points by theanonymousone 10 hours ago | 12 comments
		Sinkhorn: Make LLMs even smaller through quantisation while maintaining accuracy (github.com/huawei-csl)
		4 points by ilitirit 8 months ago | 1 comment