KVarN: Native vLLM backend for KV-cache quantization by Huawei

(github.com)

66 points | by theanonymousone | 3 hours ago

7 comments