vLLM large-scale serving: DeepSeek 2.2k tok/s/H200 with wide-EP

(blog.vllm.ai)

52 points | by robertnishihara | 11 hours ago

2 comments