Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM

(arxiv.org)

19 points | by dryarzeg 5 hours ago

3 comments