Run a 1T-parameter model on a 32 GB Mac by streaming tensors from NVMe

(github.com)

105 points | by tatef | 2 hours ago

51 comments