Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint

(modal.com)

78 points | by charles_irl | 11 hours ago

18 comments