Accelerating Gemma 4: faster inference with multi-token prediction drafters

(blog.google)

180 points | by amrrs | 3 hours ago

62 comments