Adaptive speculative decoding: picking draft lengths at runtime

(fergusfinn.com)

2 points | by hasheddan 8 hours ago

No comments yet.