ZAYA1-8B matches DeepSeek-R1 on math with less than 1B active parameters

(firethering.com)

70 points | by steveharing1 | 12 hours ago

49 comments