I've been wondering how they've been able to be so generous with Composer usage with it still making business sense. Seems like this is the answer: presumably they think they'll have a competitive advantage in not just the UX space but the model space as well soon. It's a great strategy, but I do wonder if the moat will be big enough with how fast things are moving and how competitive the model landscape is.
>We used a Kimi base, with midtraining and RL on top. Going forward, we'll include the base used in our blog posts, that was a miss. Also, the license is through Fireworks. [0]
And still no mention of Kimi in a new blog post :)
Also, apparently the inference provider they use, Fireworks AI, already has a built-in API for RL tuning Kimi [1], so I wonder which parts are Cursor's own effort and how much credit actually belongs to Fireworks AI, especially since they repeatedly brag about being able to create a new checkpoint every 5 hours, which would be largely thanks to Fireworks AI's API/training infrastructure.
I mean, I'm genuinely curious how much effort it would actually take me to go from "here's lots of user data" to "the model gains +1% on benchmarks" with my own finetune, assuming I'm already using a good foundation model, my inference provider already handles all the tuning infrastructure, and I already have plenty of usage logs.
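To make that concrete: the part the provider doesn't do for you is turning raw usage logs into reward-labeled training examples. Here's a minimal hypothetical sketch of that step; the log fields (`accepted`, `edited_after`), the reward heuristic, and the output record shape are all made up for illustration, and the actual submission call would be provider-specific.

```python
# Hypothetical sketch: turning editor usage logs into reward-labeled
# examples for an RL finetuning job. The field names and reward
# heuristic below are illustrative assumptions, not any provider's
# real schema.
import json

def log_to_example(log_entry):
    """Map one usage log to a (prompt, completion, reward) record.

    Reward heuristic (assumed): 1.0 if the user accepted the
    suggestion as-is, 0.5 if they accepted but then edited it,
    0.0 if they rejected it.
    """
    if log_entry["accepted"] and not log_entry["edited_after"]:
        reward = 1.0
    elif log_entry["accepted"]:
        reward = 0.5
    else:
        reward = 0.0
    return {
        "prompt": log_entry["prompt"],
        "completion": log_entry["completion"],
        "reward": reward,
    }

def build_dataset(logs):
    """Keep only interactions with an accept/reject signal."""
    return [log_to_example(e) for e in logs if "accepted" in e]

if __name__ == "__main__":
    logs = [
        {"prompt": "def add(a, b):", "completion": "    return a + b",
         "accepted": True, "edited_after": False},
        {"prompt": "def sub(a, b):", "completion": "    return a - b",
         "accepted": False, "edited_after": False},
    ]
    print(json.dumps(build_dataset(logs), indent=2))
    # A provider's tuning API would then take this dataset plus a base
    # model id and return a new checkpoint; that call is omitted here.
```

Even with the infrastructure handled, I'd guess most of the iteration time goes into getting that reward signal right rather than into the training runs themselves.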
[0] https://news.ycombinator.com/item?id=47459529
[1] https://fireworks.ai/blog/kimi-k2p5
I'd love to see some data on how much it has improved via this process in the last week.
It would be the same as Kimi K2.5, the underlying model.