Hi HN, we're releasing weights for our latest text-to-image model and publishing this in-depth writeup on how we trained it.
I hope there is something in the report for everyone. We included a fair bit on the actual training and data infrastructure, which usually isn't written about much and which I think will be interesting to people here. There's more that didn't fit, so I'm happy to answer questions!
This is a massive technical report for an open-weights image gen model. As someone who has followed this space closely, it's really cool to read about the behind-the-scenes experimentation and effort that went into the final product. I hope you will release some of the fine-tuning tools so the community can experiment with them as well and really push what the model's capable of.
Interesting item on the careers page btw. For anyone that knows what older school Mellanox was about, it might be your kind of thing: https://jobs.ashbyhq.com/krea/ebe94024-eef6-4306-a019-10072a... :D
Turbo appears GGUF'd already: https://huggingface.co/Abiray/Krea-2-Turbo-GGUF
It's a good model. Sadly, the use of the Qwen VAE is a bit of a downer.
Some have mentioned that using the Wan2.1 VAE instead solves this. I haven't personally had time to try it yet.