Cool that it's possible but basically unusable performance characteristics. For an 8192 token prompt they report a ~1.5 minute time-to-first-token and then 8.30tk/s from there. For context ChatGPT is typically <<1s ttft and ~50tk/s.
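To put those numbers in wall-clock terms, here is a rough sketch of end-to-end response time: time-to-first-token plus decode time. The 500-token reply length is an assumption picked for illustration, and the hosted figure uses 1 s as an upper bound for "<<1s".

```python
def total_latency_s(ttft_s: float, decode_tok_per_s: float, output_tokens: int) -> float:
    """Seconds until the last token arrives: prefill wait plus decode time."""
    return ttft_s + output_tokens / decode_tok_per_s


# Assumed reply length, not taken from the report.
OUTPUT_TOKENS = 500

# ~1.5 min TTFT and 8.30 tk/s, as quoted above for the 8192-token prompt.
local = total_latency_s(ttft_s=90.0, decode_tok_per_s=8.30, output_tokens=OUTPUT_TOKENS)

# Hosted ballpark from above: under 1 s TTFT (1 s used as a ceiling), ~50 tk/s.
hosted = total_latency_s(ttft_s=1.0, decode_tok_per_s=50.0, output_tokens=OUTPUT_TOKENS)

print(f"local:  {local:.0f} s")
print(f"hosted: {hosted:.0f} s")
```

Under those assumptions the local setup takes about two and a half minutes for a full reply, versus roughly eleven seconds hosted.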
Framework has gone fully down the Apple consumerization route of unrepairability and unupgradeability with a nonstandard machine, soldered-on RAM, and no meaningful PCIe slots. There's only the superficial appearance of longevity and future-proofness when it's really yet another silo. There's no way to add IB, FC, or 100/400 GbE NICs to these machines. 5 GbE is a joke. Non-ECC RAM is a joke.
That’s pretty awesome!
Though only 5 Gb Ethernet? Can’t they do USB-C / Thunderbolt 40 Gb/s connections like Macs?
I set up Ollama today and can barely run a 3B-parameter model before the lag makes it unbearable.
How much is one of these gonna run me?
The setup was around $10k, but maybe more now with mem/ssd prices.
This is a good list. I like my Beelink a lot; my Minisforum likes to turn itself off every couple of weeks, not sure why yet.
https://www.techradar.com/pro/there-are-15-amd-ryzen-ai-max-...
---
Performance is pretty bad (<10 tok/s) and context is quite limited. Still good to see progress.
Prompt size (tokens) | TTFT (s), Flash Attention disabled | TTFT (s), Flash Attention enabled
4096 | 53.7 | 39.7
8192 | OOM (out of memory) | 90.5
16384 | OOM (out of memory) | 239.1
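
For a rough sense of prompt-processing speed, the Flash Attention column can be turned into tokens per second. This is a back-of-the-envelope sketch that assumes the whole time-to-first-token is spent on prefill (it ignores any fixed startup cost); the variable and function names are made up for illustration.

```python
# Time to first token in seconds with Flash Attention enabled, from the table above.
TTFT_FLASH_ATTN_S = {4096: 39.7, 8192: 90.5, 16384: 239.1}


def prefill_tok_per_s(prompt_tokens: int, ttft_s: float) -> float:
    """Effective prompt-processing rate, assuming TTFT is all prefill time."""
    return prompt_tokens / ttft_s


rates = {n: prefill_tok_per_s(n, t) for n, t in TTFT_FLASH_ATTN_S.items()}

for n, rate in rates.items():
    print(f"{n:>6} tokens: {rate:6.1f} tok/s prefill")
```

That works out to roughly 103, 91, and 69 tok/s as the prompt grows, so effective prefill speed drops with longer prompts rather than staying flat.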