I'm not sure what the Venn diagram of knowledge needed to understand that sentence looks like, but it's probably more crowded in the intersection than one might think.
> 25K parameters is about 70 million times smaller than GPT-4. It will produce broken sentences. That's the point - the architecture works at this scale.
Since it seems to just produce broken and nonsensical sentences (at least based on the one example given) I'm not sure if it does work at this scale.
Anyway, as written this passage doesn't make a whole lot of sense (the point is that it produces broken sentences?), and given that it was almost certainly written by an AI, it demonstrates that the architecture doesn't work especially well at any scale (I kid, I kid).
Ok now we need 1541 flash attention.
How does it compare to a Markov chain generator, I wonder.
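For reference, the Markov chain baseline being alluded to is tiny: a word-level bigram generator just records which words follow which, then does a random walk over that table. A minimal sketch (the corpus and function names here are made up for illustration):

```python
import random

def train_bigrams(text):
    """Build a bigram table: each word maps to the list of words seen after it."""
    words = text.split()
    table = {}
    for a, b in zip(words, words[1:]):
        table.setdefault(a, []).append(b)
    return table

def generate(table, start, length=10, seed=0):
    """Walk the chain from `start`, picking a random recorded successor each step."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # dead end: the last word never appeared mid-corpus
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran"
table = train_bigrams(corpus)
print(generate(table, "the"))
```

Such a model has no parameters to speak of beyond the table itself, which is part of why it's an interesting point of comparison against a 25K-parameter transformer.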
Eliza called, and asked if we saw her grandkids...
What makes you say that? This is about you, not me.
(Came here to say an update to Eliza could really mess with the last person still talking to her.)
i hate ai, and i love the c64, but i'll allow it.