Dispersion loss counteracts embedding condensation in small language models

(chenliu-1996.github.io)

20 points | by E-Reverance 3 hours ago

5 comments