Given the same fundamentals, such as a transformer architecture, multiple models trained on data about the same world are going to converge on representation as a matter of course. They're going to diverge where the underlying manner in which data gets memorized and encoded differs, as with RNNs like RWKV.
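A minimal sketch of how "convergence of representation" could be checked empirically: linear Centered Kernel Alignment (CKA) between the hidden states two models produce for the same inputs. The activation matrices below are synthetic stand-ins, and the framing of comparing a transformer layer against an RWKV layer is an assumption for illustration, not something established in the text.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices of shape (n_samples, dim)."""
    # Center each feature so similarity is not driven by mean activations.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # within-model norms; values near 1 indicate matching representational geometry.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for hidden states from two models run on the same 2000 inputs;
    # in practice these would be layer activations from, e.g., a transformer
    # and an RWKV model evaluated on the same text.
    shared = rng.normal(size=(2000, 16))
    acts_a = shared @ rng.normal(size=(16, 32))  # model A: projection of shared structure
    acts_b = shared @ rng.normal(size=(16, 24))  # model B: different width, same structure
    acts_c = rng.normal(size=(2000, 24))         # unrelated representation
    print("shared structure:", round(linear_cka(acts_a, acts_b), 3))  # noticeably higher
    print("unrelated       :", round(linear_cka(acts_a, acts_c), 3))  # near zero
```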
The interesting bit should be the convergence of representation between human brains and transformer models, or between brains and RWKV, because the data humans collect is implicitly framed by human cognitive systems and sensors.
The words, qualia, and principles we use in thinking about things, communicating, and recording data are going to anchor all of that data in a fundamental and inescapable ontological way. That anchoring constrains the manner in which higher-order extrapolations and derivations can be structured, and those structures are going to overlap with human constructs.