It is possible that by decoding language, we are decoding reality. Not just because language reflects reality in a general way, but because, if it is true that the columns in the neocortex all do the same thing, and that some of them are responsible for creating, emitting, and receiving language expressions, then the structure of language may reflect the structure of those columns, which in turn reflect the outside world. There seems to be an algorithm running inside them that runs no matter where the input came from, or in what modality it arrived. In other words, it is the master algorithm. Using meaningful language is therefore important in itself, since it tethers a machine learning model more closely to the world, but it is also a means to an end: the end is to understand how the structure of language could reflect the structure of perception and pattern recognition. This would explain why we are so perplexed by ChatGPT. There is a kind of tautological torture, or tease (not sure which), in being puzzled by the structure and nature of the very thing you are trying to analyze. Language reflects the world around us. And because we are part of that world, and because we move through it and see and touch it, parallel processes all pointing to the same reality, it stands to reason that the worldview emergent from language would reflect the features of that world. And those features include physics.
Opposites. Someone facing me; something on the opposite side of the room.
Proximity. Moving closer to something.
Momentum and change of direction. “It’s true that I’m a good cook now, but –” A change of direction is imminent. Whether it’s “I used to be terrible at it” (which is a 180° change in direction) or “I wasn’t always”, which is a contrast to the present state of my culinary prowess. “Not everyone agrees with that assessment” is another candidate. No matter what it is, it is a different direction. The sentence had a certain momentum in one direction, but the ‘but’ changes it.
Spatial and Temporal
Think of counting out apples at the supermarket. If I want to buy five apples, I buy apple number one, number two, number three, number four, and number five. I don’t get apples zero through four. But if I say, “I’ll see you in five minutes,” it really means I’m starting at zero. In this sense the physical world is more present to us than the temporal world. And because we live in the physical world, we are more used to it. The very idea of counting came first from counting things, not from counting time. People were able to pick apples a lot earlier than they were able to tell time to the second or the minute.
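The asymmetry can be made concrete in a few lines of Python (the labels here are invented for illustration): counted objects start at one, while elapsed time starts at zero.

```python
# Counting objects is one-based: the first apple is apple number 1.
apples = [f"apple {n}" for n in range(1, 6)]   # apple 1 .. apple 5, five in all

# Measuring time is zero-based: "see you in five minutes" starts the clock at 0.
minutes_elapsed = list(range(0, 5))            # minutes 0 .. 4 pass before minute 5 arrives
```

Both lists have five elements, but one begins at 1 and the other at 0, which is exactly the off-by-one feeling the apple example points at.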
Why is this important?
Because the things we find most fundamental in our experience of the world are what we are most used to, and what we are most used to is what we perceive so easily that it is almost unconscious. In fact, it most often is unconscious. I don’t think about the fact that I am counting when I am actually out shopping for apples, or when I’m keeping track of a football score, or, for that matter, when I’m using language. The spatial is easier for us to examine than the temporal.
Similarly, we move in the physical world, and this is reflected in our language. Where are you going with this idea? Are you getting it? Does that make sense? These three sentences mean pretty much the same thing, but two of them have physical correlates. “Where are you going with this” is directional. I don’t know if it is more fundamental or less fundamental than just saying, “What are you talking about?” or “Is this making sense?” Those sentences are a level of abstraction up from the physical world. In that sense they are one step removed from it.
Why does this matter?
Because we are talking about abstraction and the nature of abstraction, and the process of abstraction is key to manipulating and discovering artificial general intelligence. It cannot be based only on next-word-in-the-sentence prediction, even though that predictive process may lead to surprisingly accurate “meaningful” results. Imagine that I am cooking in the kitchen, following a recipe, and that you’re watching me and trying to predict my movements. Left? Right? The sink? The second drawer on the right that holds the spatulas and cooking spoons? The trash can? The oven? But suppose you were following along with the recipe I’m reading. You see that the blueberries need to be added to the scones after I’ve grated in the frozen butter, but before the dough goes into the oven, and you see that I am rolling this giant ball of flour and sugar and egg and vanilla. Then, when I read that it’s time to put in the blueberries, that’s when I’m going to go get the blueberries. They’re in the freezer. If you’re trying to predict where I’m going to move in the kitchen next without that information, you may still get it right. But if you do, it’s with a lower probability, not because you know I’m following this particular recipe. This, in a nutshell, is the problem of relying only on the NWITS (next-word-in-the-sentence) model.
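The kitchen example can be sketched as a toy frequency-based predictor (the movement log below is made-up data, purely for illustration). A model that only counts past transitions will predict the historically most common next move; it has no access to the recipe, so it cannot know that the freezer comes next.

```python
from collections import Counter

# Hypothetical log of my past movements around the kitchen.
moves = ["counter", "sink", "counter", "drawer", "counter", "oven",
         "counter", "sink", "counter", "freezer"]

# NWITS-style predictor: tally which location follows which.
transitions = {}
for here, there in zip(moves, moves[1:]):
    transitions.setdefault(here, Counter())[there] += 1

def predict_next(here):
    """Predict the most frequent next location seen after `here`."""
    return transitions[here].most_common(1)[0][0]

# From the counter, raw frequency favors the sink (2 of 4 past transitions),
# even when the recipe says the blueberries (in the freezer) come next.
```

The predictor is not wrong on its own terms: `predict_next("counter")` returns `"sink"` because that is what the counts say. The point is that the recipe, the higher-level plan, is information the frequency table simply does not contain.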
The problem of abstracting meaning, which is really extracting meaning, ironically leads to higher levels of abstraction. Abstraction is meaning; it is operationally defined as such. The question is: how do you get from words to the complex, composed ideas that words express? Clumps of meaning are the currency of our communication with one another, and many of them flow from the physical world. The next word in the sentence does not.