A prevailing sentiment online is that GPT-4 still does not understand what it talks about. We can argue semantics over what “understanding” truly means. I think it’s useful, at least today, to draw the line at whether GPT-4 has successfully modeled parts of the world. Is it just picking words and connecting them with correct grammar? Or does the token selection actually reflect parts of the physical world?
One of the most remarkable things I’ve heard about GPT-4 comes from an episode of This American Life titled “Greetings, People of Earth”.
The bit you quoted is referring to training.
Recent papers say otherwise.
The conclusion the author of that article comes to (LLMs can understand animal language) is… problematic at the very least. I don’t know how they expect that to happen.
In what sense does your link say otherwise? Is a world model the same thing as intelligence?
At the end of the bit I quoted you say: “basically no world at all.” But also, can you define what intelligence is? Are you sure it isn’t whatever LLMs are doing under the hood, deep in hidden layers? I guess having a world model is more akin to understanding than intelligence, but I don’t think we have a great definition of either.
Edit to add: More… papers…
From the Encyclopedia Britannica:
In no sense do LLMs do any of these except, perhaps, “understand and handle abstract concepts.” But since they themselves have no understanding of the concepts, and merely generate text that can simulate understanding, I would call that a stretch.
Yes. LLMs are not magic, they are math, and we understand how they work. Deep under the hood, they are manipulating mathematical vectors that in no way are connected representationally to words. In the end, the result of that math is reapplied to a linguistic model and the result is speech. It is an algorithm, not an intelligence.
I’m not really interested in papers that either don’t understand LLMs or play word games with intelligence (shockingly, solipsism is an easy point of view to believe if you just ignore all evidence). For every one of these, you can find a dozen that correctly describe ChatGPT and its limitations. Again, including ChatGPT itself. Why not believe those instead of cherry-picking articles that gratify your ego?
I mean, my first paper was from Max Tegmark. My second paper was from Microsoft. You are discounting a well known expert in the field and one of the leading companies working on AI as not understanding LLMs.
I note that’s the definition for “human intelligence.” But either way, sure, LLMs alone can’t learn from experience (after training and between multiple separate contexts), and they can’t manipulate their environment. BabyAGI, AgentGPT, and similar things can certainly manipulate their environment using LLMs and learn from experience. LLMs by themselves can totally adapt to new situations. The paper from Microsoft discusses that. However, for sure, they don’t learn the way people do, and we aren’t currently able to modify their weights after they’ve been trained (well, not without a lot of hardware). They can certainly do in-context learning.
We understand how they work? From the Wikipedia page on LLMs:
It goes on to mention a couple things people are trying to do, but only with small LLMs so far.
Here’s a quote from Anthropic, another leader in AI:
They’re working on trying to understand LLMs, but aren’t there yet. So, if you understand how they do what they do, then please let us know! It’d be really helpful to make sure we can better align them.
Is this not what word/sentence vectors are? Mathematical vectors that represent concepts that can then be linked to words/sentences?
Anyway, I think time will tell here. Let’s see where we are in a couple years. :)
You are misunderstanding both this and the quote from Anthropic. They are saying the internal vector space that LLMs use is too complicated and too unrelated to the output to be understandable to humans. That doesn’t mean they’re having thoughts in there: we know exactly what they’re doing inside that vector space – performing very difficult math that seems totally meaningless to us.
The vectors do not represent concepts. The vectors are math. When the vectors are sent through language decomposition they become words, but they were never concepts at any point.
Yes, that’s exactly what I’m saying.
I mean. Not in the way we do, and not with any agency, but I hadn’t argued either way on thoughts because I don’t know the answer to that.
Huh? We know what they are doing but we don’t? Yes, we know the math, people wrote it. I coded my first neural network 35 years ago. I understand the math. We don’t understand how the math is able to do what LLMs do. If that’s what you’re saying then we agree on this.
“The neurons are cells. When neurotransmitters are sent through the synapses, they become words, but they were never concepts at any point.”
What do you mean by “they were never concepts”? Concepts of things are abstract. Nothing physical can “be” an abstract concept. If you think about a chair, there isn’t suddenly a physical chair in your head. There’s some sort of abstract representation. That’s what word vectors are. Different from how it works in a human brain, but performing a similar function.
From this page. Or better still, this article explaining how they are used to represent concepts. Like this is the whole reason vector embeddings were invented.
We do understand how the math results in LLMs. Reread what I said. The neural network vectors and weights are too complicated to follow for an individual, and do not relate on a 1:1 mapping with the words or sentences the LLM was trained on or will output, so individuals cannot deduce the output of an LLM easily by studying its trained state. But we know exactly what they’re doing conceptually, and individually, and in aggregate. Read your own sources from your previous post, that’s what they’re telling you.
Concepts are indeed abstract but LLMs have no concepts in them, simply vectors. The vectors do not represent concepts in anything close to the same way that your thoughts do. They are not 1:1 with objects, they are not a “thought,” and anyway there is nothing to “think” them. They are literally only word weights, transformed to text at the end of the generation process.
Your concept of a chair is an abstract thought representation of a chair. An LLM has vectors that combine or decompose in some way to turn into the word “chair,” but are not a concept of a chair or an abstract representation of a chair. It is simply vectors and weights, unrelated to anything that actually exists.
That is obviously totally different in kind to human thought and abstract concepts. It is just not that, and not even remotely similar.
You say you are familiar with neural networks and AI but these are really basic underpinnings of those concepts that you are misunderstanding. Maybe you need to do more research here before asserting your experience?
Edit: And in relation to your links – the vectors do not represent single words, but tokens, which indeed might be a whole word, but could just as well be part of a word or an entire phrase. Tokens do not represent the meaning of a word/partial word/phrase, just the statistical use of that word given the data the word was found in. Equating these vectors with human thoughts oversimplifies the complexities inherent in human cognition and misunderstands the limitations of LLMs.
Just so incredibly wrong. Fortunately, I’ll save myself time arguing with such a misunderstanding. GPT-4 is here to help:
Can you define and give examples of what you mean at each level here? Maybe we’re just not understanding each other and mean the same thing.
The Anthropic one is saying they think they have a way to figure it out, but it hasn’t been tested on large models. This is their last paragraph:
They are literally only able to do this on a small one-layer transformer model. GPT-3 has 96 layers and 175 billion parameters.
Also, in their linked paper:
Under the Future Work heading:
How are you getting from that that this is a solved problem?
Again, you aren’t making sense here. Word/sentence vectors are literally a way to represent the concept of those words/sentences. That’s what they were built for. That’s how they are described. Let’s take a step back to try to understand each other.
Are you trying to say that only human minds can understand concepts? I don’t buy the “human brains are magic” bit, and neither does our current understanding of physics. Are you assuming I’m saying that LLMs are sentient, conscious, have thoughts or similar? I’m not. Jury’s out on the thought thing, but I certainly don’t believe the other two things. There’s no magic with them, same with human brains. We just don’t fully understand what happens inside either. Anthropic in the work I quoted is making good progress at that, and I think they may be pretty close, but in terms of LLMs (and not Small LMs), they are still a black box. We know the math behind them, the software, etc. We have some theories. We still do not understand. If you can prove otherwise, please provide me with a source. Stuff is happening really fast in AI, and maybe I blinked and missed something.
I think you’re maybe having a hard time with using numbers to represent concepts. While a lot less abstract, we do this all the time in geometry. ((0, 0), (10, 0), (10, 10), (0, 10), (0, 0)) What’s that? It’s a square. Word vectors work differently but have the same outcome (albeit in a more abstract way).
I was talking word vectors where the vectors DO represent words. It’s in the name. LLMs don’t specifically use word vectors, but the embeddings they do use work similarly.
You are correct that tokens don’t represent the meaning of a word. However, tokens are scalars (integer IDs). You are conflating tokens and embeddings / word vectors here. Tokens are used to simplify converting a string into a format a neural network can understand (a vector). If we used each ASCII character in the input/output string as a vector input to the network, we’d have to have a lot more parameters than if we combine the characters in some way (i.e. tokens). As you said, they can be a word or a part of a word. There are no statistics embedded in the tokens (there are some methods of using statistics to choose what tokens to use, but that’s decided before even training the model and cannot change afterward [with our current approach]). You can read here for more information on tokens. Or you can play around with the GPT-3 tokenizer.
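To make the token vs. embedding distinction concrete, here’s a minimal sketch. Everything in it is made up for illustration: the tiny vocabulary, the greedy longest-match rule (real tokenizers such as BPE pick pieces differently), and the random embedding table standing in for an untrained model. The point is only that a token is a plain integer ID, and the vector comes from a separate lookup table.

```python
import random

# Hypothetical toy vocabulary: each token (a word or word piece) maps to an integer ID.
VOCAB = {"un": 0, "believ": 1, "able": 2, "chair": 3, " ": 4}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenizer (a simplification); returns integer token IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Pick the longest vocabulary entry that matches at position i.
        match = max((t for t in vocab if text.startswith(t, i)), key=len, default=None)
        if match is None:
            raise ValueError(f"no token covers {text[i]!r}")
        ids.append(vocab[match])
        i += len(match)
    return ids

# Hypothetical embedding table: one vector per token ID.
# The values are random here; in a real model they are learned during training.
EMBED_DIM = 4
rng = random.Random(0)
EMBEDDINGS = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB]

def embed(ids):
    """Look up the vector for each token ID."""
    return [EMBEDDINGS[i] for i in ids]

token_ids = tokenize("unbelievable chair")  # plain integers, no meaning attached
vectors = embed(token_ids)                  # one EMBED_DIM-sized vector per token
print(token_ids)
print(len(vectors), "vectors of size", len(vectors[0]))
```

The IDs themselves are arbitrary labels fixed before training. Whatever structure the model picks up lives in the embedding table and the rest of the weights.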
If you know Python, you should grab nltk and gensim and experiment with word vectors.
king + woman - man = queen
Seems like an abstract representation of those things as concepts using math. For the record, word vectors are actually pretty understandable/understood by people because you can visualize them easily. When you do, you find similar concepts clustered together (this is how vector search works except with text embeddings). Anyway, it just really seems like linking numbers to concepts is not clicking with you, or you somehow think it’s not possible. Reading up on computational linguistics might help.
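Here’s a rough sketch of that king/queen arithmetic if you don’t want to download a model. Big caveat: these vectors are hand-made by me for illustration (three invented dimensions: roughly “royalty”, “maleness”, “fruitiness”), not trained embeddings. With real trained vectors, which have hundreds of dimensions, the result is only the approximate nearest neighbor, and the input words are usually excluded from the search the same way I do here.

```python
import math

# Hypothetical hand-made vectors: (royalty, maleness, fruitiness).
# Real word vectors are learned from text and are not this tidy.
VECTORS = {
    "king":   (0.9,  0.9, 0.1),
    "queen":  (0.9, -0.9, 0.1),
    "man":    (0.1,  0.9, 0.0),
    "woman":  (0.1, -0.9, 0.0),
    "prince": (0.7,  0.8, 0.1),
    "apple":  (0.0,  0.0, 1.0),
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to a - b + c, skipping the three input words."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("king", "man", "woman"))
```

Similar concepts end up pointing in similar directions, which is why the nearest-neighbor lookup works at all.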
Yes, neural networks (although initially conceived as a computer version of a neuron) are a lot different from how actual brains work, as we’ve learned in the decades since they were invented. If you’re saying that intelligence and understanding are limited to the human mind, then please point to some non-religious literature that backs up your assertion.
I’m pretty confident in my understanding, though I’m always open to new ideas that are backed with peer reviewed research. I’m not going to get into a dick waving contest here, so I guess we’ll have to agree to disagree.
As a side note, going back to your definition of intelligence. That was for psychology. I’ll note that the Wikipedia page for Intelligence has this to say:
And so I’ll reiterate that we don’t have a good definition of intelligence.
You really, truly don’t understand what you’re talking about.
If this community values good discussion, it should probably just ban statements that manage to be this wrong. It’s like when creationists say things like “if we came from monkeys why are they still around???”. The person has just demonstrated such a fundamental lack of understanding that it’s better to not engage.
Oh, you again – it’s incredibly ironic you’re talking about wrong statements when you are basically the poster child for them. Nothing you’ve said has any grounding in reality, and is just a series of bald assertions that are as ignorant as they are incorrect. I thought you would’ve picked up on it when I started ignoring you, but: you know nothing about this and need to do a ton more research to participate in these conversations. Please do that instead of continuing to reply to people who actually know what they’re talking about.
What research have you done?
How would creating a world model from scratch not involve intelligence?
It’s not from scratch, it’s seeded and trained by humans. That is the intelligence.
Just like humans are! Do you know what happens when a human grows up without any training by other humans? They are essentially feral, unable to communicate, maybe even unable to think the way we do.
LLMs do not grow up. Without training they don’t function properly. I guess in this aspect they are similar to humans (or dogs or anything else that benefits from training), but that still does not make them intelligent.
What does it mean to “grow up”? LLMs get better at their tasks during training, just as humans do while growing up. You have to clearly define the terms you use.
You used the term and I was using it with the same usage you were. Why are you quibbling semantics here? It doesn’t change the point.
Yes, I used the term because “growing up” has a well-defined meaning with humans. It doesn’t with LLMs, so I didn’t use it with LLMs.
From scratch in the sense that it starts with random weights, and then experiences the world and builds a model of it through the medium of human text. That’s because text is computationally tractable for now, and has produced really impressive results. There’s no inherent need for text to be used, though; similar models have been trained on time series data, and it will soon be feasible to hook up one of these models to a webcam and a body and let it experience the world on its own. No human intelligence required.
Also, your point is kind of silly. Human children learn language from older humans, and that process has been recursively happening for billions of years, all the way through the first forms of life. Do children not have intelligence? Or are you positing some magic moment in human evolution where intelligence just descended from the heavens and blessed us with it?