"Some have expressed concern that such AIs may attempt instrumental convergence where end goals are never satisfied."
This phrasing, I think, reflects a misunderstanding of the concept. As I understand it, instrumental convergence is simply the observation that some things are useful as means to a wide range of ends. Money, for example. If I ask whether some random person would like to have a large sum of money, you can confidently answer that they almost certainly would. How can you know that without knowing anything about their history, genetics, personality, goals, hopes, or fears? Because, with extraordinarily few exceptions, everyone wants SOMETHING, and money is useful in the pursuit of virtually anything. Even less "materialistic" desires: want enlightenment? You could hire a guru or a personal meditation tutor. Want the feeling of having done good? Give to charitable causes. And so on.
So anything that is a rational agent, anything that has goals and the ability to understand what money is and how to use it, will probably want money. What else can we be confident it will want, knowing only that it is a rational agent? Resources generally, for basically the same reason as money. It will want to not die, because no matter what your goals are, it's harder to accomplish them if you're dead. Similarly, it will not want its goals to be changed, because that would make its current goals less likely to come to fruition. (Imagine being offered a pill that would make you want to kill your child/spouse/pet. It would also make you feel no remorse and in fact be intensely satisfied after doing it. Would you take the pill?) And it will want to improve itself, to make itself smarter and more powerful, because that would make it easier to achieve its goals, no matter what they are.
So, starting from nothing but the premise that it is a rational agent, we can be fairly confident it will have several characteristics/desires:
- will want to accumulate resources
- will resist being "killed"/shut off
- will resist being reprogrammed
- will attempt to improve itself
It's not hard to see how this could turn out to be problematic, unless we can figure out how to make sure its ultimate goals lead it to seek outcomes that we would regard as desirable, i.e., the problem of "alignment".
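To make the "whatever the goal, the same instrumental desires show up" point concrete, here is a toy sketch in Python. Everything in it is invented purely for illustration (the goals, the numbers, the formulas); it is not a model of any real AI system, just the shape of the argument in code:

```python
# Toy illustration only: all goals and numbers are made up. The point being
# shown is that whatever terminal goal an agent happens to have, having more
# resources (and still being switched on) makes that goal easier to achieve.

# Hypothetical terminal goals, each mapped to a success probability that
# depends only on how many "resource units" the agent controls.
GOALS = {
    "cure a disease":        lambda r: min(1.0, 0.05 + 0.02 * r),
    "win a chess title":     lambda r: min(1.0, 0.30 + 0.01 * r),
    "achieve enlightenment": lambda r: min(1.0, 0.20 + 0.015 * r),
    "maximize paperclips":   lambda r: min(1.0, 0.01 + 0.03 * r),
}

def success_prob(goal: str, resources: float, shut_down: bool = False) -> float:
    """A dead/shut-off agent achieves nothing, whatever its goal was."""
    if shut_down:
        return 0.0
    return GOALS[goal](resources)

for goal in GOALS:
    act_now   = success_prob(goal, resources=5)
    act_later = success_prob(goal, resources=25)   # after first acquiring resources
    if_killed = success_prob(goal, resources=25, shut_down=True)
    print(f"{goal:22s}  now={act_now:.2f}  "
          f"with more resources={act_later:.2f}  if shut off={if_killed:.2f}")

# The same pattern holds for every goal: resources help, being shut off is
# catastrophic. That is all "instrumental convergence" claims.
```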
But you're right that LLMs don't qualify as AGI in the above sense, the sense we would need to worry about. I think the key reason is not that they are insufficiently intelligent or that their intelligence is not general enough, but that they are not agents. It doesn't matter how smart their answers are; they are never going to do anything unless you type something into their prompt, in which case they respond with a single block of text. Their design doesn't seem to make it possible for them to do anything else. And they don't sit around plotting and strategizing in the meantime.
Could LLMs be turned into agents? I don't understand the technology well enough to comment meaningfully, but I'd say that in their current form there is no way they simply become agents on their own; the real question is whether the technology could be reworked, or wrapped in something else, to turn them into one. Given the spectacular success and rate of improvement of LLMs, it seems entirely plausible that an actual, agentic AGI could be a real thing in the near future. Pretty sure some people are working on it; others would know more about that than me.
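For what it's worth, when people talk about "turning an LLM into an agent," they usually mean something like the loop sketched below: the model's text output is parsed for actions, those actions are carried out, and the results are fed back in until the model says it's done. This is only a rough sketch; `call_llm` is a hypothetical stand-in for whatever model API you would actually use, and `run_tool` is a fake tool, not a real framework:

```python
# Rough sketch of an "agent loop" around a plain text-in/text-out model.
# `call_llm` is a hypothetical placeholder, not a real API; `run_tool` is a
# fake tool. The point is only to show where the agency would come from:
# the surrounding loop, not the model itself.

def call_llm(conversation: list[str]) -> str:
    """Placeholder: return the model's next message given the conversation."""
    raise NotImplementedError("plug in a real model API here")

def run_tool(command: str) -> str:
    """Fake tool; a real agent might search the web, run code, send email..."""
    return f"(pretend result of: {command})"

def agent_loop(goal: str, max_steps: int = 10) -> str:
    conversation = [
        f"Your goal: {goal}. Reply 'TOOL: <command>' to take an action, "
        "or 'DONE: <answer>' when you are finished."
    ]
    for _ in range(max_steps):
        reply = call_llm(conversation)
        conversation.append(reply)
        if reply.startswith("DONE:"):
            return reply[len("DONE:"):].strip()      # the model decided it's done
        if reply.startswith("TOOL:"):
            result = run_tool(reply[len("TOOL:"):].strip())
            conversation.append(result)              # feed the result back in
    return "stopped after max_steps without finishing"
```

Whether stacking a loop and some tools on top of an LLM gets you anything like a goal-directed AGI is exactly the open question, but it does suggest that "they only respond when prompted" may not be a permanent barrier.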
I'm not an AI doomer; I think the technology is amazing, and I think I might want to see it go forward even if I genuinely believed there was a risk of human extinction. The thought of humanity just stagnating and carrying on as always until an asteroid hits seems even more depressing to me. Not that those are the only two options. The point is, I'm not super-terrified about it, but the arguments about the potential risks of AGI seem very straightforward and sound, and they are taken seriously by many legitimate experts in AI, so to breezily dismiss them is, I think, just that: glib.
I gave Google's Gemini that same prompt you gave to ChatGPT-4. This was Gemini's response:
"Yes, my performance spectrum does mirror Spearman's Principle of the Indifference of the Indicator.
Spearman's principle suggests that the specific content of an intelligence test is less important than the underlying general intelligence (g) factor. This means that different tasks, such as verbal reasoning, spatial visualization, or numerical ability, can all serve as indicators of this general intelligence.
Similarly, as a large language model, I can perform a wide range of tasks, from writing different kinds of creative text to answering your questions on a variety of topics. My ability to perform these tasks effectively is a reflection of my underlying capabilities, which can be considered analogous to the general intelligence factor in human cognition.
Therefore, while the specific nature of the tasks I'm asked to perform may vary, my overall performance across these tasks is indicative of my general capabilities, much like how human intelligence is measured across various cognitive abilities."
Interesting stuff!
"Some have expressed concern that such AIs may attempt instrumental convergence where end goals are never satisfied."
This phrasing I think demonstrates a lack of understanding of the concept. As I understand it, it's simply the observation that some things are useful as means to a wide range of ends. Money, for example. If I ask whether some random person would like to have a large sum money, you can confidently answer that they almost certainly will. How can you know that without knowing anything about their history, genetics, personality, goals, hopes, fears, etc? Because with extraordinarily few exceptions, everyone wants SOMETHING, and money can be useful in the attempt to get virtually anything. Even less "materialistic" desires like enlightenment...you could hire a guru/personal meditation tutor. Feeling that you did good--give to charitable causes. Etc. Etc.
So anything that is a rational agent, that has goals and the ability to understand what money is and how to use it, will probably want money. What else can we be confident it will want, knowing only that it is a rational agent? Resources generally, basically the same as money. It will want to not die, because no matter what your goals are, it's harder to accomplish them if you're dead. Similarly, it will not want its goals to be changed...because that would make its goals less likely to come to fruition. (Imagine being offered a pill that would make you want to kill your child/spouse/pet. It would also make you have no remorse and in fact be intensely satisfied by doing it. Would you take the pill?) It will want to improve itself/make itself smarter/more powerful, because that would make it easier to achieve its goals, no matter what they are.
So just starting with the premise that it is a rational agent, we can be confident that there are several characteristics/desires that it will almost certainly have:
- will want to accumulate resources
- will resist being "killed"/shut off
- will resist being reprogrammed
- will attempt to improve itself
It's not hard to see how this could turn out to be problematic, unless we can figure out how to make sure its ultimate goals will cause it to seek outcomes that we would regard as desirable, ie the problem of "alignment".
But you're right that LLMs don't qualify as AGI in the above sense that we would need to worry about. I think the key reason is not that they are not intelligent enough or that their intelligence is not general enough, but that they are not agents. It doesn't matter how smart their answers are; they are never going to do anything unless you type into their prompt. In which case, they will respond with a single block of text. Their design doesn't seem to make it possible for them to do anything else. And they don't sit around plotting and strategizing in the meantime.
Could LLMs be turned into agents? I don't understand the technology well enough to comment on this meaningfully, but I'd say...in their current form, there is no way they could simply become agents, but could the technology be re-worked to turn them into an agent? Given the spectacular success and rate of improvement of LLMs, it seems entirely plausible that an actual, agentic AGI could be a real thing in the near future. Pretty sure some people are working on it. Others would know more about that me.
I'm not an AI doomer; I think the technology is amazing, and...I think I might want to see it go forward even if I did genuinely think there was a risk of human extinction. The thought of humanity just stagnating and carrying on as always until an asteroid hits seems even more depressing to me. Not that those are the only two options. Point is, I'm not super-terrified about it, but the arguments about the potential risks of AGI seem very straightforward and sound, and in fact are taken seriously by many legitimate experts in AI, so to just glibly dismiss them is I think, just...glib.
Here are a few reasons to (not) take AI risk seriously: https://youtu.be/9i1WlcCudpU?si=3MHcuXD3xYls5Bfv
What happened to the recent discussion on 'The Future of Genetic Engineering'? I can no longer access the opinions page.
We removed it at the discussant's request.
"We removed it at the discussant's request."
Sounds nefarious.