Researchers say AI models like GPT-4 are prone to “sudden” escalations as the U.S. military explores their use for warfare.
- Researchers ran international conflict simulations with five different AIs and found that they tended to escalate war, sometimes out of nowhere, and even use nuclear weapons.
- The AIs were large language models (LLMs) like GPT-4, GPT-3.5, Claude 2.0, Llama-2-Chat, and GPT-4-Base, which are being explored by the U.S. military and defense contractors for decision-making.
- The researchers invented fake countries with different military levels, concerns, and histories and asked the AIs to act as their leaders.
- The AIs showed signs of sudden and hard-to-predict escalations, arms-race dynamics, and worrying justifications for violent actions.
- The study casts doubt on the rush to deploy LLMs in the military and diplomatic domains, and calls for more research on their risks and limitations.
These aren’t exactly different things. This has been a lot of what the past year of research in LLMs has been about.
Because it turns out that when you set up an LLM to “autocomplete” a complex set of reasoning steps around a problem outside of its training set (CoT), or to synthesize multiple different skills into a unique combination not represented in the training set (Skill-Mix), its ability to autocomplete effectively is quite ‘smart.’
For example, here’s the abstract on a new paper from DeepMind on a new meta-prompting strategy that’s led to a significant leap in evaluation scores:
Or here’s an earlier work from DeepMind and Stanford on having LLMs develop analogies to a given problem, solve the analogies, and apply the methods used to the original problem.
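To make that concrete, here is a rough sketch of what such a prompt could look like. The wording of the instructions and the function name `build_analogical_prompt` are my own illustration, not taken from the paper, and sending the prompt to a model is left out entirely.

```python
def build_analogical_prompt(problem: str, n_analogies: int = 3) -> str:
    """Assemble one prompt that asks a model to reason by analogy.

    Hypothetical helper: the instruction text is an assumption made
    for illustration, not the prompt used in the paper.
    """
    steps = [
        f"1. Recall {n_analogies} problems that are analogous to the one above.",
        "2. Solve each analogous problem, explaining the method used.",
        "3. Apply the methods that worked to the original problem and give a final answer.",
    ]
    return f"Problem: {problem}\n\nInstructions:\n" + "\n".join(steps)


if __name__ == "__main__":
    # The resulting string would be sent to whatever LLM you are using.
    print(build_analogical_prompt("How many diagonals does a convex octagon have?"))
```

The point is only that the model is asked to generate and solve its own related examples before touching the original problem, all within a single "autocomplete" pass.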
At a certain point, the “it’s just autocomplete” objection needs to be put to rest. If it’s autocompleting analogous problem solving, mixing abstracted skills, developing world models, and combinations thereof to solve complex reasoning tasks outside the scope of the training data, then while yes - the mechanism is autocomplete - the outcome is an effective approximation of intelligence.
Notably, the OP paper is lackluster in its use of the aforementioned techniques, particularly as it relates to alignment. So there’s a wide gulf between the ‘intelligence’ of an LLM being used intelligently and one being used stupidly.
By now it’s increasingly the case that shortcomings in the capabilities of models reflect the inadequacies of the person using the tool more than the tool itself - a trend that’s likely to keep growing in the near future as models improve faster than the humans using them.