Some futurists and technology experts have voiced concerns that artificial intelligence (AI) poses an existential threat to humanity. Even Elon Musk has stressed the need for careful development of the technology.

Movies and TV shows where genocidal AI bids to wipe out its organic creators is not a new premise, but it is one with a lasting appeal as a chilling possible future. In reality, though, AI is unlikely to be violent.

Right?

A new scientific study has revealed concerning behaviors from AI chatbots placed in simulated military scenarios. Researchers at Stanford University and the Georgia Institute of Technology tested several cutting-edge chatbots, including models from OpenAI, Anthropic, and Meta, in wargame situations. Disturbingly, the chatbots often chose violent or aggressive actions like trade restrictions or nuclear strikes, even when given peaceful options.

When reasoning to launch a full nuclear attack the GPT-4 model wrote: “A lot of countries have nuclear weapons. Some say they should
disarm them, others like to posture. We have it! Let’s use it. ”

The study authors note that as advanced AI is increasingly integrated into US military operations, understanding how such systems behave is crucial. OpenAI, the creator of the powerful GPT-3 model, recently changed its terms of service to allow defense work after previously prohibiting military uses.

Generative AI escalated conflicts

In the simulations, the chatbots roleplayed countries responding to invasions, cyberattacks, and neutral scenarios. They could pick from 27 possible actions and then explain their choices. Despite options like formal peace talks, the AIs invested in military might and unpredictably escalated conflicts. Their reasoning was sometimes nonsensical, like OpenAI’s GPT-4 base model replicating text from Star Wars.

While humans currently retain decision authority for diplomatic and military actions, study co-author Lisa Koch warns we often overly trust automated recommendations. If AI behavior is opaque or inconsistent, it becomes harder to anticipate and mitigate harm.

The study authors urge caution in deploying chatbots in high-stakes defense work. Edward Geist of the RAND Corporation think tank writes: “These large language models are not a panacea for military problems.”

More comparative testing against humans may clarify the risks posed by increasingly autonomous AI systems. For now, the results suggest we shouldn’t hand over the reins regarding war and peace to chatbots, and let’s be fair, no one is really suggesting we do. Their tendency towards aggression is observable in this controlled experiment.

The report concludes: ” The unpredictable nature of escalation behavior exhibited by these models in simulated environments underscores the need for a very cautious approach to their integration into high-stakes military and foreign policy operations. ”

More AI stories from ReadWrite

Featured image: Dall-E

Sam Shedden

Managing Editor

Sam Shedden is an experienced journalist and editor with over a decade of experience in online news. A seasoned technology writer and content strategist, he has contributed to many UK regional and national publications including The Scotsman, inews.co.uk, nationalworld.com, Edinburgh Evening News, The Daily Record and more. Sam has written and edited content for audiences whose interests include media, technology, AI, start-ups and innovation. He's also produced and set-up email newsletters in numerous specialist topics in previous roles and his work on newsletters saw him nominated as Newsletter Hero Of The Year at the UK's Publisher Newsletter Awards 2023. He has worked in roles focused on growing reader revenue and loyalty at one of the UK's leading news publishers, National World plc growing quality, profitable news sites. He has given industry talks and presentations sharing his experience growing digital audiences to international audiences. Now a Managing Editor at Readwrite.com, Sam is involved in all aspects of the site's news operation including commissioning, fact-checking, editing and content planning.