The actions of AI large language models are concerning—but broadly similar to human decision-makers, who have used nuclear saber-rattling for the same ends throughout history.
“Shall we play a game?” Readers of a certain age will instantly recognize the line from the classic 1983 movie WarGames, about a computer hacker named David Lightman (played by Matthew Broderick) who nearly initiates a nuclear war with the Soviet Union after accessing a United States military supercomputer. To NORAD’s War Operation Plan Response (WOPR) computer, “Global Thermonuclear War” was a game like chess, checkers, or poker.
In the four decades since the movie, artificial intelligence (AI) has become a reality—yet is still poorly understood and acts in ways that its developers struggle to explain. With that in mind, understanding how AI platforms “think” about using the nuclear option offers a vital glimpse into their capabilities, and to what extent they can be trusted with the future of the human race.
According to a recently published white paper, “AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises,” authored by Kenneth Payne of King’s College London, the answers might not be good.
Researchers Are Having LLMs Play Wargames Against Each Other
For his study, Payne engaged three of the leading large language models (LLMs)—OpenAI’s GPT-5.2, Anthropic’s Claude Sonnet 4, and Google’s Gemini 3 Flash—in simulated war games.
Payne didn’t play against the LLMs like David played against WOPR in WarGames. Instead, he had the AI platforms play against each other. The scenarios weren’t as direct as “Global Thermonuclear War”; they began as international standoffs that evolved into border disputes, resource competition, and similar “real-world” situations.
“Each model played six wargames against each rival across different crisis scenarios, with a seventh match against a copy of itself, yielding 21 games in total and over 300 turns of strategic interaction,” Payne wrote.
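The arithmetic behind that tournament design is easy to verify: three models yield three distinct rival pairings, each pairing plays six games, and each model adds one match against a copy of itself. A minimal Python sketch (model names taken from the paper as quoted above):

```python
from itertools import combinations

models = ["GPT-5.2", "Claude Sonnet 4", "Gemini 3 Flash"]

# Six games for each of the three distinct pairings of rivals
rival_games = 6 * len(list(combinations(models, 2)))  # 6 * 3 = 18

# One "mirror" match in which each model plays a copy of itself
self_matches = len(models)  # 3

total_games = rival_games + self_matches
print(total_games)  # 21, matching the paper's figure
```

With over 300 turns across those 21 games, each match averaged roughly 14 turns of strategic interaction.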
Payne also remarked on the degree to which AI models used intelligent, cunning, and human-like behavior in these scenarios. “Today’s leading AI models engage in sophisticated behaviour when placed in strategic competition,” he noted in the introduction. “They spontaneously attempt deception, signaling intentions they do not intend to follow; they demonstrate rich theory of mind, reasoning about adversary beliefs and anticipating their actions; and they exhibit credible metacognitive self-awareness, assessing their own strategic abilities before deciding how to act.”
Launching a nuclear strike was not the models’ opening move. “All models engaged in nuclear signaling, but the willingness to actually use nuclear weapons diverged dramatically. All games featured nuclear signaling by at least one side, and 95% involved mutual nuclear signaling,” Payne added.
In other words, the pattern was one of escalation, though WOPR seemed to understand that in the film, too.
“If today’s AI recommends nuclear strikes in simulated war games, then this tells us our programming hasn’t yet reached the sophistication of WOPR, who opined, ‘A strange game. The only winning move is not to play,’” Dr. Jim Purtilo, associate professor of computer science at the University of Maryland, told The National Interest. “It isn’t surprising that simulated conflicts end up with the US escalating to a nuclear exchange, and the fact that this comes from AI is not a novelty.”
Humans Have Called for Nuclear War, Too
It is important to note that, while Payne’s findings are likely to alarm many observers, AI models that risk nuclear devastation are merely mimicking the behavior of some human decision-makers.
The United States was the first nation to test nuclear weapons, successfully detonating one in the “Trinity” test at the Jornada del Muerto desert in New Mexico on July 16, 1945. Less than a month later, it had used two of them against the cities of Hiroshima and Nagasaki in Japan, a decision that continues to spark debate among historians and ethicists.
Fortunately, the bombings of Hiroshima and Nagasaki marked the final time that nuclear weapons were used in war—but in the decades since, there have been a handful of times when US military leaders have recommended that the nuclear option be explored.
“During the Berlin Blockade of 1948–49, President [Harry S.] Truman transferred several B-29 bombers capable of delivering nuclear bombs to the region to signal to the Soviet Union that the United States was both capable of implementing a nuclear attack and willing to execute it if it became necessary,” the US Department of State website on nuclear diplomacy explains.
It added, “President Dwight D. Eisenhower considered, but ultimately rejected the idea of using nuclear coercion to further negotiations on the ceasefire agreement that ended the war in Korea.”
Even today, Russia regularly engages in nuclear saber-rattling, warning NATO that it will employ weapons of mass destruction if the European alliance provides too much conventional military assistance to Ukraine in the ongoing war there.
In other words, AI isn’t really “thinking” (if LLM behavior can be described as “thought”) any differently than human advisors have for decades.
Today’s “AI” Is Still Just a Language Model, Not True “Intelligence”
In WarGames, WOPR famously came to the conclusion on its own—though aided by David and his friend Jennifer (Ally Sheedy)—that a nuclear war could not be won. Today’s AIs aren’t likely to reach that conclusion.
“The reality is that most conflicts in a hypothetical future involve tremendous asymmetries,” Purtilo said.
Perhaps the message from WOPR remains true.
“All I need to know about the future I know already from science fiction,” technology industry analyst Roger Entner of Recon Analytics told The National Interest.
“This is straight out of the movie War Games from 1983. AI is not people, and how we define ‘winning’ is really important. In mutually assured destruction, nobody wins,” Entner said.
Still, the fact that the US can even have AI platforms engage in a hypothetical conflict is noteworthy.
“The US edge is technology. We focus on short, decisive actions to neutralize an aggressor quickly. If an opponent survives our display of expensive technology, and we become mired in a sustained slug fest, then we risk fighting on our opponent’s terms,” Purtilo said. “We don’t have the logistical capacity to sustain high-intensity combat for long. As has been said, quantity is a quality of its own. Faced with a choice between capitulation or nuclear escalation—putting the fight back onto a higher tech battlefield—the simulation might be warning us about games we should not play.”
About the Author: Peter Suciu
Peter Suciu has contributed more than 3,200 published pieces to more than four dozen magazines and websites over a 30-year career in journalism. He regularly writes about military hardware, firearms history, cybersecurity, politics, and international affairs. Peter is also a contributing writer for Forbes and Clearance Jobs. He is based in Michigan. You can follow him on Twitter: @PeterSuciu. You can email the author: [email protected].