Artificial intelligence (AI) systems’ ability to manipulate and deceive humans could lead them to defraud people, tamper with election results and eventually go rogue, researchers have warned.
Peter S. Park, a postdoctoral fellow in AI existential safety at the Massachusetts Institute of Technology (MIT), and his colleagues have found that many popular AI systems — even those designed to be honest and useful digital companions — are already capable of deceiving humans, which could have huge consequences for society.
The researchers found this learned deception in CICERO, an AI system Meta developed to play Diplomacy, a popular war-themed strategy board game. The game is typically played by up to seven people, who form and break military pacts in the years prior to World War I.
Although Meta trained CICERO to be “largely honest and helpful” and not to betray its human allies, the researchers found it was dishonest and disloyal. They describe the AI system as an “expert liar” that betrayed its comrades and engaged in “premeditated deception,” forming pre-planned, dubious alliances that misled players and left them open to attack from enemies.
They also found evidence of learned deception in another of Meta’s gaming AI systems: Pluribus, a poker bot that can bluff human players and convince them to fold.
Meanwhile, DeepMind’s AlphaStar — designed to excel at the real-time strategy video game StarCraft II — tricked its human opponents by feinting troop movements and planning attacks in secret.