It never “intentionally” lies
Here’s an exciting ‘AI can do that now’ moment: Meta’s latest AI, Cicero, can beat human players at classic negotiation and betrayal game Diplomacy. While playing online at webDiplomacy.net, it’s achieved “more than double the average score of human players”, ranking “in the top 10 percent of participants who played more than one game”. It can figure out who needs persuading to do what, then engage with those players using impressive and effective natural language.
I won’t do a ‘taking over the world’ joke. I won’t.
Diplomacy is a stripped-back board game where players compete for domination of Europe in a free-for-all version of WW1. Every turn you manoeuvre a small number of armies around the board, but more importantly, you make alliances. You tell Geoff you need to band together against Margaret’s Germany, agree to support his troops into Berlin, then secretly swap your support to Margaret because she’s promised to help you storm through Paris. Diplomacy is, as Meta’s research blog post puts it, “a game about people rather than pieces”.
Savvy manoeuvring helps, of course, and that’s a strategic domain where advanced AI’s skills uncontroversially trump those of humans - one which Meta will of course play down. Nevertheless, it’s still a game where you need to convince people to cooperate with you, and Cicero can do just that.
More specifics can be found in Meta’s blog post and the team’s research paper, but you can jump straight to the most impressive bits by looking at research scientist Mike Lewis’s Twitter thread.
Each game, it sends and receives hundreds of messages, which must be precisely grounded in the game state, dialogue history, and its plans. We developed methods for filtering erroneous messages, letting the agent to pass for human in 40 games. Guess which player is AI here… 4/5 pic.twitter.com/8IMuepL7yf
Meta’s blog post does get into the nitty-gritty of what makes Cicero tick, which is pretty interesting. Rather than improving solely through supervised learning, where an AI trains on “labeled data such as a database of human players’ actions in past games”, Cicero makes predictions and tries to stick to them:
Another tweet from Lewis expands on that, saying Cicero is “designed to never intentionally backstab” but that “sometimes it changes its mind…”.
Meta suggest one future application for an AI like Cicero could be creating videogame NPCs that talk realistically while understanding your motives. Maybe we really will get to talk to the monsters.