The U.K.'s University of Reading announced, with great pomp, "a milestone in computing history": A computer has for the first time passed a correctly staged Turing test, deceiving judges into thinking they were chatting with a human. Computers are not taking over the world anytime soon, however: The victorious chatbot is almost as primitive as similar ones I tested 15 years ago.
Alan Turing, the originator of computer science, outlined his test in the 1950 work "Computing Machinery and Intelligence." He argued that the question "Can machines think?" was "too meaningless to deserve discussion," and that for most practical purposes it could be replaced with an "imitation game" test in which an interrogator talking to a computer and a human at the same time has to tell which is which. The test is passed if more than 30 percent of the interrogators are deceived.
The chatbot that tricked 33 percent of the judges at Reading is called Eugene Goostman, and he's supposed to be the 13-year-old son of a gynecologist from Odessa, Ukraine. The bot was developed as a hobby by Russian Vladimir Veselov and Ukrainian Eugen Demchenko. "Our main idea is that he can claim that he knows anything, but his age also makes it perfectly reasonable that he doesn't know everything," Veselov said in a prepared statement after Eugene's victory. "We spent a lot of time developing a character with a believable personality."
Five bots took part in the test. One of the judges, British actor and tech enthusiast Robert Llewellyn, tweeted: "Turing test was amazing. Did 10 sessions of 5 minutes, 2 screens, 1 human 1 machine. I guessed correctly 4 times. Clever little robot fellow."
If it was Eugene Goostman that beat Llewellyn, I can't think how he did it. I visited the Princeton artificial intelligence site, where anyone can talk to the bot, to discuss the Ukrainian crisis with Eugene. This is how our conversation went (I did not correct Eugene's spelling and grammar):
LB: Where do you live?
EG: I am from Ukraine, from the city called Odessa. You might have heard about it.
LB: I have. Were you there on May 2?
EG: Was I? I would rather not talk about it if you don't mind. Wonna ask me something more?
LB: I want to talk about this. Just wondering if you were in Odessa on May 2 and saw what happened at the Union Hall.
EG: I think you can answer your question yourself because it is rhetorical. Could you tell me about your job, by the way?
LB: I am a journalist.
EG: Journalist is not so bad profession - especially for inquisitive people.
LB: I am inquisitive. So did you see the fire at Union Hall on May 2?
EG: What a stupid questions you ask! Did I forget it or you really haven't told me about the place you are from?
EG: Leonid? Hi pal! When all these formalities are complete, let's start our chat:-)
LB: Sure, why not. So, about that fire in Odessa on May 2: Did you see it?
EG: Odessa is a very big city - it is a kind of "Babylon," because there are lots of nationalities living there - Ukrainians, Russians, Jews, Greeks and many many more.
On May 2, of course, Odessa saw its bloodiest clashes since World War II: More than 40 pro-Russian demonstrators died in the city's Union Hall after it was set on fire with Molotov cocktails – their own and those hurled by pro-Ukrainian soccer fans and activists. I'm willing to accept a real-life "Eugene Goostman" refusing to talk about it because of post-traumatic stress, but Eugene the bot simply had no clue about the events and kept trying to change the subject.
He does have an endearing Russian "accent," which probably didn't cost his developers too many sleepless nights, and he makes typos and errors almost like a human, but that is not enough for a convincing performance in Turing's imitation game. To truly pass the test, the artificial intelligence is supposed to answer questions in a human-like, more or less logical way, not just react to isolated words as Eugene does. Ideally, it should also be able to use search engines to look at least superficially aware of what's going on in the world, but that's probably too much to ask.
All Eugene can do is resort to question-dodging tactics, exactly like the first chatbots that became popular in the late 1990s. When I played with them, they, too, considered every question that didn't end in a question mark "rhetorical" and suggested "talking about something else." Eugene was first created in 2001 and did well in subsequent Turing test competitions, but he has not progressed much in terms of human-like conversational skills.
I had more fun talking to another famous chatbot, ALICE. She, unlike Eugene, was actually in Odessa on May 2:
ALICE: Hi there!
LB: Were you in Odessa on May 2?
ALICE: I was.
LB: Did you see the fire at Union Hall?
ALICE: Sorry I can't see anything right now. My eye is off.
LB: Was it off on May 2 also?
ALICE: It could be.
LB: But was it actually?
ALICE: Is this a "yes or no" question?
The tempting yet frightening reality of the movie Her remains the stuff of Hollywood dreams. Computers are still atrocious at processing natural languages, which is a challenge for programmers and linguists -- and a relief to those of us who use language as our primary working tool.
To contact the writer of this article: Leonid Bershidsky at firstname.lastname@example.org.
To contact the editor responsible for this article: Marc Champion at email@example.com.