Facebook Creates Intelligence Test for AI as an Alternative to the Turing Test
While the Turing Test was undeniably groundbreaking at the time of its conception, its problems have been well-documented. The test, which requires a machine to "pass" as a human, encourages skills in AI that have no other use, such as deception and the ability to "dumb itself down" to mimic human limitations. Because it rewards trickery rather than genuine understanding, it is an inadequate measure of intelligence. Other alternatives have been proposed, such as the Lovelace Test, and now Facebook has thrown its hat into the ring, creating an intelligence test for AI that could serve as a substitute for the Turing Test.
Facebook has a vested interest in artificial intelligence, which it already uses for tasks such as filtering users' news feeds. "People have a limited amount of time to spend on Facebook, so we have to curate that somehow," Yann LeCun, Facebook's director of AI research, told New Scientist. "For that you need to understand content and you need to understand people."
For the test, the AI has to answer 20 questions which become progressively more difficult and require more sophisticated thinking. Some of the easiest questions test simple reading comprehension, such as "Kate is in the living room. Tess is in the kitchen. Where is Kate?" More difficult examples include meta-analysis of the questions, such as counting the objects mentioned in a paragraph, or spatial reasoning, such as whether one object could fit inside another.
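The simplest of these questions can be answered by tracking the last stated location of each person in the story. As a rough illustration only (this is not Facebook's system, just a toy sketch of the underlying reading-comprehension task), a few lines of Python can handle the "Where is Kate?" pattern:

```python
def answer_where(story, question):
    """Answer 'Where is <Name>?' by remembering each person's
    most recently stated location in the story."""
    locations = {}
    for sentence in story:
        words = sentence.rstrip(".").split()
        # Match sentences of the form "<Name> is in the <place>."
        if len(words) >= 5 and words[1:4] == ["is", "in", "the"]:
            locations[words[0]] = " ".join(words[4:])
    # The question names the person as its last word.
    name = question.rstrip("?").split()[-1]
    return locations.get(name)

story = ["Kate is in the living room.", "Tess is in the kitchen."]
print(answer_where(story, "Where is Kate?"))  # living room
```

The harder tasks in the test (counting objects, spatial reasoning) defeat this kind of pattern-matching, which is precisely the point: each task level demands a more general capability.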
"We wanted tasks that any human who can read can answer," said Facebook's Jason Weston, who led the study.
The range of questions is key for testing intelligence, according to the researchers, because it exposes the weaknesses of an AI that has been well-trained, so to speak, in one particular type of reasoning. The test is also distinctive because it doesn't use pre-scripted questions, but rather draws them from a basic simulation of a world in which characters walk around and pick up objects. This way, questions are never repeated, and AIs can be tested more than once without the risk that they will learn the answers.
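The generation idea can be sketched in miniature. The following is a hypothetical toy version, not Facebook's actual generator: a handful of characters move between rooms and pick up objects, and each run emits a fresh story with a question whose answer is read off the simulation's final state.

```python
import random

def generate_task(seed=None):
    """Simulate a tiny world, then produce (story, question, answer)."""
    rng = random.Random(seed)
    people = ["Kate", "Tess", "John"]
    rooms = ["kitchen", "garden", "office"]
    objects = ["ball", "apple"]
    where = {}    # person -> current room
    story = []
    for _ in range(5):
        person = rng.choice(people)
        action = rng.choice(["move", "grab"])
        # A character must enter a room before doing anything else.
        if action == "move" or person not in where:
            room = rng.choice(rooms)
            where[person] = room
            story.append(f"{person} went to the {room}.")
        else:
            obj = rng.choice(objects)
            story.append(f"{person} picked up the {obj}.")
    # Ask about someone the story has placed somewhere.
    person = rng.choice(sorted(where))
    return story, f"Where is {person}?", where[person]

story, question, answer = generate_task()
print("\n".join(story), question, sep="\n")
```

Because the answer is derived from the simulated state rather than hand-written, the test can produce an effectively unlimited supply of unseen questions.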
Ultimately, the researchers' goal is not only to test for intelligence, but to develop methods for creating a personal assistant that can truly converse with a human, rather than regurgitating preprogrammed answers as Siri does. "Our tasks measure understanding in several ways: whether a system is able to answer questions via chaining facts, simple induction, deduction and many more," the researchers wrote in their paper. "The tasks are designed to be prerequisites for any system that aims to be capable of conversing with a human."
"We believe many existing learning systems can currently not solve them, and hence our aim is to classify these tasks into skill sets, so that researchers can identify (and then rectify) the failings of their systems," they wrote. And indeed, even though the questions are relatively easy, all of the algorithms they used to test this evaluation failed to answer all the questions correctly. The software that came closest was an artificial neural network with an external memory drive, similar to Google's DeepMind.
The fact that the questions are still so primitive is an indicator that we don't need to be afraid of the singularity just yet, and LeCun insists that the existence of this test shouldn't make anyone too nervous at this juncture. "All machines are still very dumb and we are still very much in control. It's not like some company is going to come out with the solution to AI all of a sudden and we're going to have super-intelligent machines running around the internet."