Turing Test Alternative Lovelace 2.0 Tests AI for Original Thought
The Turing test may soon be a thing of the past: the Lovelace 2.0 test, which will be officially proposed in January, may serve as the new barometer for truly "intelligent" artificial intelligence.
When the chatbot Eugene Goostman passed the Turing test for the first time in June, it should have been a triumphant day in the field of artificial intelligence. But according to many researchers, the fact that Eugene was a relatively simple chatbot only revealed the deficiencies of the Turing test as a standard for artificial intelligence. The famous test, devised by mathematician and computing pioneer Alan Turing, asks whether a machine can pass as a human in conversation for a given amount of time. Turing intended it only as a guideline rather than a definitive test for artificial intelligence, as it is based primarily on deception and, as Eugene demonstrated, can easily be manipulated by trickery.
Computers With 'Original Thoughts'
This led several researchers to create an alternative to the Turing test, called the Lovelace test (named for Ada Lovelace, often regarded as the first computer programmer). Where the Turing test only requires a machine to pass as a human through trickery, the Lovelace test aims to determine the defining characteristics of human thought in order to judge whether an artificial intelligence shares these characteristics. A machine can only pass the Lovelace test if it creates a program that it was not designed to create, or in other words, if it has an original thought. The researchers formalized the concept of originality, or "surprise," as a machine producing an output that was not the result of a mechanical error or fluke (and can therefore be repeated) and whose creation the designer cannot explain.
Lovelace 2.0 - Eliminating the Element of Surprise
But Georgia Tech professor Mark Riedl asserts that this test is inadequate because it does not provide any specific criteria for the output the machine needs to create. His Lovelace 2.0 test is more formalized, setting out the following requirements:
1) The AI must create an artifact of a certain type (Riedl cites paintings, poetry, architectural designs, and stories as examples)
2) The artifact must conform to a certain set of constraints put forth by a human evaluator
3) The human evaluator must decide that the output is a valid instance of that type of artifact
4) A human referee must determine whether the type of artifact, combined with the proposed constraints, is realistic for a human to create
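For readers who think in code, the four requirements above can be sketched as a simple pass/fail check. This is only an illustrative sketch, not anything taken from the paper: the names `Challenge` and `passes_lovelace_2` are hypothetical, and the human evaluator and referee are modeled as plain functions. The toy "judges" at the end are crude keyword stand-ins for what would really be human judgment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Challenge:
    """A hypothetical container for one round of the test."""
    artifact_type: str        # e.g. "story", "poem", "painting"
    constraints: List[str]    # constraints chosen by the human evaluator


def passes_lovelace_2(
    challenge: Challenge,
    artifact: str,
    evaluator_accepts_type: Callable[[str, str], bool],
    evaluator_accepts_constraint: Callable[[str, str], bool],
    referee_finds_realistic: Callable[[Challenge], bool],
) -> bool:
    """Return True only if all of the requirements listed above hold.

    The three callables stand in for human judgments (an assumption made
    purely so the sketch can run).
    """
    # Requirement 4: the referee rules out challenges no human could meet.
    if not referee_finds_realistic(challenge):
        return False
    # Requirement 3: the evaluator accepts the output as the requested type.
    if not evaluator_accepts_type(artifact, challenge.artifact_type):
        return False
    # Requirement 2: the artifact must satisfy every constraint.
    return all(
        evaluator_accepts_constraint(artifact, c) for c in challenge.constraints
    )


# Toy stand-ins for the human judges (hypothetical, for illustration only).
def toy_type_judge(artifact: str, artifact_type: str) -> bool:
    return len(artifact.split()) >= 5   # "long enough to count as a story"


def toy_constraint_judge(artifact: str, constraint: str) -> bool:
    return constraint.split()[-1] in artifact.lower()   # keyword match


def toy_referee(challenge: Challenge) -> bool:
    return len(challenge.constraints) <= 5   # "not unreasonably demanding"


challenge = Challenge("story", ["mentions a cat", "mentions the moon"])
good = passes_lovelace_2(challenge, "A cat stared at the moon all night.",
                         toy_type_judge, toy_constraint_judge, toy_referee)
bad = passes_lovelace_2(challenge, "A dog barked at the moon all night.",
                        toy_type_judge, toy_constraint_judge, toy_referee)
```

In this toy run the first story satisfies both constraints and the second misses one. The point of the sketch is that the verdict is a conjunction of yes/no checks against stated constraints, with no score for how good or surprising the artifact is.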
Riedl contends that his test avoids the subjective criterion of "surprise" and allows for more objectivity than the original Lovelace test, as the human evaluator judges only whether the AI kept to the set constraints, rather than making any kind of value judgment. It seems doubtful that any human evaluator can avoid subjectivity completely, but Riedl's test does have more specific parameters, which lend themselves to comparative objectivity.
From the paper: "Creativity is not unique to human intelligence, but it is one of the hallmarks of human intelligence. Many forms of creativity necessitate intelligence. In the spirit of The Imitation Game, the Lovelace 2.0 Test asks that artificial agents comprehend instruction and create at the amateur levels."