Top-Performing AI Still Can't Pass an 8th Grade Science Test

Tuesday, 16 February 2016 - 2:06PM
We may not have to worry about a hostile Skynet takeover just yet. According to the results of a recent contest held by the Allen Institute for Artificial Intelligence, the most advanced AI out there right now still can't pass a test designed for a 13-year-old.

The contest invited over 800 teams of AI researchers to submit their most sophisticated program with the goal of passing an eighth grade science test. The results were released today, and the top-performing programs were only able to answer approximately 60% of the questions correctly. In other words, everyone failed (or at best, received a soft D minus, depending on your middle school's policy). 

AI is undoubtedly becoming more and more advanced, and has become especially proficient in specific tasks like chess, with Google researchers even achieving the "holy grail" of teaching a program to play the game Go better than a human champion. But according to the researchers who ran the contest, the task of passing a relatively basic science test exposes the weaknesses that are still inherent to AI's "thinking," especially when it comes to language processing.
"Natural language processing, reasoning, picking up a science textbook and understanding--this presents a host of more difficult challenges," Oren Etzioni, a professor of computer science at the University of Washington and the executive director of the Allen Institute, told Wired. "To get these questions right requires a lot more reasoning."

As it stands now, we are still a long way from creating an AI that can complete cognitive tasks comparable to those a human can, and an even longer way from sentient machines that could someday decide to take over the world. That being said, this contest may not be 100% accurate in reflecting the state of AI research, especially since heavy hitters like Google, Facebook, and IBM did not participate.

"It's entirely possible that the scores would have gone higher had companies like Google and others put their 'big guns' to work," said Etzioni. "[But] the 'wisdom of the crowds' is quite powerful and there are some very talented folks engaged in these contests."

And even if one of the programs were able to pass the test, that wouldn't necessarily mean that it is particularly intelligent. Similar to the bot that "passed the Turing test," it might just be tailored to pass that specific test without coming close to humanlike intelligence overall. Increasingly, it seems that no single test will be sufficient to measure a "true" artificial intelligence; rather, a true AI would be able to pass a whole battery of tests by virtue of its intelligence.

"If you're talking about passing multiple choice science tests, I always felt that was not actually the test AI should be aiming to pass," said AI researcher Doug Lenat. "The focus on natural language understanding--science tests, and so on--is something that should follow from a program being actually intelligent. Otherwise, you end up hitting the target but producing the veneer of understanding."