We’re always hearing about how artificial intelligence (AI) and automated technologies are going to be taking our jobs in the near future, and now researchers in the US have developed an AI program capable of answering 11th-grade, SAT-level geometry questions. And it’s not a theoretical claim either – they actually gave the software real questions sourced from the SAT (including questions that the program had not seen before) to see how it fared.
So how did it go? Well, pretty amazingly, considering this has never been done before – but we can also feel smugly confident that AI won’t be replacing human mathematicians any time soon. On practice questions, the system scored a considerable 61 percent, but let itself down when it came to the crunch, managing just 49 percent (ouch, so close!) for official SAT questions.
The findings of the research, presented this week at the Conference on Empirical Methods in Natural Language Processing in Lisbon, show how far AI systems have come, but also indicate that there’s still fair room for improvement. (We imagine the report card for the AI would read something along the lines of the classic: “AI shows promise, but needs to apply itself to get better grades.”)
The kinds of questions the software was tested with are available to view online. If you’ve been taught this level of mathematics before, it’s not particularly difficult stuff (for a human, that is), but the real challenge for the researchers is teaching software to correctly recognise all the visual information on the page in order to understand what it’s being asked to do.
As one of the researchers, Ali Farhadi of the University of Washington, told John Markoff from The New York Times, visual markings that are easy for even children to understand – for example, an arrow drawn in a test diagram – are not yet something that the most advanced AI can always correctly identify in context.
“A lot of my colleagues have said machine vision is a solved problem,” said Farhadi. “My answer is, ‘Call me when you’ve solved this.’”
But while AI can’t always comprehend questions with accuracy, the fact that it’s getting pass marks or close to pass marks in SAT-level mathematics is in itself a pretty stunning achievement. Perhaps not all the time – but nonetheless a lot of the time – the software can combine enough understanding of diagrams, arrows, numbers, shapes and written sentences on a page to correctly identify and answer the geometry questions posed to it.
Next step: Philosophy 101!