Google researchers published a paper that at once fills my heart with hopeful joy and eternal sadness. Joy because people care and are investing resources into developing intelligent systems. Sadness because of how poorly such systems perform 40+ years after the release of Kubrick’s 2001: A Space Odyssey.
The authors of the paper put together a large neural network that ran on a thousand 16-core machines for three days “learning” from a dataset of 10 million 200×200 pixel images.
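To put those numbers in perspective, here is a rough back-of-the-envelope sketch using only the figures quoted above. It assumes every core was busy for the full three days, so the core-hour figure is an upper bound, and it counts one value per pixel, ignoring color channels. The variable names are mine, not the paper's.

```python
# Figures as quoted in this post
machines = 1_000
cores_per_machine = 16
training_days = 3
images = 10_000_000
image_side_px = 200

# Total compute, assuming all cores ran for the whole period (upper bound)
total_cores = machines * cores_per_machine
core_hours = total_cores * training_days * 24

# Total input size, counting one value per pixel (color channels ignored)
pixels_per_image = image_side_px ** 2
total_pixels = images * pixels_per_image

print(f"{total_cores:,} cores")
print(f"{core_hours:,} core-hours")
print(f"{total_pixels:,} pixels")
```

That works out to 16,000 cores, roughly 1.15 million core-hours, and 400 billion pixels of input.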
The task is to train a face detector without ground truth, that is, without labels saying whether each image contains a face. This task is absurdly difficult; I would even say just plain absurd. It’s like trying to teach a child algebra by giving him addition problems without ever telling him how to do addition or what the right answer is. It’s a fascinating and brave question to ask, precisely because it is so counter-intuitive.
Not surprisingly, the “breakthrough” that the paper touts is 15.8% accuracy in classifying objects into one of 20,000 categories. This is apparently a good improvement over the previous state of the art. My question is, in what universe is 15.8% deserving of a New York Times article? Granted, it does exceed the approval rating of Congress, but that’s about it.
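For reference, here is a small sketch of how 15.8% compares with blind guessing. The baseline is my assumption, not a number from the paper: it supposes a classifier that picks one of the 20,000 categories uniformly at random.

```python
# Figures as quoted in this post
categories = 20_000
reported_accuracy = 0.158

# Assumed baseline: uniform random guessing over all categories
chance_accuracy = 1 / categories

# How many times better than chance the reported accuracy is
ratio = reported_accuracy / chance_accuracy

print(f"chance accuracy: {chance_accuracy:.3%}")
print(f"reported vs. chance: about {ratio:,.0f}x")
```

Under that assumption, chance sits at 0.005%, so the reported figure is a few thousand times better than guessing, even though it is still wrong about five times out of six.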
I don’t mean to be so dismissive. This is an excellent paper that scratches the surface of an immense mystery: the gap between the most powerful supercomputer and the most primitive human brain. What’s even more exciting is that Google is funding this research and, even more importantly, putting its immense computational resources behind it.