Tuesday, April 6, 2010

Getting closer to AI vision

I've been admiring the book "Eye, Brain, and Vision" by David H. Hubel (Scientific American Library series no. 22, 1988) for a while now. The illustrations are very good, and the discussion of what was known and unknown at the time is well written and complete. The fundamental understanding of vision has not changed much since then.

[1/14/2011 This book, with supplementary material, is available online: David Hubel's "Eye, Brain, and Vision"]

Hubel (with Wiesel) has deeply influenced neuroscience, particularly visual and cognitive neuroscience, for half a century now. This has coincided with the development of electronic computers, and the interaction of ideas has been fruitful.

I had just been admiring a video of a general-purpose robot whose vision system guides a towel-folding program when this passage in the "Present and Future" chapter (ch. 10, page 220) caught my eye:
    This is where we are, in 1987, in the step-by-step analysis of the visual path. In terms of numbers of synapses (perhaps eight or ten) and complexity of transformations, it may seem a long way from the rods and cones in the retina to areas MT or visual area 2 in the cortex, but it is surely a far longer way from such processes as orientation tuning, end-stopping, disparity tuning, or color opponency to the recognition of any of the shapes that we perceive in our everyday life. We are far from understanding the perception of objects, even such comparatively simple ones as a circle, a triangle, or the letter A--indeed, we are far from even being able to come up with plausible hypotheses.
    We should not be particularly surprised or disconcerted over our relative ignorance in the face of such mysteries. Those who work in the field of artificial intelligence (AI) cannot design a machine that begins to rival the brain at carrying out such special tasks as processing the written word, driving a car along a road, or distinguishing faces. They have, however, shown that the theoretical difficulties in accomplishing any of these tasks are formidable. It is not that the difficulties cannot be solved--the brain clearly has solved them--but rather that the methods the brain applies cannot be simple: in the lingo of AI, the problems are "nontrivial". So the brain solves nontrivial problems.

The understanding of lower (neuronal) level sensory processing in animals has served as inspiration for many image processing techniques and algorithms, including scale space techniques and feature detection/description tools like SIFT.
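To make the scale space idea a little more concrete, here is a minimal sketch in plain Python. It blurs a 1D signal with Gaussians of increasing width and subtracts adjacent levels, giving a difference-of-Gaussians (DoG) response. DoG is a common model of center-surround receptive fields, and SIFT uses it to locate keypoints. This is only an illustration, not SIFT itself: the function names, the 3-sigma kernel truncation, the border handling, and the sigma values are all my own assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at about 3 sigma (an assumption)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Convolve the signal with a Gaussian, repeating the end samples at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the valid range
            acc += w * signal[j]
        out.append(acc)
    return out

def difference_of_gaussians(signal, sigmas):
    """Blur at each sigma, then subtract each coarser level from the next finer one."""
    blurred = [blur(signal, s) for s in sigmas]
    return [[fine - coarse for fine, coarse in zip(a, b)]
            for a, b in zip(blurred, blurred[1:])]

# A step edge: dark on the left, bright on the right.
step_edge = [0.0] * 20 + [1.0] * 20
dog = difference_of_gaussians(step_edge, [1.0, 2.0, 4.0])
```

On a uniform signal the response is essentially zero. On the step edge it is negative on the dark side and positive on the bright side, so the filter responds to local contrast rather than to absolute brightness, much like a center-surround cell.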
