In the ’70s, we thought it would be easy to create machines that could see. We were wrong. But today, we’re on the cusp of something exciting.
If you can define the vision problem precisely, odds are we can build a machine that rivals or exceeds human ability. We can build machines that are better than people at recognizing faces. We’re wired to recognize a few hundred or a few thousand faces, but security software can scan for one in a million. It’s not just for security anymore. We can do this across the web, and recently, in our own photo sets.
In swimming pools, lifeguards aren’t always vigilant, but increasingly, computer vision systems are.
We’re getting better at taking large collections of photographs and recreating full 3D (or 4D) scenes. Photo tourism is already changing the way we review large collections of photos in popular areas.
We still suck at building vision software that can perform general object recognition as well as humans. But some groups are working on that. I don’t think it will take long before these systems rival human ability for any visual task that you can perform in under a second.
The most exciting thing is that the game doesn’t stop when we match human ability across a broad spectrum of tasks. Instead, it gets more interesting. Today, we can’t see through walls, and we can’t recognize everyone in a crowd. We can’t jump three hundred feet in the air to get a bird’s-eye view. We can’t recognize every species of plant and animal. We can’t read text in more than a handful of languages. We can’t see beyond the human visual spectrum. You get the idea. It’s the dawn of an exciting time.