Researchers have designed a single-photon time-of-flight LiDAR system that can acquire a high-resolution 3D image of an object or scene up to 1 kilometer away. The new system could help enhance security, monitoring, and remote sensing by enabling detailed imaging even in challenging environmental conditions or when objects are obscured by foliage or camouflage netting.
If humans can do it, why not cameras?
Can humans actually do it, though? Are humans actually capable of driving a car reasonably well using only visual data, or are we actually using an entire suite of sensors in our heads and bodies to understand our speed and orientation, road conditions, and our surroundings? Driving a car by video link is considerably harder than just driving a car normally, from within a car.
And even so, computers have a long way to go before they catch up with our visual processing. Our visual cortex does a lot of error correction of visual data, using proprioceptive sensors in our heads that silently and seamlessly delete the visual smudges and smears of motion as our heads move. The error correction adjusts quickly to recalibrate things when looking at stuff under water or anything with a different refractive index, or when looking at reflections in a mirror.
And we maintain that flow of visual data by correcting for motion and stabilizing the movement of our eyes to compensate for external motion. Maybe not as well as chickens, but we’re pretty good at it. We recognize faulty sensor data and correct for it by moving our heads around obstructions, by silently ignoring something that is just blocking one eye, or by blinking or rubbing our eyes when tears or water make it hard to focus. We also know when not to trust our eyes (in the dark, in fog, when temporarily blinded by lights), and fall back to other methods of understanding the world around us.
Throw in the sense of balance from our inner ears, our ability to locate the direction of sounds, our ability to process vibrations in our seat and tactile feedback from the steering wheel, and the proprioception of feeling forces on our body or specific limbs, and we have an entire system that uses much more than visual data to make decisions and model the world around us.
There’s no reason why an artificial system needs to use exactly the same type of sensors as humans or other mammals do. And we have preexisting models and memories of what is or was around us, like when we walk around our own homes in the dark. But my point is that we rely on much more than our eyes, processed through an image processing system far more complex than the current state of AI vision. Why hold back on using as much sensor data as possible, to build a system that has good, reliable sensor data of what is on the road?
I think I’m following you. So if we added LiDAR, thermal sensors, and a couple of chickens to the car, we’d be able to drive the vehicle ourselves, optimally.
Reason 1: humans can blink – a dirty camera cannot.