Demystifying the richness of visual perception
Ruth Rosenholtz (MIT)

back to Overview

Date: Wednesday, 20.10.21 15:20-17:00 CET

Signup: If you would like to attend the talks please register here to get a Zoom link.


Human vision is full of puzzles. Observers can grasp the essence of a scene in an instant, yet when probed for details they are at a loss. People have trouble finding their keys, yet they may be quite visible once found. How does one explain this combination of marvelous successes with quirky failures? I will describe our attempts to develop a unifying theory that brings a satisfying order to multiple phenomena.

One key is to understand peripheral vision. A visual system cannot process everything with full fidelity, and therefore must lose some information. Peripheral vision must condense a mass of information into a succinct representation that nonetheless carries the information needed for vision at a glance. We have proposed that the visual system deals with limited capacity in part by representing its input in terms of a rich set of local image statistics, where the local regions grow — and the representation becomes less precise — with distance from fixation. This scheme trades off computation of sophisticated image features at the expense of spatial localization of those features.

What are the implications of such an encoding scheme? Critical to our understanding has been the use of methodologies for visualizing the equivalence classes of the model. These visualizations allow one to quickly see that many of the puzzles of human vision may arise from a single encoding mechanism. They have suggested new experiments and predicted unexpected phenomena. Furthermore, visualization of the equivalence classes has facilitated the generation of testable model predictions, allowing us to study the effects of this relatively low-level encoding on a wide range of higher-level tasks.

Peripheral vision helps explain many of the puzzles of vision, but some remain. By examining the phenomena that cannot be explained by peripheral vision, we gain insight into the nature of additional capacity limits in vision. In particular, I will suggest that decision processes face general-purpose limits on the complexity of the tasks they can perform at a given time.