Eyes rotate as minds spin: Inferring the capacity of memory during complex visual search


Melissa M. Kibbe of Rutgers University, supported by the National Science Foundation’s Interdisciplinary Graduate Education and Research Training (IGERT) program, has made novel discoveries showing that people know how to balance their reliance on memory against motor exploration during visual search. Kibbe’s work addresses a long-standing problem in human behavior and neuroscience: how do human beings cope with the unavoidable limits of memory and perception? There is, after all, only so much we can remember at once from any visual scene before us. We can choose to shift our gaze to re-examine a forgotten detail, but every shift of gaze takes planning and takes time. Choices about whether to shift gaze or to rely on memory are made as often as one to three times per second, with barely any awareness or conscious effort. Yet these choices underlie our ability to read, search the environment, drive, or carry out any number of fast-paced visual-motor tasks.

Models of how such choices are made can shed light on how the brain supports crucial cognitive functions in everyday life. Kibbe used her interdisciplinary training in perceptual science and computational modeling to design a novel visual search task in which the cognitive and motor difficulty could be manipulated systematically, and memory use could be inferred by examining people’s eye movements. She then applied a probabilistic computational model of memory to estimate the number of objects held in memory as search was carried out.

Kibbe, working with Rutgers Professor Eileen Kowler, devised a difficult search task that required people to look through arrays of nine multi-featured objects to find a set of three belonging to the same category (a sample display is shown on the left of Figure 1). The rules defining the category varied in difficulty, ranging from simple (“find 3 objects that share the same feature, such as color or shape”) to complex (“find 3 objects that share one feature but differ on two other features”). To increase reliance on memory, Kibbe linked the appearance of the displays to an eye movement recording device so that the contents of a given location appeared only while the observer looked directly at it (see the right side of Figure 1). Kibbe then used the eye movement patterns to infer how memory was used during the search task.
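The category rules can be thought of as Boolean predicates over object features. A minimal sketch, assuming a dictionary representation of objects and an illustrative feature set (the feature names, object encoding, and function names here are assumptions for the example, not the experiment’s actual stimulus code):

```python
from itertools import combinations

FEATURES = ("color", "shape", "fill")  # assumed feature set for illustration

def simple_rule(objs):
    """Simple rule: all objects share the same value on some feature."""
    return any(len({o[f] for o in objs}) == 1 for f in FEATURES)

def complex_rule(objs):
    """Complex rule: objects share one feature but differ on the others."""
    return any(
        len({o[f] for o in objs}) == 1
        and all(len({o[g] for o in objs}) == len(objs)
                for g in FEATURES if g != f)
        for f in FEATURES
    )

def find_triple(display, rule):
    """Exhaustively check every set of 3 objects against a rule."""
    for triple in combinations(display, 3):
        if rule(triple):
            return triple
    return None
```

An exhaustive check like `find_triple` is what a searcher with unlimited memory could do; the experiment asks how people approximate such a search when memory is limited.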

She reasoned that as the complexity of the category rule increased, the ability to remember the contents of a previously viewed location would diminish, requiring observers to examine more locations before finding the set of three that satisfied the category rule. Figure 2 shows that this expectation was confirmed: the harder the rule, the more locations were searched. The observers’ decisions, however, were also influenced by how long it took to reveal the contents of a location. When a brief (< 1 second) delay was imposed between the arrival of the line of sight at a location and the appearance of its contents, the number of locations searched fell by about a factor of two, showing that people were sensitive to the increased cost of each eye movement and decided to rely more on memory and less on moving the eye.
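The gaze-contingent reveal, including the imposed delay, can be sketched as a simple polling loop. This is an illustration only, with assumed names throughout (the `get_gaze` callback, the polling rate, and the logging are inventions for the sketch; the actual experiment used dedicated eye-tracking hardware and software):

```python
import time

def run_gaze_contingent_trial(get_gaze, contents, reveal_delay=0.0,
                              duration=5.0):
    """Reveal a location's contents only while it is fixated, and only
    after the gaze has dwelt there for `reveal_delay` seconds.
    `get_gaze()` returns the index of the fixated location, or None."""
    current = None        # location currently fixated
    fixated_since = None  # time the gaze arrived there
    visible = None        # contents currently shown (None = blank)
    reveals = []          # log of which locations were revealed
    start = time.monotonic()
    while time.monotonic() - start < duration:
        loc = get_gaze()
        if loc != current:                       # gaze moved to a new spot
            current, fixated_since, visible = loc, time.monotonic(), None
        if current is not None and visible is None:
            if time.monotonic() - fixated_since >= reveal_delay:
                visible = contents[current]      # show the contents
                reveals.append(current)
        time.sleep(0.001)                        # ~1 kHz polling
    return reveals
```

Raising `reveal_delay` implements the cost manipulation described above: the longer the gaze must dwell before anything appears, the more expensive each eye movement becomes relative to remembering.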

In the final stage of the project, Kibbe and Kowler applied a computational model to the eye movement patterns to estimate the momentary capacity of memory for objects in the display. The analysis was based on the number of locations visited before a given location was revisited (Figure 3). A finite-state Markov model, developed by J. Epelboim and P. Suppes in 2001 to analyze eye movements during geometry problem solving, was fit to the pattern of revisits and used to infer the capacity of short-term memory during search. The model assumes that objects are added to short-term memory until memory is full, at which point one object must be bumped from memory to make room for a new one.
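The bump-from-memory assumption can be sketched as a toy simulation. This is only an illustration of the qualitative idea, not the Epelboim–Suppes model itself; the FIFO bumping rule and the random choice among forgotten locations are assumptions made for the example:

```python
import random
from collections import deque

def simulate_revisits(n_locations=9, capacity=4, n_fixations=200, seed=1):
    """Simulate a searcher who only fixates locations whose contents are
    no longer in memory. Memory holds at most `capacity` items; when it
    is full, the oldest item is bumped (deque with maxlen). Returns the
    gaps, in fixations, between successive visits to the same location."""
    rng = random.Random(seed)
    memory = deque(maxlen=capacity)
    last_visit = {}
    gaps = []
    for t in range(n_fixations):
        # candidate locations: those whose contents have been forgotten
        forgotten = [loc for loc in range(n_locations) if loc not in memory]
        loc = rng.choice(forgotten)
        if loc in last_visit:
            gaps.append(t - last_visit[loc])
        last_visit[loc] = t
        memory.append(loc)   # may bump the oldest remembered object
    return gaps
```

With capacity k, a location cannot be revisited for at least k + 1 fixations, so the distribution of revisit gaps carries information about k; fitting that distribution is, in spirit, how a capacity estimate can be recovered from eye movement records.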

The model fit the search data best when short-term memory was assumed to hold about 4 or 5 objects. This estimate is remarkably similar to estimates obtained for problem solving and other cognitive tasks. Estimated memory capacity was smaller when delays were imposed before the contents of locations were revealed (Fig. 2, bottom), reflecting an adaptive response to the greater risk of memory decay over the duration of the delay. Memory capacity was also sensitive to the complexity of the category rule. When rules required searching for objects that differed along a dimension, the estimated memory capacity increased by about one object. The increase is consistent with a strategy of remembering fewer features from each object, suggesting that abstract rules were translated into cognitively tractable concrete examples: for instance, “find a red, a green, and a blue square” rather than “find 3 objects that differ on a feature”.

These results show that people exist within a cognitive whirlwind, constantly choosing where to look, what to remember, what to attend to, and what to selectively forget, all in the interest of solving perceptual and cognitive problems in the least amount of time. The rules governing these decisions must be so ingrained in our cognitive machinery that they are carried out quickly and efficiently, with only scant awareness or overt conscious control. This is what makes us marvel at the success of the human cognitive, perceptual and motor systems in making the best use of limited capacities. When the intelligent machines we build in an attempt to mimic or surpass human performance eventually reach limits in their capacity to remember information or to solve problems, we send our engineers to build new machines with more capabilities. Humans do not have that option. We must be at our most clever in using the capacities we have. Study of cognitive functions and motor behavior during visual search provides a window into how we manage to get the job done.

Address Goals

The highlighted research makes great strides in meeting the strategic goals of the National Science Foundation. It presents a novel approach to the study of human cognition, integrating behavioral methods and computational modeling to study how cognitive systems such as memory and perception work together efficiently to perform everyday tasks. This approach allows us to study cognitive functioning in a more naturalistic way while still maintaining the tight controls of laboratory research.

The highlighted research represents the cutting edge in experimental technique. We integrated advanced eye-tracking technology with a Boolean approach to quantifying the complexity of the task itself, allowing us to systematically push the limits of human cognition. We then followed up the experimental approach with a probabilistic modeling technique that allowed us to estimate the amount of memory our subjects were using while performing the task. This integrative method allows us to approach the problem of human cognition from multiple angles and study it thoroughly without compromising ecological validity.

This innovation, made possible by the integrative training provided by the Rutgers Perceptual Science Training Program funded by an NSF IGERT grant, strengthens the position of the United States as a leader in research science. Traditionally, only one of the approaches we used in our highlighted research would be employed at a time, each acting as a separate peephole into cognitive functioning. Integrating the methods into a single task creates a window through which we can view the human brain more clearly. In addition to giving us a much better understanding of how the brain functions, we can apply this knowledge to other areas. Engineers of Artificial Intelligence (AI) systems look to human cognition as a model. Our research has found that humans with limitations on their cognitive systems can overcome these limitations by using what they have efficiently. Studying how the human brain trades off limited resources across systems can give insight into how to improve AI. The research can also be applied to children or special populations in order to understand function in individuals whose limits are greater, and to develop improved tools for education and training.