Event perception

From Scholarpedia
Jeffrey M. Zacks (2008), Scholarpedia, 3(10):3837. doi:10.4249/scholarpedia.3837 revision #72866

The inputs to hearing, vision, and the other senses are continuous, dynamic, and comprise huge amounts of data each second. However, human conscious experience seems to resolve into a manageable number of more-or-less discrete entities, most prominently objects and events. Just as people perceive the world as being made up of objects such as “chairs,” “airplanes,” and “dogs,” people perceive the world as made up of events such as “buying a car” or “cutting a cake” (Barker, 1963). For both objects and events, perception includes segmenting entities from their surroundings, recognizing them as individuals or instances of a class, and identifying the features that characterize them. Event perception is the set of cognitive mechanisms by which observers pick out meaningful spatiotemporal wholes from the stream of experience, recognize them, and identify their characteristics.


Types of Events

The term event perception encompasses a range of phenomena involving the processing of temporally extended, dynamic information. In psychology, attention has focused on events that are relatively brief, on the timescale of seconds to minutes, and that are perceived rather than, say, read about (Zacks & Tversky, 2001). Several types of perceptual events have been studied, in part reflecting differences in theoretical orientation amongst groups of researchers.

One class of such events corresponds to physical interactions between objects. Michotte (1946) studied simple interactions in which one object approaches another object, which then itself begins moving or changes its direction of motion. Michotte characterized a range of conditions in which a sequence gives rise to the impression that one object has launched the other into motion. Critical variables include the objects’ proximity when the second object begins moving, the timing of the motion change, and the relative velocities of the objects’ motions.
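The variables Michotte manipulated can be made concrete with a small stimulus generator. The following Python sketch produces per-frame positions for two objects in a launch-style display, exposing the spatial gap, the delay before the second object moves, and the ratio of the two velocities as parameters. The function name, parameter names, and default values are assumptions made for illustration; they are not taken from Michotte's procedures, and the sketch makes no claim about which values produce the launching impression.

```python
def launching_display(n_frames=60, contact_frame=30, gap=0.0,
                      delay_frames=0, v1=1.0, velocity_ratio=1.0):
    """Return a list of (x_a, x_b) positions, one pair per frame.

    Object A approaches stationary object B along one dimension and
    stops at `contact_frame`, ending `gap` units short of B. Object B
    starts moving `delay_frames` later at speed v1 * velocity_ratio.
    All parameter names are hypothetical.
    """
    x_b_start = 0.0
    # Place A so that it arrives `gap` units short of B at contact_frame.
    x_a_start = x_b_start - gap - v1 * contact_frame
    frames = []
    for t in range(n_frames):
        # A moves at constant speed until contact_frame, then stops.
        x_a = x_a_start + v1 * min(t, contact_frame)
        # Number of frames for which B has been moving so far.
        moving = max(0, t - contact_frame - delay_frames)
        x_b = x_b_start + v1 * velocity_ratio * moving
        frames.append((x_a, x_b))
    return frames


# Example: direct contact, no delay, equal speeds.
frames = launching_display()
# Example: the second object starts moving 5 frames after contact.
delayed = launching_display(delay_frames=5)
```

Varying gap, delay_frames, and velocity_ratio one at a time gives a family of displays of the kind an experimenter might show to observers when probing the conditions described above.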

Researchers in the Gibsonian tradition have characterized events that are defined by changes in the layout of environmental surfaces; these include translations and rotations, collisions, surface deformations (as when a body changes pose), and surface disruptions (as when a tear or hole is created) (Gibson, 1979). The unfolding of such a change has its own invariant structure and makes up a spatiotemporal form (Bingham & Wickelgren, 2008). Researchers in this tradition have studied both inanimate events such as billiard balls colliding and rocks splashing into water, and simple animate actions such as reaching and jumping.

More recently, researchers studying the kinds of events described by Michotte and Gibson have begun to explore intersensory interactions and their effects on event perception. The perception of both inanimate and animate actions has been shown to depend on interactions between, for example, visual and auditory cues, which often resolve an event that is perceptually ambiguous within one sense alone. For example, Sekuler, Sekuler, and Lau (1997) showed that a visually ambiguous interaction between two moving objects was resolved as a collision when paired with an auditory cue consistent with a collision, and as a pass when paired with an auditory cue consistent with a near miss.

Amongst animate actions, human actions are a particularly important class of perceptual events. Researchers have studied simple actions such as reaching, stepping, and pointing, each of which may constitute an atomic unit for action planning and perception. Researchers also have studied more complex sequential actions, particularly everyday household activities such as washing dishes, cleaning, and preparing meals. The understanding of sequential structure in action appears to be particularly tied to observers’ understanding of actors’ goals (Miller & Johnson-Laird, 1976, Ch. 2) and to semantic knowledge about how events typically unfold in time. One’s semantic knowledge about an everyday activity is known as a script (Abelson, 1981). There is also increasing evidence that knowing how to perform an action oneself affects one’s perception of that action in others (see, for example, Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005).

Like the perception of inanimate events, the perception of animate events has been shown to be affected by intersensory interactions. Examples include the ventriloquist illusion (Witkin, Wapner, & Leventhal, 1952), in which the perceived spatial source of a sound is affected by coincident visual changes, and the McGurk effect (McGurk & MacDonald, 1976), in which the perception of phonemes is affected by concurrent visual information about the speaker’s facial movements.

Many human actions involve multiple participants interacting; that is, they are social. Thus, many scripts include representations of actors’ roles and relations. Social psychologists have studied social events extensively (e.g., Ebbesen, 1980; Wyer & Radvansky, 1999). Prior knowledge in the form of scripts (for events) and stereotypes (for people) has a major influence on observers’ perception of ongoing events. Other important variables are observers’ current goals, the events immediately preceding a current event, and observers’ future plans.

Event Segmentation

Event segmentation is a form of categorical perception in which intervals of time are picked out as units and distinguished from other intervals. As such, it is a mechanism of Gestalt grouping: The ongoing stream of activity is parsed into meaningful wholes (see Gestalt principles). However, most research on Gestalt grouping has focused on the grouping of visual objects. Event segmentation has been studied using explicit behavioral measures and noninvasive neuroimaging measures (for overviews, see Kurby & Zacks, 2008; Zacks & Swallow, 2007). To measure event segmentation behaviorally, experimenters have asked people to watch movies and press a button whenever one event ends and another begins (Newtson, 1973). The instructions may be amended to ask participants, for example, to identify fine-grained or coarse-grained units of activity. The following animation shows an excerpt of a movie that was segmented by 16 observers at coarse and fine temporal grains, with the points identified as event boundaries marked with green lines.


(copyright 2008 Jeffrey M. Zacks.)

This task has been used with movies of everyday household and office activities, with professionally-produced cinema, and with simple animations. It also has been used with spoken narratives, written narratives, and music. For all these stimuli, observers show good agreement about the locations of event boundaries. Fine-grained units tend to cluster hierarchically into coarse-grained units. Segmentation is related to later memory: Moments that are identified as event boundaries tend to be better recognized later, and participants who segment better perform better on subsequent memory tests.
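As a rough illustration of how button-press data from such a task might be summarized, the following Python sketch bins each observer's presses into fixed time windows and computes the proportion of observers who marked a boundary in each window. The function name, the 1-second bin width, and the toy data are assumptions made for illustration; they are not taken from the studies cited here.

```python
import math


def boundary_agreement(presses_by_observer, duration_s, bin_s=1.0):
    """Return, for each time bin, the proportion of observers who pressed in it.

    presses_by_observer: one list of press times (in seconds) per observer.
    duration_s: length of the movie in seconds.
    bin_s: bin width in seconds (an arbitrary choice in this sketch).
    """
    n_bins = math.ceil(duration_s / bin_s)
    counts = [0] * n_bins
    for presses in presses_by_observer:
        # Count each observer at most once per bin, ignoring out-of-range presses.
        marked = {
            min(int(t // bin_s), n_bins - 1)
            for t in presses
            if 0 <= t <= duration_s
        }
        for b in marked:
            counts[b] += 1
    n_obs = len(presses_by_observer)
    return [c / n_obs for c in counts]


# Hypothetical press times (seconds) for four observers watching a 20 s clip.
observers = [
    [4.2, 9.8],
    [4.6, 10.1],
    [4.9, 15.3],
    [9.5],
]
agreement = boundary_agreement(observers, duration_s=20)
# Bins with high values are candidate consensus event boundaries.
```

Running the same summary separately on fine-grained and coarse-grained responses would give two agreement profiles whose peaks could then be compared for alignment.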

Noninvasive neuroimaging, particularly functional magnetic resonance imaging, has been used to study event segmentation as a part of ongoing cognitive activity. Brain activity in posterior and frontal cortex has been found to transiently increase at points in time corresponding to behaviorally identified event boundaries (Zacks et al., 2001). Brain activity at event boundaries tracks perceptual and conceptual features that are related to event segmentation, including motion, changes in spatial location, changes in characters and objects, and transitions between movements in musical works. Such results indicate that neural processes correlated with event segmentation are part and parcel of ongoing perception, and as such do not depend on observers performing an intentional segmentation task.
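The idea of relating a measured signal to behaviorally identified boundaries can be sketched as simple time-locked averaging. The Python example below cuts a fixed window around each boundary sample and averages the windows. The function name, window sizes, and toy signal are assumptions made for illustration; actual neuroimaging analyses are considerably more involved, and this is not a description of the cited studies' methods.

```python
def boundary_locked_average(signal, boundary_idx, pre=2, post=4):
    """Average a 1-D signal in a window around each boundary sample.

    signal: list of samples (for example, one value per scan).
    boundary_idx: sample indices of event boundaries.
    pre, post: number of samples kept before and after each boundary
               (arbitrary values chosen for this sketch).
    """
    windows = []
    for b in boundary_idx:
        start, stop = b - pre, b + post + 1
        if start < 0 or stop > len(signal):
            continue  # skip boundaries too close to the edges of the recording
        windows.append(signal[start:stop])
    if not windows:
        return []
    n = len(windows)
    # Average across windows, position by position.
    return [sum(w[i] for w in windows) / n for i in range(pre + post + 1)]


# Toy signal: flat, with a brief increase one sample after each boundary.
signal = [0.0] * 30
signal[11] = 1.0
signal[21] = 1.0
average = boundary_locked_average(signal, boundary_idx=[10, 20])
```

In this toy case the averaged window is flat except for a peak one sample after the boundary, which is the kind of transient, boundary-locked change the paragraph above describes.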

Event Recognition

In addition to picking out individual events from the behavior stream, observers also recognize events as belonging to classes. Event recognition is closely tied to event segmentation; this parallels the cases for speech recognition and object recognition (see also visual object recognition). Individuating events helps to identify them, and identifying events helps to individuate them. For events defined by human actions, event recognition may answer a more specific set of questions than simply “What is happening?” These include:

  • What action is being performed?
  • Who is the actor performing an action?
  • What is the goal of this action?

Research from the Michotte and Gibson traditions has shown that many categories of events can be identified based on motion patterns. This holds for events defined by more complex human actions as well (for a review, see Blake & Shiffrar, 2007). One method for studying the role of motion analysis in action identification is to present viewers with animations of points on a person’s body while that person undertakes an action. The following animation gives an example; can you identify the action?


(copyright 2008 Thomas F. Shipley; reproduced with permission.)

These point-light displays have been used to show that viewers can easily identify many actions based on biological motion, under conditions in which the static configuration of the points provides little information about the action (Johansson, 1973). Viewers not only can use motion cues to identify what the action is, but also to recognize actors’ genders, to recognize friends, and to recognize mood states (Pollick & Patterson, 2008; Troje, 2008). Action recognition from biological motion is disrupted by inversion (Shipley, 2003), and is associated with selective activation of regions in the lateral temporal cortex (Grossman et al., 2000; Beauchamp, Lee, Haxby, & Martin, 2003). As with other types of events, biological motion perceptions also are affected by intersensory interactions (Brooks, Petreska, Billard, Spierer, Clark, Blanke, & van der Zwan, 2007). These findings suggest that biological motion analysis depends on specialized computational mechanisms, and that the perception of biological events is mediated by information from across the senses.

Humans are not just observers of action; they are also actors. The common coding view of event perception holds that common representations underlie the perception of others’ actions and the planning of one’s own actions (Hommel, Muesseler, Aschersleben, & Prinz, 2001; Rizzolatti, Fogassi, & Gallese, 2001; Prinz, 1997). This view receives support from ideomotor compatibility effects, in which performing an action is facilitated by observing a congruent event and perceiving an event is facilitated by performing a congruent action. Conversely, performing an action while observing a different action reduces performance on the action. More recently, this view has also received support from the observation in the monkey of cells that are highly selective for particular actions, and show similar selectivity during action and perception (see mirror neurons).

Not all events correspond to the intentional actions of actors. People perform actions unintentionally, such as dropping a plate or slipping and falling. Perceptual events also encompass inanimate phenomena such as landslides and rainshowers. An important question is the degree to which the understanding of such inanimate events utilizes the same processing routines as are used to understand intentional activity. For example, when are specialized mechanisms such as the biological motion system or the mirror system applied to other events that share some characteristics with animate actions?

Perceptual Events in Relation to Other Sorts of Events

How does event perception as discussed here relate to the understanding of other sorts of events? The everyday-language term “event” encompasses a wider range than the events discussed here, including things that happen on very short or very long timescales, such as interactions between subatomic particles or the orbit of Saturn around the sun. Clearly, the perceptual mechanisms described here do not apply to events that happen too quickly or take too long; however, people may understand such phenomena by analogy to perceptual events, performing mental simulations in which time is slowed down or sped up. People also may perform such simulations when thinking about general classes of events or situations, as when making actuarial predictions or logical inferences (Johnson-Laird, 1989). Thus, the mechanisms by which people process perceptual events may form the core cognitive structures by which they think about a broader range of events and situations.


References

Abelson, R. P. (1981). Psychological status of the script concept. American Psychologist, 36, 715-729.

Barker, R. G. (1963). The stream of behavior as an empirical problem. In R. G. Barker (Ed.), The stream of behavior. (pp. 1-22). New York: Appleton-Century-Crofts.

Beauchamp, M. S., Lee, K. E., Haxby, J. V., & Martin, A. (2003). fMRI responses to video and point-light displays of moving humans and manipulable objects. Journal of Cognitive Neuroscience, 15, 991-1001.

Bingham, G. P. & Wickelgren, E. A. (2008). Events and actions as dynamically molded spatiotemporal objects: A critique of the motor theory of biological motion perception. In T. F. Shipley & J. M. Zacks (Eds.), Understanding events: From perception to action. (pp. 255-286). New York: Oxford University Press.

Blake, R. & Shiffrar, M. (2007). Perception of human motion. Annual Review of Psychology, 58, 47-73.

Brooks, A., Petreska, B., Billard, A., Spierer, L., Clark, S., Blanke, O., & van der Zwan, R. (2007). Ears, eyes and bodies: audiovisual processing of biological motion cues. Neuropsychologia, 45, 523-530.

Brooks, A., Schouten, B., Troje, N.F., Verfaillie, K., Blanke, O., & van der Zwan, R. (2008). Correlated changes in perceptions of the gender and the orientation of ambiguous biological motion figures. Current Biology, 18, R728-R729.

Calvo-Merino, B., Glaser, D.E., Grezes, J., Passingham, R.E., & Haggard, P. (2005). Action observation and acquired motor skills: an fMRI study with expert dancers. Cerebral Cortex, 15, 1243-1249.

Ebbesen, E. B. (1980). Cognitive processes in understanding ongoing behavior. In R. Hastie (Ed.), Person memory: the cognitive basis of social perception. (pp. 179-225). Hillsdale, NJ: Lawrence Erlbaum Associates.

Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.

Grossman, E. D., Donnelly, M., Price, R., Pickens, D., Morgan, V., Neighbor, G. et al. (2000). Brain areas involved in perception of biological motion. Journal of Cognitive Neuroscience, 12(5), 711-720.

Hommel, B., Muesseler, J., Aschersleben, G., & Prinz, W. (2001). The Theory of Event Coding (TEC): A framework for perception and action planning. Behavioral & Brain Sciences, 24(5), 849-937.

Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception and Psychophysics, 14, 201-211.

Johnson-Laird, P. N. (1989). Mental Models. In M. I. Posner (Ed.), Foundations of cognitive science. (pp. 469-500). Cambridge, MA: MIT Press.

Kurby, C. A. & Zacks, J. M. (2008). Segmentation in the perception and memory of events. Trends in Cognitive Sciences, 12(2), 72-79.

McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 263, 747-748.

Michotte, A. E. (1946). The perception of causality (T. R. Miles & E. Miles, Trans.). New York: Basic Books.

Miller, G. A. & Johnson-Laird, P. N. (1976). Language and perception. Cambridge, MA: Harvard University Press.

Newtson, D. (1973). Attribution and the unit of perception of ongoing behavior. Journal of Personality and Social Psychology, 28(1), 28-38.

Pollick, F. E. & Patterson, H. (2008). Movement style, movement features, and the recognition of affect from human movement. In T. F. Shipley & J. M. Zacks (Eds.), Understanding events: From perception to action. (pp. 286-307). New York: Oxford University Press.

Prinz, W. (1997). Perception and action planning. European Journal of Cognitive Psychology, 9(2), 129-154.

Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience, 2, 661-670.

Sekuler, R., Sekuler, A.B., & Lau, R. (1997). Sound alters visual motion perception. Nature, 385, 308.

Shipley, T. F. (2003). The effect of object and event orientation on perception of biological motion. Psychological Science, 14(4), 377-380.

Troje, N. (2008). Retrieving information from human movement patterns. In T. F. Shipley & J. M. Zacks (Eds.), Understanding events: From perception to action. (pp. 308-334). New York: Oxford University Press.

Witkin, H.A., Wapner, S., & Leventhal, T. (1952). Sound localisation with conflicting visual and auditory cues. Journal of Experimental Psychology, 43, 58-67.

Wyer, R. S., Jr. & Radvansky, G. A. (1999). The comprehension and validation of social information. Psychological Review, 106(1), 89-118.

Zacks, J. M., Braver, T. S., Sheridan, M. A., Donaldson, D. I., Snyder, A. Z., Ollinger, J. M. et al. (2001). Human brain activity time-locked to perceptual event boundaries. Nature Neuroscience, 4(6), 651-655.

Zacks, J. M. & Swallow, K. M. (2007). Event segmentation. Current Directions in Psychological Science, 16, 80-84.

Zacks, J. M. & Tversky, B. (2001). Event structure in perception and conception. Psychological Bulletin, 127(1), 3-21.

Internal references

  • Giacomo Rizzolatti and Maddalena Fabbri Destro (2008) Mirror neurons. Scholarpedia, 3(1):2055.

Recommended Reading

Gibson, J.J. (1979). The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.

Michotte, A.E. (1946) The perception of causality. Basic Books, New York.

Miller, G.A. & Johnson-Laird, P.N. (1976) Language and perception. Harvard University Press, Cambridge, MA.

Shipley, T.F. & Zacks, J.M. (Eds.) (2008) Understanding events: From perception to action. Oxford University Press, New York.

External Links

Author’s web site

Point-light archive (maintained by Thomas F. Shipley)

Point-light demonstrations (maintained by Nikolaus F. Troje)
