Gaze-informed multimodal interaction

Abstract

Observe a person pointing out and describing something. Where is that person looking? Chances are good that the person is also looking at what she is talking about and pointing at. Gaze is naturally coordinated with our speech and hand movements. By utilizing this tendency, we can create natural interaction with computing devices and environments. In this chapter, we first briefly discuss some basic properties of the gaze signal we can get from eye trackers, followed by a review of multimodal systems utilizing the gaze signal as one input modality. In Multimodal Gaze Interaction, data from eye trackers is used as an active input mode where, for instance, gaze is used as an alternative, or complementary, pointing modality along with other input modalities. Using gaze as an active or explicit input method is challenging for several reasons. One is that the eyes are primarily used for perceiving our environment, so it is hard to know when a person is selecting an item with gaze and when she is just looking around. Researchers have tried to solve this by combining gaze with various input methods, such as manual pointing, speech, and touch. However, gaze information can also be used in interactive systems for purposes other than explicit pointing, since a user’s gaze is a good indication of the user’s attention. In passive gaze interaction, gaze is not used as the primary input method but as a supporting one. In these kinds of systems, gaze is mainly used for inferring and reasoning about the user’s cognitive state or activities in a way that can support the interaction. These kinds of multimodal systems often combine gaze with a multitude of input modalities.

In this chapter we focus on interactive systems, exploring the design space for gaze-informed multimodal interaction along two dimensions: from gaze as an active input mode to gaze as a passive one, and from stationary (e.g., at a desk) to mobile usage scenarios. A number of studies aim to describe, detect, or model specific behaviors or cognitive states. We will touch on some of these works since they can guide us in how to build gaze-informed multimodal interaction.